#mongodbdays




       Schema Design
       Emily Stolfo
       Ruby Engineer/Evangelist, 10gen
       @EmStolfo




Tuesday, January 29, 13
Agenda
       • Working with documents
       • Common patterns
       • Evolving a Schema
       • Queries and Indexes




Terminology

       RDBMS              MongoDB
       Database           ➜ Database
       Table              ➜ Collection
       Row                ➜ Document
       Index              ➜ Index
       Join               ➜ Embedded Document
       Foreign Key        ➜ Reference

Working with Documents




Documents
       Provide flexibility and
       performance



Example Schema (MongoDB)

       [Diagram not preserved: the example schema, annotated to mark Embedding and Linking]

Relational Schema Design
       Focuses on data storage




Document Schema Design
       Focuses on data use




Schema Design Considerations
    • What is a priority?
           – High consistency
           – High read performance
           – High write performance

    • How does the application access and manipulate data?
           – Read/Write Ratio
           – Types of Queries / Updates
           – Data life-cycle and growth
           – Analytics (Map Reduce, Aggregation)




Tools for Data Access
       • Flexible Schemas
       • Embedded data structures
       • Secondary Indexes
       • Multi-Key Indexes
       • Aggregation Framework
             – Pipeline operators: $project, $match, $limit,
               $skip, $sort, $group, $unwind

       • No Joins



Data Manipulation
       • Conditional Query Operators
             – Scalar: $ne, $mod, $exists, $type, $lt, $lte, $gt, $gte
             – Vector: $in, $nin, $all, $size

       • Atomic Update Operators
             – Scalar: $inc, $set, $unset
             – Vector: $push, $pop, $pull, $pushAll, $pullAll, $addToSet
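
The difference between some of the update operators listed above is easiest to see on a concrete document. The following is a minimal in-memory sketch in plain JavaScript, not the MongoDB driver or shell API: the helper functions `inc`, `push`, and `addToSet` and the fields `visits` and `tags` are hypothetical stand-ins that imitate what `$inc`, `$push`, and `$addToSet` do to a single document.

```javascript
// In-memory stand-ins for three update operators (hypothetical helpers, not a driver API).

// Like $inc: add `amount` to a numeric field, creating the field if it is absent.
function inc(doc, field, amount) {
  doc[field] = (doc[field] || 0) + amount;
}

// Like $push: append unconditionally, so duplicates are possible.
function push(doc, field, value) {
  if (doc[field] === undefined) doc[field] = [];
  doc[field].push(value);
}

// Like $addToSet: append only if the value is not already in the array.
// This sketch compares primitive values only.
function addToSet(doc, field, value) {
  if (doc[field] === undefined) doc[field] = [];
  if (!doc[field].includes(value)) doc[field].push(value);
}

// Hypothetical patron document with illustrative fields.
const patron = { _id: "joe", visits: 1, tags: ["student"] };

inc(patron, "visits", 1);            // visits becomes 2
push(patron, "tags", "student");     // duplicate is appended
addToSet(patron, "tags", "student"); // already present, no change
addToSet(patron, "tags", "staff");   // new value is appended
```

On the server each of these operators is applied atomically to one document, which is what makes them useful building blocks for schema design.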




Schema Design Example




Library Management Application
       • Patrons
       • Books
       • Authors
       • Publishers




One to One Relations
       example




Modeling Patrons

       // Option 1: patron and address as separate, linked documents
       patron = {
         _id: "joe",
         name: "Joe Bookreader"
       }

       address = {
         patron_id: "joe",
         street: "123 Fake St.",
         city: "Faketon",
         state: "MA",
         zip: 12345
       }

       // Option 2: address embedded in the patron document
       patron = {
         _id: "joe",
         name: "Joe Bookreader",
         address: {
           street: "123 Fake St.",
           city: "Faketon",
           state: "MA",
           zip: 12345
         }
       }




One to One Relations
    • “Contains” relationships are often embedded.
    • Document provides a holistic representation of
        objects with embedded entities.
    • Optimized read performance.
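
The read-performance point can be illustrated by counting lookups. This is a minimal in-memory sketch in plain JavaScript, with arrays standing in for collections; the function names `cityLinked` and `cityEmbedded` are hypothetical and nothing here is a MongoDB API.

```javascript
// Arrays stand in for collections (illustrative data taken from the patron example).
const patronsLinked = [{ _id: "joe", name: "Joe Bookreader" }];
const addresses = [
  { patron_id: "joe", street: "123 Fake St.", city: "Faketon", state: "MA", zip: 12345 },
];
const patronsEmbedded = [
  {
    _id: "joe",
    name: "Joe Bookreader",
    address: { street: "123 Fake St.", city: "Faketon", state: "MA", zip: 12345 },
  },
];

// Linked layout: two lookups, one per collection.
function cityLinked(patronId) {
  const patron = patronsLinked.find((p) => p._id === patronId);
  const address = addresses.find((a) => a.patron_id === patron._id);
  return address.city;
}

// Embedded layout: a single lookup returns the patron and the address together.
function cityEmbedded(patronId) {
  return patronsEmbedded.find((p) => p._id === patronId).address.city;
}
```

Both functions return the same city; the embedded version simply gets there with one read instead of two.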




One To Many Relations
       examples




Patrons with many addresses

       patron = {
         _id: "joe",
         name: "Joe Bookreader",
         join_date: ISODate("2011-10-15"),
         addresses: [
           {street: "1 Vernon St.", city: "Newton", state: "MA", …},
           {street: "52 Main St.", city: "Boston", state: "MA", …},
         ]
       }




One to Many Relations
       example 2
       Publishers and Books



Publishers and Books relation
       • Publishers put out many books
       • Books have one publisher




Book Data


       MongoDB: The Definitive Guide,
       By Kristina Chodorow and Mike Dirolf
       Published: 9/24/2010
       Pages: 216
       Language: English

       Publisher: O’Reilly Media, CA




Book Model with Embedded Publisher

       book = {
         title: "MongoDB: The Definitive Guide",
         authors: [ "Kristina Chodorow", "Mike Dirolf" ],
         published_date: ISODate("2010-09-24"),
         pages: 216,
         language: "English",
         publisher: {
             name: "O’Reilly Media",
             founded: "1980",
             location: "CA"
         }
       }




Book Model with Normalized Publisher
       publisher = {
         name: "O’Reilly Media",
         founded: "1980",
         location: "CA"
       }

       book = {
         title: "MongoDB: The Definitive Guide",
         authors: [ "Kristina Chodorow", "Mike Dirolf" ],
         published_date: ISODate("2010-09-24"),
         pages: 216,
         language: "English"
       }




Link with Publisher _id as a Reference
       publisher = {
         _id: "oreilly",
         name: "O’Reilly Media",
         founded: "1980",
         location: "CA"
       }

       book = {
         title: "MongoDB: The Definitive Guide",
         authors: [ "Kristina Chodorow", "Mike Dirolf" ],
         published_date: ISODate("2010-09-24"),
         pages: 216,
         language: "English",
         publisher_id: "oreilly"
       }




Link with Book _ids as a Reference
       publisher = {
         name: "O’Reilly Media",
         founded: "1980",
         location: "CA",
         books: [ "123456789", ... ]
       }

       book = {
         _id: "123456789",
         title: "MongoDB: The Definitive Guide",
         authors: [ "Kristina Chodorow", "Mike Dirolf" ],
         published_date: ISODate("2010-09-24"),
         pages: 216,
         language: "English"
       }




Where do you put the reference?

       • Reference to single publisher on books
             – Use when items have unbounded growth (unlimited # of
                 books)



       • Array of books in publisher document
             – Optimal when many means a handful of items
             – Use when there is a bound on potential growth
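
The two placements above can be compared side by side. This is a minimal in-memory sketch in plain JavaScript, with arrays standing in for collections; the function names are hypothetical, and the assumption that both sample books share one publisher is for illustration only.

```javascript
// Illustrative data: each book carries a reference to its publisher,
// and the publisher also carries an array of book ids.
const books = [
  { _id: "123456789", title: "MongoDB: The Definitive Guide", publisher_id: "oreilly" },
  { _id: "987654321", title: "MongoDB: The Scaling Adventure", publisher_id: "oreilly" },
];
const publisher = {
  _id: "oreilly",
  name: "O'Reilly Media",
  books: ["123456789", "987654321"],
};

// Reference on the book: the publisher document stays the same size
// no matter how many books are added.
function titlesByBookReference(publisherId) {
  return books.filter((b) => b.publisher_id === publisherId).map((b) => b.title);
}

// Array on the publisher: the publisher document grows with every book,
// which is only comfortable when the list is bounded.
function titlesByPublisherArray(pub) {
  return pub.books.map((id) => books.find((b) => b._id === id).title);
}
```

Both functions return the same titles. The choice is about which document absorbs the growth and which query the application runs most often.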




One to Many Relations
       example 3
       Books and Patrons


Books and Patrons
       • Book can be checked out by one Patron at a time
       • Patrons can check out many books (but not 1000s)




Modeling Checkouts
       patron = {
         _id: "joe",
         name: "Joe Bookreader",
         join_date: ISODate("2011-10-15"),
         address: { ... }
       }

       book = {
         _id: "123456789",
         title: "MongoDB: The Definitive Guide",
         authors: [ "Kristina Chodorow", "Mike Dirolf" ],
         ...
       }




Modeling Checkouts
         patron = {
           _id: "joe",
           name: "Joe Bookreader",
           join_date: ISODate("2011-10-15"),
           address: { ... },
           checked_out: [
               { _id: "123456789", checked_out: "2012-10-15" },
               { _id: "987654321", checked_out: "2012-09-12" },
               ...
           ]
         }




De-normalization
       Provides data locality




Modeling Checkouts - de-normalized
       patron = {
         _id: "joe",
         name: "Joe Bookreader",
         join_date: ISODate("2011-10-15"),
         address: { ... },
         checked_out: [
             { _id: "123456789",
                title: "MongoDB: The Definitive Guide",
                authors: [ "Kristina Chodorow", "Mike Dirolf" ],
                checked_out: ISODate("2012-10-15")
             },
             { _id: "987654321",
                title: "MongoDB: The Scaling Adventure", ...
             }, ...
         ]
       }
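
The trade-off in the de-normalized layout can be sketched in a few lines. This is a minimal in-memory illustration in plain JavaScript; the helper names `checkedOutTitles` and `renameBook` are hypothetical, and the second checkout date is carried over from the earlier checkout example.

```javascript
// A patron with book details copied into each checkout entry.
const patron = {
  _id: "joe",
  name: "Joe Bookreader",
  checked_out: [
    { _id: "123456789", title: "MongoDB: The Definitive Guide", checked_out: new Date("2012-10-15") },
    { _id: "987654321", title: "MongoDB: The Scaling Adventure", checked_out: new Date("2012-09-12") },
  ],
};

// Benefit: listing titles needs only the patron document, no second lookup.
function checkedOutTitles(p) {
  return p.checked_out.map((entry) => entry.title);
}

// Cost: a change to a book must be applied to every copy.
// Returns the number of embedded copies that were updated.
function renameBook(patrons, bookId, newTitle) {
  let touched = 0;
  for (const p of patrons) {
    for (const entry of p.checked_out) {
      if (entry._id === bookId) {
        entry.title = newTitle;
        touched += 1;
      }
    }
  }
  return touched;
}
```

Reads become local to one document, while updates to the copied fields fan out to every document that holds a copy.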




Referencing vs. Embedding
      • Embedding is a bit like pre-joining data
      • Document level operations are easy for the server
         to handle
      • Embed when the “many” objects always appear
         with (viewed in the context of) their parents.
      • Reference when you need more flexibility


      How does your application access and manipulate data?


Many to Many Relations
       example




Books and Authors
       book = {
         title: "MongoDB: The Definitive Guide",
         published_date: ISODate("2010-09-24"),
         pages: 216,
         language: "English"
       }

       author = {
         _id: "kchodorow",
         name: "Kristina Chodorow",
         hometown: "New York"
       }

       author = {
         _id: "mdirolf",
         name: "Mike Dirolf",
         hometown: "Albany"
       }




Relation stored in Book document
       book = {
         title: "MongoDB: The Definitive Guide",
         authors : [
             { _id: "kchodorow", name: "Kristina Chodorow" },
             { _id: "mdirolf", name: "Mike Dirolf" }
         ],
         published_date: ISODate("2010-09-24"),
         pages: 216,
         language: "English"
       }

       author = {
         _id: "kchodorow",
         name: "Kristina Chodorow",
         hometown: "New York"
       }

       author = {
         _id: "mdirolf",
         name: "Mike Dirolf",
         hometown: "Albany"
       }



Relation stored in Author document

       book = {
         _id: 123456789,
         title: "MongoDB: The Definitive Guide",
         published_date: ISODate("2010-09-24"),
         pages: 216,
         language: "English"
       }

       author = {
         _id: "kchodorow",
         name: "Kristina Chodorow",
         hometown: "Cincinnati",
         books: [ { book_id: 123456789, title: "MongoDB: The Definitive Guide" } ]
       }




Relation stored in both documents

       book = {
         _id: 123456789,
         title: "MongoDB: The Definitive Guide",
         authors: [ "kchodorow", "mdirolf" ],
         published_date: ISODate("2010-09-24"),
         pages: 216,
         language: "English"
       }

       author = {
         _id: "kchodorow",
         name: "Kristina Chodorow",
         hometown: "New York",
         books: [ 123456789, ... ]
       }

       author = {
         _id: "mdirolf",
         name: "Mike Dirolf",
         hometown: "Albany",
         books: [ 123456789, ... ]
       }


Where do you put the reference?
     Think about common queries
       book = {
         title: "MongoDB: The Definitive Guide",
         authors : [
             { _id: "kchodorow", name: "Kristina Chodorow" },
             { _id: "mdirolf", name: "Mike Dirolf" }
         ],
         published_date: ISODate("2010-09-24"),
         pages: 216,
         language: "English"
       }

       author = {
         _id: "kchodorow",
         name: "Kristina Chodorow",
         hometown: "New York"
       }

       db.books.find( { "authors.name": "Kristina Chodorow" } )
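
A query on `authors.name` matches a book when any element of its `authors` array has that name. The following is a minimal in-memory sketch of that matching rule in plain JavaScript; `findByAuthorName` is a hypothetical helper, not part of MongoDB, and the second book is invented for the example.

```javascript
// Array standing in for the books collection.
const books = [
  {
    title: "MongoDB: The Definitive Guide",
    authors: [
      { _id: "kchodorow", name: "Kristina Chodorow" },
      { _id: "mdirolf", name: "Mike Dirolf" },
    ],
  },
  {
    // Hypothetical second book, present only so the filter has something to exclude.
    title: "Some Other Book",
    authors: [{ _id: "someone", name: "Someone Else" }],
  },
];

// A book matches if at least one embedded author has the given name.
function findByAuthorName(collection, name) {
  return collection.filter((book) => book.authors.some((a) => a.name === name));
}

const matches = findByAuthorName(books, "Kristina Chodorow");
```

Because the author names live inside the book document, this common query needs no second collection.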




Where do you put the reference?
     Think about indexes
       book = {
         title: "MongoDB: The Definitive Guide",
         authors : [
             { _id: "kchodorow", name: "Kristina Chodorow" },
             { _id: "mdirolf", name: "Mike Dirolf" }
         ],
         published_date: ISODate("2010-09-24"),
         pages: 216,
         language: "English"
       }

       author = {
         _id: "kchodorow",
         name: "Kristina Chodorow",
         hometown: "New York"
       }

       db.books.createIndex( { "authors.name": 1 } )




Trees
       example




Parent References

       book = {
         title: "MongoDB: The Definitive Guide",
         authors: [ "Kristina Chodorow", "Mike Dirolf" ],
         published_date: ISODate("2010-09-24"),
         pages: 216,
         language: "English",
         category: "MongoDB"
       }

       category = { _id: "MongoDB", parent: "Databases" }
       category = { _id: "Databases", parent: "Programming" }




Child References

       book = {
         _id: 123456789,
         title: "MongoDB: The Definitive Guide",
         authors: [ "Kristina Chodorow", "Mike Dirolf" ],
         published_date: ISODate("2010-09-24"),
         pages: 216,
         language: "English"
       }

       category = { _id: "MongoDB", children: [ 123456789, … ] }
       category = { _id: "Databases", children: [ "MongoDB", "Postgres" ] }
       category = { _id: "Programming", children: [ "Databases", "Languages" ] }




Modeling Trees
       • Parent References
            - Each node is stored as a document
            - Contains the id of the parent
       • Child References
            - Each node contains ids of its children
            - Can support graphs (multiple parents / child)
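
With parent references, finding all ancestors of a node means following one reference at a time. This is a minimal in-memory sketch in plain JavaScript; `ancestorsOf` is a hypothetical helper, the root document with `parent: null` is an assumption (the slides do not show how the root is stored), and the tree is assumed to contain no cycles.

```javascript
// Array standing in for the categories collection.
const categories = [
  { _id: "MongoDB", parent: "Databases" },
  { _id: "Databases", parent: "Programming" },
  { _id: "Programming", parent: null }, // assumed representation of the root
];

// Walk up the tree one parent at a time, returning ancestors from the root down.
function ancestorsOf(id) {
  const byId = new Map(categories.map((c) => [c._id, c]));
  const path = [];
  let node = byId.get(id);
  while (node && node.parent !== null) {
    path.unshift(node.parent);
    node = byId.get(node.parent);
  }
  return path;
}

const path = ancestorsOf("MongoDB"); // ["Programming", "Databases"]
```

Each step of the loop corresponds to one lookup, so deep trees cost one read per level with this layout.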




Array of Ancestors
        book = {
          title: "MongoDB: The Definitive Guide",
          authors: [ "Kristina Chodorow", "Mike Dirolf" ],
          published_date: ISODate("2010-09-24"),
          pages: 216,
          language: "English",
          categories: [ "Programming", "Databases", "MongoDB" ]
        }

        book = {
          title: "MySQL: The Definitive Guide",
          authors: [ "Michael Kofler" ],
          published_date: ISODate("2010-09-24"),
          pages: 216,
          language: "English",
          parent: "MySQL",
          ancestors: [ "Programming", "Databases", "MySQL" ]
        }
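
Storing the full ancestor path on each book turns "everything under a category" into a single membership test. This is a minimal in-memory sketch in plain JavaScript; `booksUnder` is a hypothetical helper, and for simplicity both books here use a field named `ancestors`.

```javascript
// Array standing in for the books collection.
const books = [
  {
    title: "MongoDB: The Definitive Guide",
    ancestors: ["Programming", "Databases", "MongoDB"],
  },
  {
    title: "MySQL: The Definitive Guide",
    ancestors: ["Programming", "Databases", "MySQL"],
  },
];

// A book is under a category if the category appears anywhere in its ancestor path.
function booksUnder(category) {
  return books.filter((b) => b.ancestors.includes(category)).map((b) => b.title);
}
```

The cost moves to writes: if a category is moved in the tree, the stored path on every affected book has to be rewritten.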




Single Table Inheritance
       example




Single Table Inheritance

            book = {
              title: "MongoDB: The Definitive Guide",
              authors: [ "Kristina Chodorow", "Mike Dirolf" ],
              published_date: ISODate("2010-09-24"),
              kind: "loanable",
              locations: [ ... ],
              pages: 216,
              language: "English",
              publisher: {
                  name: "O’Reilly Media",
                  founded: "1980",
                  location: "CA"
              }
            }




Queues
       example




Update highest priority request

       db.loans.insert({
            _id: 123456789,
            book_id: 987654321,
            pending: false,
            approved: false,
            priority: 3
       })



       //Find the highest priority request and mark as pending approval

       request = db.loans.findAndModify({
           query: { pending: false, book_id: 987654321 },
           sort: { priority: -1},
           update: { $set: { pending : true, started: new ISODate() } }
       })
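
The steps that `findAndModify` performs here (filter, sort, update one document, return it) can be sketched in memory. This is a minimal illustration in plain JavaScript, not the server's implementation: `claimHighestPriority` is a hypothetical helper, and the loan documents and their priorities are invented for the example. It returns the document as it looked before the update, which is the shell's default behaviour for `findAndModify`.

```javascript
// Array standing in for the loans collection (illustrative data).
const loans = [
  { _id: 1, book_id: 987654321, pending: false, approved: false, priority: 3 },
  { _id: 2, book_id: 987654321, pending: false, approved: false, priority: 5 },
  { _id: 3, book_id: 987654321, pending: true, approved: false, priority: 9 },
];

// Claim the highest-priority request for a book that is not yet pending.
function claimHighestPriority(bookId, now) {
  // query: { pending: false, book_id: bookId }, sort: { priority: -1 }
  const candidates = loans
    .filter((l) => !l.pending && l.book_id === bookId)
    .sort((a, b) => b.priority - a.priority);
  if (candidates.length === 0) return null;

  const chosen = candidates[0];
  const before = { ...chosen }; // snapshot of the pre-update document

  // update: { $set: { pending: true, started: now } }
  chosen.pending = true;
  chosen.started = now;
  return before;
}

const first = claimHighestPriority(987654321, new Date());
const second = claimHighestPriority(987654321, new Date());
const third = claimHighestPriority(987654321, new Date());
```

In the real command the find and the update happen as one atomic operation on the server, so two workers cannot claim the same request. The single-threaded sketch gets that property for free.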



Summary
       • Schema design is different in MongoDB
       • Basic data design principles apply
       • Focus on how application accesses and
          manipulates data
       • Evolve schema to meet changing requirements


       • Application-level logic is important!




#mongodbdays




       Thank You
       Emily Stolfo
       Ruby Engineer/Evangelist, 10gen
       @EmStolfo





