SlideShare a Scribd company logo
Data Modelling for MongoDB
Norberto Leite
MongoDB
May 14th, 2019
Tel Aviv, Israel
Norberto Leite
Lead Engineer - Curriculum @ MongoDB
norberto@mongodb.com
New York
@nleite
https://university.mongodb.com
Goals of the Presentation
Recognize the
differences when
modelling for a
Document Database
versus a Relational
Database
Summarize the steps
of a methodology
when modelling for
MongoDB
Recognize the need
and when to apply
Schema Design
Patterns
Goals of the Presentation
Recognize the
differences when
modelling for a
Document Database
versus a Relational
Database
Summarize the steps
of a methodology
when modelling for
MongoDB
Recognize the need
and when to apply
Schema Design
Patterns
Goals of the Presentation
Recognize the
differences when
modelling for a
Document Database
versus a Relational
Database
Summarize the steps
of a methodology
when modelling for
MongoDB
Recognize the need
and when to apply
Schema Design
Patterns
Differences when Modelling for
a Document Database versus a
Relational Database
Data Modeling for MongoDB
Thinking in Documents
1. Polymorphism
• different documents may contain
different fields
2. Array
• represent a "one-to-many" relation
• index is on all entries
3. Sub Document
• grouping some fields together
4. JSON/BSON
• documents are often shown as JSON
• BSON is the physical format
Example: modelling a blog
… 5 tabes become 1 or 2 collections
Example: Modelling a Social Network
Tabular MongoDB
Steps to create the model 1 – define schema
2 – develop app and queries
1 – identifying the queries
2 – define schema
Initial schema 3rd normal form
One solution
many solutions possible
Final schema likely denormalized few changes
Schema evolution difficult and not optimal
Likely downtime
easy and no downtime
Performance mediocre optimized
Differences: Relational/Tabular vs Document
Other Considerations for the Model
1. one-to-many relationships where "many" is a humongous number
2. Embed or Reference
• Joins via $lookup
• Transactions for multi document writes
3. Transactions available for Replica set, and soon for Sharded Clusters
4. Sharding Key
5. Indexes
6. Simple queries, or more complex ones with the Aggregation Framework
Flexible Modelling Methodology for
MongoDB
Data Modeling for MongoDB
Methodology
Methodology
1. Describe the
Workload
Methodology
1. Describe the
Workload
2. Identify and Model
the Relationships
Data Modeling for MongoDB
Data Modeling for MongoDB
Data Modeling for MongoDB
Methodology
1. Describe the
Workload
2. Identify and Model
the Relationships
3. Apply Patterns
Flexible Methodology
Case Study: ‫א‬‫ס‬‫פ‬‫ר‬‫ס‬‫ו‬‫א‬‫ר‬‫ו‬‫מ‬‫ט‬‫י‬
A. Business: coffee shop franchises
B. Name: Cuppa Coffee
also considered: Coffee Mate, Crocodile Coffee
C. Objective:
• 10 000 stores in Israel, Kazakhstan, Romania, Ukraine ...
• … then we invade America
D. Keys to success:
• Best coffee in the world
• Technology
Make the Best Coffee in the World
23g of ground coffee in, 20g of extracted
coffee out, in approximately 20 seconds
1. Fill a small or regular cup with 80% hot
water (not boiling but pretty hot). Your
cup should be 150ml to 200ml in total
volume, 80% of which will be hot water.
2. Grind 23g of coffee into your portafilter
using the double basket. We use a scale
that you can get here.
3. Draw 20g of coffee over the hot water by
placing your cup on a scale, press tare
and extract your shot.
Technology
1. Measure inventory in real time
• Shelves with scales
2. Big Data collection on cups of coffee
• weighings, temperature, time to produce, …
3. Data Analysis
• Coffee perfection
• Rush hours -> staffing needs
4. MongoDB
Methodology
1. Describe the
Workload
2. Identify and Model
the Relationships
3. Apply Patterns
1 – Workload: List Queries
Query Operation Description
1. Coffee weight on the shelves write A shelf send information when coffee bags are added or
removed
2. Coffee to deliver to stores read How much coffee do we have to ship to the store in the
next days
3. Anomalies in the inventory read Analytics
4. Making a cup of coffee write A coffee machine reporting on the production of a coffee
cup
5. Analysis of cups of coffee read Analytics
6. Technical Support read Helping our franchisees
Query Quantification Qualification
1. Coffee weight on the shelves 10/day*shelf*store
=> 1/sec
<1s
critical write
2. Coffee to deliver to stores 1/day*store
=> 0.1/sec
<60s
3. Anomalies in the inventory 24 reads/day <5mins
"collection scan"
4. Making a cup of coffee 10 000 000 writes/day
115 writes/sec
<100ms
non-critical write
… cups of coffee at rush hour 3 000 000 writes/hr
833 writes/sec
<100ms
non-critical write
5. Analysis of cups of coffee 24 reads/day stale data is fine
"collection scan"
6. Technical Support 1000 reads/day <1s
1 – Workload: quantify/qualify
Query Quantification Qualification
1. Coffee weight on the shelves 10/day*shelf*store
=> 1/sec
<1s
critical write
2. Coffee to deliver to stores 1/day*store
=> 0.1/sec
<60s
3. Anomalies in the inventory 24 reads/day <5mins
"collection scan"
4. Making a cup of coffee 10 000 000 writes/day
115 writes/sec
<100ms
non-critical write
… cups of coffee at rush hour 3 000 000 writes/hr
833 writes/sec
<100ms
non-critical write
5. Analysis of cups of coffee 24 reads/day stale data is fine
"collection scan"
6. Technical Support 1000 reads/day <1s
1 – Workload: quantify/qualify
Disk Space
Cups of coffee (one year of data)
• 10000 x 1000/day x 365
• 3.7 billions/year
• 370 GB (100 bytes/cup of coffee)
Weighings
• 10000 x 10/day x 365
• 365 billions/year
• 3.7 GB (100 bytes/weighings)
Methodology
1. Describe the
Workload
2. Identify and Model
the Relationships
3. Apply Patterns
2 - Relations are still important
Type of Relation -> one-to-one/1-1 one-to-many/1-N many-to-many/N-N
Document embedded in
the parent document
• one read
• no joins
• one read
• no joins
• one read
• no joins
• duplication of
information
Document referenced in
the parent document
• smaller reads
• many reads
• smaller reads
• many reads
• smaller reads
• many reads
2 - Entities for ‫א‬‫ס‬‫פ‬‫ר‬‫ס‬‫ו‬‫א‬‫ר‬‫ו‬‫מ‬‫ט‬‫י‬
- Coffee cups
- Stores
- Coffee
machines
- Shelves
- Weighings
- Coffee bags
Methodology
1. Describe the
Workload
2. Identify and Model
the Relationships
3. Apply Patterns
Schema Design Patterns
Schema Design Patterns
Resources
A. Advanced Schema Design
Patterns
• MongoDB World 2017
• Webinar
B. MongoDB University
• university.mongodb.com
• M320 – Data Modeling (2019)
C. Blogs on Schema Design
Patterns
https://www.mongodb.com/blog/post/building-with-patterns-a-summary
Schema Versioning
Computed Pattern
Subset Pattern
Subset Pattern
Bucket Pattern
Bucket Pattern
{
"device_id": 000123456,
"type": "2A",
"date": ISODate("2018-03-02"),
"temp": [ [ 20.0, 20.1, 20.2, ... ],
[ 22.1, 22.1, 22.0, ... ],
...
]
}
{
"device_id": 000123456,
"type": "2A",
"date": ISODate("2018-03-03"),
"temp": [ [ 20.1, 20.2, 20.3, ... ],
[ 22.4, 22.4, 22.3, ... ],
...
]
}
{
"device_id": 000123456,
"type": "2A",
"date": ISODate("2018-03-02T13"),
"temp": { 1: 20.0, 2: 20.1, 3: 20.2, ... }
}
{
"device_id": 000123456,
"type": "2A",
"date": ISODate("2018-03-02T14"),
"temp": { 1: 22.1, 2: 22.1, 3: 22.0, ... }
}
Bucket per
Day
Bucket per
Hour
External Reference Pattern
Cuppa Coffee - Solution with
Patterns
• Schema Versioning
• Subset
• Computed
• Bucket
• External Reference
Conclusion
Takeaways from the Presentation
Recognize the
differences when
modelling for a
Document Database
versus a Relational
Database
Takeaways from the Presentation
Recognize the
differences when
modelling for a
Document Database
versus a Relational
Database
Summarize the steps
of a methodology
when modelling for
MongoDB
• Workload
• Relationships
• Patterns
Takeaways from the Presentation
Recognize the
differences when
modelling for a
Document Database
versus a Relational
Database
Summarize the steps
of a methodology
when modelling for
MongoDB
• Workload
• Relationships
• Patterns
Recognize the need
and when to apply
Schema Design
Patterns
Coming Soon …
• "Data Modelling" course at:
university.mongodb.com
Norberto Leite
Lead Engineer
norberto@mongodb.com
@nleite
Data Modeling for MongoDB

More Related Content

What's hot (20)

PPTX
MongoDB presentation
Hyphen Call
 
PPTX
Introduction to MongoDB
MongoDB
 
PPTX
Introduction to MongoDB
NodeXperts
 
PDF
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)
MongoDB
 
PPT
Introduction to mongodb
neela madheswari
 
ODP
Product catalog using MongoDB
Vishwas Bhagath
 
PDF
MongoDB World 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB
 
PDF
An introduction to MongoDB
César Trigo
 
PPT
Introduction to MongoDB
Ravi Teja
 
PDF
MongoDB Sharding Fundamentals
Antonios Giannopoulos
 
PPTX
An Introduction To NoSQL & MongoDB
Lee Theobald
 
PPTX
Introduction to NoSQL
PolarSeven Pty Ltd
 
PDF
MongoDB Fundamentals
MongoDB
 
PPTX
Mongo db intro.pptx
JWORKS powered by Ordina
 
PPTX
HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...
Simplilearn
 
ODP
Introduction to MongoDB
Dineesha Suraweera
 
PPTX
Basics of MongoDB
HabileLabs
 
PPTX
Data Science With Python | Python For Data Science | Python Data Science Cour...
Simplilearn
 
MongoDB presentation
Hyphen Call
 
Introduction to MongoDB
MongoDB
 
Introduction to MongoDB
NodeXperts
 
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)
MongoDB
 
Introduction to mongodb
neela madheswari
 
Product catalog using MongoDB
Vishwas Bhagath
 
MongoDB World 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB
 
An introduction to MongoDB
César Trigo
 
Introduction to MongoDB
Ravi Teja
 
MongoDB Sharding Fundamentals
Antonios Giannopoulos
 
An Introduction To NoSQL & MongoDB
Lee Theobald
 
Introduction to NoSQL
PolarSeven Pty Ltd
 
MongoDB Fundamentals
MongoDB
 
Mongo db intro.pptx
JWORKS powered by Ordina
 
HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...
Simplilearn
 
Introduction to MongoDB
Dineesha Suraweera
 
Basics of MongoDB
HabileLabs
 
Data Science With Python | Python For Data Science | Python Data Science Cour...
Simplilearn
 

Similar to Data Modeling for MongoDB (20)

PDF
Data Modelling for MongoDB - MongoDB.local Tel Aviv
Norberto Leite
 
PPTX
MongoDB.local Sydney 2019: Data Modeling for MongoDB
MongoDB
 
PDF
MongoDB .local Bengaluru 2019: A Complete Methodology to Data Modeling for Mo...
MongoDB
 
PDF
MongoDB World 2019 - A Complete Methodology to Data Modeling for MongoDB
Daniel Coupal
 
PDF
MongoDB .local London 2019: A Complete Methodology to Data Modeling for MongoDB
Lisa Roth, PMP
 
PDF
MongoDB .local London 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB
 
PDF
MongoDB .local Toronto 2019: A Complete Methodology of Data Modeling for MongoDB
MongoDB
 
PDF
MongoDB .local Chicago 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB
 
PDF
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB
 
PDF
Data_Modeling_MongoDB.pdf
jill734733
 
PPT
No SQL and MongoDB - Hyderabad Scalability Meetup
Hyderabad Scalability Meetup
 
PDF
Mongo db data-models guide
Deysi Gmarra
 
PDF
Mongo db data-models-guide
Dan Llimpe
 
PDF
Semi Formal Model for Document Oriented Databases
Daniel Coupal
 
DOCX
What are the major components of MongoDB and the major tools used in it.docx
Technogeeks
 
PPTX
Super spike
Michael Falanga
 
PDF
MongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB
 
PPTX
An Enterprise Architect's View of MongoDB
MongoDB
 
PPTX
Webinar: Best Practices for Getting Started with MongoDB
MongoDB
 
PPTX
MongoDB Best Practices
Lewis Lin 🦊
 
Data Modelling for MongoDB - MongoDB.local Tel Aviv
Norberto Leite
 
MongoDB.local Sydney 2019: Data Modeling for MongoDB
MongoDB
 
MongoDB .local Bengaluru 2019: A Complete Methodology to Data Modeling for Mo...
MongoDB
 
MongoDB World 2019 - A Complete Methodology to Data Modeling for MongoDB
Daniel Coupal
 
MongoDB .local London 2019: A Complete Methodology to Data Modeling for MongoDB
Lisa Roth, PMP
 
MongoDB .local London 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB
 
MongoDB .local Toronto 2019: A Complete Methodology of Data Modeling for MongoDB
MongoDB
 
MongoDB .local Chicago 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB
 
Data_Modeling_MongoDB.pdf
jill734733
 
No SQL and MongoDB - Hyderabad Scalability Meetup
Hyderabad Scalability Meetup
 
Mongo db data-models guide
Deysi Gmarra
 
Mongo db data-models-guide
Dan Llimpe
 
Semi Formal Model for Document Oriented Databases
Daniel Coupal
 
What are the major components of MongoDB and the major tools used in it.docx
Technogeeks
 
Super spike
Michael Falanga
 
MongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB
 
An Enterprise Architect's View of MongoDB
MongoDB
 
Webinar: Best Practices for Getting Started with MongoDB
MongoDB
 
MongoDB Best Practices
Lewis Lin 🦊
 
Ad

More from MongoDB (20)

PDF
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB
 
PDF
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
PDF
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB
 
PDF
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB
 
PDF
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB
 
PDF
MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
PDF
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB
 
PDF
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB
 
PDF
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB
 
PDF
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB
 
PDF
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB
 
PDF
MongoDB .local Paris 2020: Les bonnes pratiques pour sécuriser MongoDB
MongoDB
 
PDF
MongoDB .local Paris 2020: Tout savoir sur le moteur de recherche Full Text S...
MongoDB
 
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB
 
MongoDB .local Paris 2020: Les bonnes pratiques pour sécuriser MongoDB
MongoDB
 
MongoDB .local Paris 2020: Tout savoir sur le moteur de recherche Full Text S...
MongoDB
 
Ad

Recently uploaded (20)

PDF
How do you fast track Agentic automation use cases discovery?
DianaGray10
 
PDF
CIFDAQ Market Wrap for the week of 4th July 2025
CIFDAQ
 
PDF
Future-Proof or Fall Behind? 10 Tech Trends You Can’t Afford to Ignore in 2025
DIGITALCONFEX
 
PDF
Newgen 2022-Forrester Newgen TEI_13 05 2022-The-Total-Economic-Impact-Newgen-...
darshakparmar
 
PDF
Automating Feature Enrichment and Station Creation in Natural Gas Utility Net...
Safe Software
 
PDF
AI Agents in the Cloud: The Rise of Agentic Cloud Architecture
Lilly Gracia
 
PPTX
MuleSoft MCP Support (Model Context Protocol) and Use Case Demo
shyamraj55
 
PDF
POV_ Why Enterprises Need to Find Value in ZERO.pdf
darshakparmar
 
PPTX
Designing_the_Future_AI_Driven_Product_Experiences_Across_Devices.pptx
presentifyai
 
PDF
The 2025 InfraRed Report - Redpoint Ventures
Razin Mustafiz
 
PDF
Peak of Data & AI Encore AI-Enhanced Workflows for the Real World
Safe Software
 
PDF
Newgen Beyond Frankenstein_Build vs Buy_Digital_version.pdf
darshakparmar
 
PDF
“Voice Interfaces on a Budget: Building Real-time Speech Recognition on Low-c...
Edge AI and Vision Alliance
 
PDF
[Newgen] NewgenONE Marvin Brochure 1.pdf
darshakparmar
 
PDF
Agentic AI lifecycle for Enterprise Hyper-Automation
Debmalya Biswas
 
PDF
NLJUG Speaker academy 2025 - first session
Bert Jan Schrijver
 
PDF
Transforming Utility Networks: Large-scale Data Migrations with FME
Safe Software
 
PDF
“NPU IP Hardware Shaped Through Software and Use-case Analysis,” a Presentati...
Edge AI and Vision Alliance
 
PPTX
COMPARISON OF RASTER ANALYSIS TOOLS OF QGIS AND ARCGIS
Sharanya Sarkar
 
PDF
SIZING YOUR AIR CONDITIONER---A PRACTICAL GUIDE.pdf
Muhammad Rizwan Akram
 
How do you fast track Agentic automation use cases discovery?
DianaGray10
 
CIFDAQ Market Wrap for the week of 4th July 2025
CIFDAQ
 
Future-Proof or Fall Behind? 10 Tech Trends You Can’t Afford to Ignore in 2025
DIGITALCONFEX
 
Newgen 2022-Forrester Newgen TEI_13 05 2022-The-Total-Economic-Impact-Newgen-...
darshakparmar
 
Automating Feature Enrichment and Station Creation in Natural Gas Utility Net...
Safe Software
 
AI Agents in the Cloud: The Rise of Agentic Cloud Architecture
Lilly Gracia
 
MuleSoft MCP Support (Model Context Protocol) and Use Case Demo
shyamraj55
 
POV_ Why Enterprises Need to Find Value in ZERO.pdf
darshakparmar
 
Designing_the_Future_AI_Driven_Product_Experiences_Across_Devices.pptx
presentifyai
 
The 2025 InfraRed Report - Redpoint Ventures
Razin Mustafiz
 
Peak of Data & AI Encore AI-Enhanced Workflows for the Real World
Safe Software
 
Newgen Beyond Frankenstein_Build vs Buy_Digital_version.pdf
darshakparmar
 
“Voice Interfaces on a Budget: Building Real-time Speech Recognition on Low-c...
Edge AI and Vision Alliance
 
[Newgen] NewgenONE Marvin Brochure 1.pdf
darshakparmar
 
Agentic AI lifecycle for Enterprise Hyper-Automation
Debmalya Biswas
 
NLJUG Speaker academy 2025 - first session
Bert Jan Schrijver
 
Transforming Utility Networks: Large-scale Data Migrations with FME
Safe Software
 
“NPU IP Hardware Shaped Through Software and Use-case Analysis,” a Presentati...
Edge AI and Vision Alliance
 
COMPARISON OF RASTER ANALYSIS TOOLS OF QGIS AND ARCGIS
Sharanya Sarkar
 
SIZING YOUR AIR CONDITIONER---A PRACTICAL GUIDE.pdf
Muhammad Rizwan Akram
 

Data Modeling for MongoDB

  • 1. Data Modelling for MongoDB Norberto Leite MongoDB May 14th, 2019 Tel Aviv, Israel
  • 2. Norberto Leite Lead Engineer - Curriculum @ MongoDB [email protected] New York @nleite
  • 4. Goals of the Presentation Recognize the differences when modelling for a Document Database versus a Relational Database Summarize the steps of a methodology when modelling for MongoDB Recognize the need and when to apply Schema Design Patterns
  • 5. Goals of the Presentation Recognize the differences when modelling for a Document Database versus a Relational Database Summarize the steps of a methodology when modelling for MongoDB Recognize the need and when to apply Schema Design Patterns
  • 6. Goals of the Presentation Recognize the differences when modelling for a Document Database versus a Relational Database Summarize the steps of a methodology when modelling for MongoDB Recognize the need and when to apply Schema Design Patterns
  • 7. Differences when Modelling for a Document Database versus a Relational Database
  • 9. Thinking in Documents 1. Polymorphism • different documents may contain different fields 2. Array • represent a "one-to-many" relation • index is on all entries 3. Sub Document • grouping some fields together 4. JSON/BSON • documents are often shown as JSON • BSON is the physical format
  • 11. … 5 tabes become 1 or 2 collections
  • 12. Example: Modelling a Social Network
  • 13. Tabular MongoDB Steps to create the model 1 – define schema 2 – develop app and queries 1 – identifying the queries 2 – define schema Initial schema 3rd normal form One solution many solutions possible Final schema likely denormalized few changes Schema evolution difficult and not optimal Likely downtime easy and no downtime Performance mediocre optimized Differences: Relational/Tabular vs Document
  • 14. Other Considerations for the Model 1. one-to-many relationships where "many" is a humongous number 2. Embed or Reference • Joins via $lookup • Transactions for multi document writes 3. Transactions available for Replica set, and soon for Sharded Clusters 4. Sharding Key 5. Indexes 6. Simple queries, or more complex ones with the Aggregation Framework
  • 19. Methodology 1. Describe the Workload 2. Identify and Model the Relationships
  • 23. Methodology 1. Describe the Workload 2. Identify and Model the Relationships 3. Apply Patterns
  • 25. Case Study: ‫א‬‫ס‬‫פ‬‫ר‬‫ס‬‫ו‬‫א‬‫ר‬‫ו‬‫מ‬‫ט‬‫י‬ A. Business: coffee shop franchises B. Name: Cuppa Coffee also considered: Coffee Mate, Crocodile Coffee C. Objective: • 10 000 stores in Israel, Kazakhstan, Romania, Ukraine ... • … then we invade America D. Keys to success: • Best coffee in the world • Technology
  • 26. Make the Best Coffee in the World 23g of ground coffee in, 20g of extracted coffee out, in approximately 20 seconds 1. Fill a small or regular cup with 80% hot water (not boiling but pretty hot). Your cup should be 150ml to 200ml in total volume, 80% of which will be hot water. 2. Grind 23g of coffee into your portafilter using the double basket. We use a scale that you can get here. 3. Draw 20g of coffee over the hot water by placing your cup on a scale, press tare and extract your shot.
  • 27. Technology 1. Measure inventory in real time • Shelves with scales 2. Big Data collection on cups of coffee • weighings, temperature, time to produce, … 3. Data Analysis • Coffee perfection • Rush hours -> staffing needs 4. MongoDB
  • 28. Methodology 1. Describe the Workload 2. Identify and Model the Relationships 3. Apply Patterns
  • 29. 1 – Workload: List Queries Query Operation Description 1. Coffee weight on the shelves write A shelf send information when coffee bags are added or removed 2. Coffee to deliver to stores read How much coffee do we have to ship to the store in the next days 3. Anomalies in the inventory read Analytics 4. Making a cup of coffee write A coffee machine reporting on the production of a coffee cup 5. Analysis of cups of coffee read Analytics 6. Technical Support read Helping our franchisees
  • 30. Query Quantification Qualification 1. Coffee weight on the shelves 10/day*shelf*store => 1/sec <1s critical write 2. Coffee to deliver to stores 1/day*store => 0.1/sec <60s 3. Anomalies in the inventory 24 reads/day <5mins "collection scan" 4. Making a cup of coffee 10 000 000 writes/day 115 writes/sec <100ms non-critical write … cups of coffee at rush hour 3 000 000 writes/hr 833 writes/sec <100ms non-critical write 5. Analysis of cups of coffee 24 reads/day stale data is fine "collection scan" 6. Technical Support 1000 reads/day <1s 1 – Workload: quantify/qualify
  • 31. Query Quantification Qualification 1. Coffee weight on the shelves 10/day*shelf*store => 1/sec <1s critical write 2. Coffee to deliver to stores 1/day*store => 0.1/sec <60s 3. Anomalies in the inventory 24 reads/day <5mins "collection scan" 4. Making a cup of coffee 10 000 000 writes/day 115 writes/sec <100ms non-critical write … cups of coffee at rush hour 3 000 000 writes/hr 833 writes/sec <100ms non-critical write 5. Analysis of cups of coffee 24 reads/day stale data is fine "collection scan" 6. Technical Support 1000 reads/day <1s 1 – Workload: quantify/qualify
  • 32. Disk Space Cups of coffee (one year of data) • 10000 x 1000/day x 365 • 3.7 billions/year • 370 GB (100 bytes/cup of coffee) Weighings • 10000 x 10/day x 365 • 365 billions/year • 3.7 GB (100 bytes/weighings)
  • 33. Methodology 1. Describe the Workload 2. Identify and Model the Relationships 3. Apply Patterns
  • 34. 2 - Relations are still important Type of Relation -> one-to-one/1-1 one-to-many/1-N many-to-many/N-N Document embedded in the parent document • one read • no joins • one read • no joins • one read • no joins • duplication of information Document referenced in the parent document • smaller reads • many reads • smaller reads • many reads • smaller reads • many reads
  • 35. 2 - Entities for ‫א‬‫ס‬‫פ‬‫ר‬‫ס‬‫ו‬‫א‬‫ר‬‫ו‬‫מ‬‫ט‬‫י‬ - Coffee cups - Stores - Coffee machines - Shelves - Weighings - Coffee bags
  • 36. Methodology 1. Describe the Workload 2. Identify and Model the Relationships 3. Apply Patterns
  • 38. Schema Design Patterns Resources A. Advanced Schema Design Patterns • MongoDB World 2017 • Webinar B. MongoDB University • university.mongodb.com • M320 – Data Modeling (2019) C. Blogs on Schema Design Patterns https://www.mongodb.com/blog/post/building-with-patterns-a-summary
  • 44. Bucket Pattern { "device_id": 000123456, "type": "2A", "date": ISODate("2018-03-02"), "temp": [ [ 20.0, 20.1, 20.2, ... ], [ 22.1, 22.1, 22.0, ... ], ... ] } { "device_id": 000123456, "type": "2A", "date": ISODate("2018-03-03"), "temp": [ [ 20.1, 20.2, 20.3, ... ], [ 22.4, 22.4, 22.3, ... ], ... ] } { "device_id": 000123456, "type": "2A", "date": ISODate("2018-03-02T13"), "temp": { 1: 20.0, 2: 20.1, 3: 20.2, ... } } { "device_id": 000123456, "type": "2A", "date": ISODate("2018-03-02T14"), "temp": { 1: 22.1, 2: 22.1, 3: 22.0, ... } } Bucket per Day Bucket per Hour
  • 46. Cuppa Coffee - Solution with Patterns • Schema Versioning • Subset • Computed • Bucket • External Reference
  • 48. Takeaways from the Presentation Recognize the differences when modelling for a Document Database versus a Relational Database
  • 49. Takeaways from the Presentation Recognize the differences when modelling for a Document Database versus a Relational Database Summarize the steps of a methodology when modelling for MongoDB • Workload • Relationships • Patterns
  • 50. Takeaways from the Presentation Recognize the differences when modelling for a Document Database versus a Relational Database Summarize the steps of a methodology when modelling for MongoDB • Workload • Relationships • Patterns Recognize the need and when to apply Schema Design Patterns
  • 51. Coming Soon … • "Data Modelling" course at: university.mongodb.com