Ger Hartnett
Director of Technical Services (EMEA), MongoDB @ghartnett #MongoDB
Tales from the Field
Part two: Fixing Sub-optimal Performance in
a Retail Application
Or:
●Cautionary Tales
●Don’t solve the wrong problems
●Bad schemas & shard keys hurt ops
too
●The main talk should take 30-35 minutes
●You can submit questions via the chat box
●We’ll answer as many as possible at the end
●We will send the slides and recording
tomorrow via email
●The final webinar in the series will take place
on Thursday 21st April – 14:00 BST | 15:00
CEST
Before we start
●You work in operations
●You work in development
●You have a MongoDB system in production
●You have contacted MongoDB Technical
Services (support)
●You attended the last webinar (part 1)
A quick poll - add a word to the
chat to let me know your
perspective
●We collect - observations about common
mistakes - to share the experience of many
●Names have been changed to protect the
(mostly) innocent
●No animals were harmed during the making
of this presentation (but maybe some DBAs
and engineers had light emotional scarring)
●While you might be new to MongoDB we
have deep experience that you can leverage
Stories
1. Discovering a DR flaw during a data
centre outage
2. Complex documents, memory and
an upgrade “surprise”
3. Wild success “uncovers” the wrong
shard key
The Stories (part two today)
Story #1: Quick Review
Story #1: Recovering from a
disaster
●Prospect in the process of signing up for a
subscription
●Called us late on Friday: data centre power
outage and 30+ servers (11 shards) down
●When they started bringing up the first
shard, the nodes crashed with data
corruption
●17TB of data, very little free disk space,
JOURNALLING DISABLED!
Recovering each shard
1.Start secondary
read only
2.Mount NFS
storage for repair
3.Repair former
primary node
4.Iterative rsync to
seed a secondary
Secondary
Primary
Secondary
Key takeaways for you
●If you are departing significantly from
standard config, check with us (e.g. if you
think journalling is a bad idea)
●Two DCs in different buildings on different
flood plains, not in the path of the same
storm (e.g. secondaries in AWS)
●DR/backups are useless if you haven’t
tested them
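A minimal sketch of how a proposed member layout could be sanity-checked against the loss of one data centre. It relies on the fact that a replica set can only elect a primary while a strict majority of its voting members is reachable. The hostnames, the data-centre labels and the helper name `survivesLossOf` are hypothetical, and the sketch assumes every member votes.

```javascript
// For each data centre, report whether the replica set still has a
// strict majority of members if that whole data centre goes down.
// Assumption: every member listed is a voting member.
function survivesLossOf(members) {
  const total = members.length;
  const majority = Math.floor(total / 2) + 1;
  const result = {};
  for (const dc of new Set(members.map((m) => m.dc))) {
    const remaining = members.filter((m) => m.dc !== dc).length;
    result[dc] = remaining >= majority;
  }
  return result;
}

// Hypothetical layout: hostnames and data-centre labels are made up.
const members = [
  { host: "db1.dc-a.example.net", dc: "dc-a" },
  { host: "db2.dc-a.example.net", dc: "dc-a" },
  { host: "db3.dc-b.example.net", dc: "dc-b" },
];

console.log(survivesLossOf(members));
```

Running a check like this against the real layout is one cheap way to test a DR plan on paper before testing it for real.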
Story #2: Complex documents,
memory and an upgrade
“surprise”
●Well established ecommerce site selling
diverse goods in 20+ countries
●After switching to WiredTiger in production,
performance dropped, the opposite of
what they were expecting
{
_id: 375,
en_US : { name : ..., description : ..., <etc...> },
en_GB : { name : ..., description : ..., <etc...> },
fr_FR : { name : ..., description : ..., <etc...> },
de_DE : ...,
de_CH : ...,
<... and so on for other locales... >,
inventory: 423
}
Product Catalog: Original
Schema
What’s good about this schema?
● Each document contains all the data about a given
product, across all languages/locales
● Very efficient way to retrieve the English, French,
German, etc. translations of a single product’s
information in one query
However……
That is not how the product data is
actually used
(except perhaps by translation staff)
db
db.catalog.update( { _id : 375 }, { $inc: { inventory: -1 } } )
db.catalog.find( { _id : 375 } , { en_US : true } );
db.catalog.find( { _id : 375 } , { fr_FR : true } );
db.catalog.find( { _id : 375 } , { de_DE : true } );
... and so forth for other locales ...
Dominant Query Patterns
Which means……
The Product Catalog’s data model
did not fit the way the data was
accessed.
Consequences
●WiredTiger reads/rewrites the whole document
●Each document contained ~20x more data than
any common use case needed
●MongoDB lets you request just a subset of a
document’s contents (using a projection), but…
o Typically the whole document is loaded into RAM
●There are other overheads (like readahead)
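The "~20x" figure can be made concrete with some back-of-the-envelope arithmetic. This is a rough sketch, not a measurement: the byte sizes are hypothetical, and it assumes every locale sub-document is about the same size and ignores readahead and storage-engine overhead.

```javascript
// Fraction of a document's bytes that a single-locale read actually uses.
// Assumption: all locale sub-documents are roughly the same size.
function usefulFraction(localeCount, bytesPerLocale, sharedBytes) {
  const total = localeCount * bytesPerLocale + sharedBytes;
  const used = bytesPerLocale + sharedBytes;
  return used / total;
}

// Hypothetical sizes: 20 locales of 2 KB each, plus 100 bytes of shared fields.
const fraction = usefulFraction(20, 2048, 100);
console.log((fraction * 100).toFixed(1) + "% of the bytes loaded are used");
```

With these made-up numbers only about one twentieth of each document is wanted by the query, which is the waste the bullets above describe.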
{ _id: 42,
en_US : { name : ..., description : ..., <etc...> },
en_GB : { name : ..., description : ..., <etc...> },
fr_FR : { name : ..., description : ..., <etc...> },
de_DE : ...,
de_CH : ...,
<... and so on for other locales... > }
<READAHEAD OVERHEAD>
{ _id: 709,
en_US : { name : ..., description : ..., <etc...> },
en_GB : { name : ..., description : ..., <etc...> },
fr_FR : { name : ..., description : ..., <etc...> },
de_DE : ...,
de_CH : ...,
<... and so on for other locales... > }
<READAHEAD OVERHEAD>
{ _id: 3600,
en_US : { name : ..., description : ..., <etc...> },
en_GB : { name : ..., description : ..., <etc...> },
fr_FR : { name : ..., description : ..., <etc...> },
de_DE : ...,
de_CH : ...,
<... and so on for other locales... > }
Visualising the read problem
- Data in RED are loaded into RAM
and used.
- Data in BLUE take up memory but
are not required.
- Readahead padding in GREEN
makes things even more inefficient
More RAM? It’s not that simple
What did we recommend?
● Design for your use case, your most common query
pattern
o In this case: 99.99% of queries want the product
data for exactly one locale at a time
o Move the frequently changing fields to a new
collection
● Eliminate inefficiencies on the system
o Make reading from disk less wasteful, maximise I/O
capabilities by reducing readahead
{ _id: "375-en_US",
name : ..., description : ..., <etc...> }
{ _id: "375-en_GB",
name : ..., description : ..., <etc...> }
{ _id: "375-fr_FR",
name : ..., description : ..., <etc...> }
... and so on for other locales ...
db.inventory
{ _id: "375", count : NumberLong(1234), <etc...> }
Product Catalog: Eventual
Schema
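As an illustration of the reshaping, here is a small sketch that splits one original-style document into per-locale documents plus a separate inventory document. The function name `splitCatalogDocument` and the sample product text are hypothetical. It assumes every field other than `_id` and `inventory` is a locale sub-document, and it uses a plain number for `count` where the slide shows `NumberLong`.

```javascript
// Split one original-style catalog document into per-locale documents
// plus a separate inventory document. Pure function, no database needed.
// Assumption: every field except _id and inventory is a locale sub-document.
function splitCatalogDocument(doc) {
  const localeDocs = [];
  for (const [key, value] of Object.entries(doc)) {
    if (key === "_id" || key === "inventory") continue; // shared fields, not locales
    localeDocs.push({ _id: `${doc._id}-${key}`, ...value });
  }
  const inventoryDoc = { _id: String(doc._id), count: doc.inventory };
  return { localeDocs, inventoryDoc };
}

// Hypothetical sample data in the shape of the original schema.
const original = {
  _id: 375,
  en_US: { name: "Kettle", description: "Electric kettle" },
  fr_FR: { name: "Bouilloire", description: "Bouilloire électrique" },
  inventory: 423,
};

const { localeDocs, inventoryDoc } = splitCatalogDocument(original);
console.log(localeDocs);
console.log(inventoryDoc);
```

With documents shaped like this, a single-locale read can look up `_id: "375-en_US"` directly, and the stock decrement touches only the small document in `db.inventory`.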
Aftermath & lessons learned
●Faster updates
●Queries induced minimal overhead
●More than 20x as many distinct products fit in
memory at once
●Disk I/O utilization reduced
●UI latency decreased
Key Takeaways
●When doing a major version/storage-engine
upgrade, test in staging with some
proportion of production data/workload
●Sometimes putting everything into one
document is counterproductive
Story #3: Quick Preview
2 More Shards….
Story #3: Wild success uncovers
the wrong shard key
●Started out as error “[Balancer] caught
exception … tag ranges not valid for: db.coll”
●11 shards; they had added 2 new shards to
keep up with traffic - 400+ databases
●Lots of code changes ahead of the
Superbowl
●Spotted slow queries (300+ seconds), decided to build
some indexes without telling us
●Production went down
Further Reading
Production notes
docs.mongodb.org/manual/administration/production-notes
Mtools
github.com/rueckstiess/mtools
Ger Hartnett
Director of Technical Services (EMEA), MongoDB
@ghartnett #MongoDB
Questions?
●You can submit questions via the chat box
●We will send the slides and recording
tomorrow via email
●Part 3: the next webinar will take place on
Thursday 21st April – 14:00 BST | 15:00
CEST
www.mongodb.com/webinars
Questions
Code GerHartnett gets 25% discount

Webinar: Avoiding Sub-optimal Performance in your Retail Application


Editor's Notes

  • #2: Field/Trenches
  • #4: What do the rest of you do?
  • #5: What do the rest of you do?
  • #6: Some borrowed, some merged into a single narrative. Some of the people who inspired them may well be here in this room today
  • #10: Bill's Bulk Updates randomly affected an ever larger data set. In order to cope with the database size, Bill added more shards. The cluster scaled linearly, as intended.
  • #20: Well, it might fix things, but it’s expensive and the real problem is the efficiency
  • #25: Bill's Bulk Updates randomly affected an ever larger data set. In order to cope with the database size, Bill added more shards. The cluster scaled linearly, as intended.
  • #28: Field/Trenches
  • #29: What do the rest of you do?