MongoDB: An Introduction
Chris Westin
Software Engineer, 10gen
© Copyright 2010 10gen Inc.
Outline
  • The Whys of Non-Relational Databases
  • Vocabulary of the Non-Relational World
  • MongoDB
The Whys of Non-Relational Databases
  • Why did non-relational databases arise?
  • Problems with relational databases in the web world
Problem - Schema Evolution
  • Applications are evolving all the time
  • Applications need new fields
  • Applications need new indexes
  • Data is growing – sometimes very fast
  • Users need to be able to alter their schemas without making their data unavailable
  • The web world expects 24x7 service
  • RDBMSs can have a hard time doing this
Problem – Write Rates
  • Replication is a solution for high read loads
  • Sooner or later, writing becomes a bottleneck
  • Sharding – partitioning a logical database across multiple database instances
  • Joins and aggregation become a problem
  • Distributed transactions are too slow for the web
  • Manual management of shards
    • Choosing shard partitions
    • Rebalancing shards
Vocabulary of the Non-Relational World
  • An introduction to terminology you’re going to be seeing a lot
Data Models
  • A non-relational database’s data model determines the kinds of items it can contain and how they can be retrieved
  • What can the system store, and what does it know about what it contains?
    • The relational data model is about storing records made up of named, scalar-valued fields, as specified by a schema, or type definition
  • What kind of queries can you do?
    • SQL is a manifestation of the kinds of queries that fall out of relational algebra
Non-Relational Data Models
  • Key-value stores
  • Document stores
  • Column-oriented databases
  • Graph databases
Key-Value Stores
  • A mapping from a key to a value
  • The store doesn’t know anything about the key or value
  • The store doesn’t know anything about the insides of the value
  • Operations
    • Set, get, or delete a key-value pair
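The three operations above can be sketched in a few lines of JavaScript. This is a minimal in-memory illustration, not the API of any particular product; the class name `KeyValueStore` and the sample key are made up for the example. The value is stored as an opaque string to show that only the application can interpret it.

```javascript
// Minimal in-memory key-value store (hypothetical class, for illustration only).
class KeyValueStore {
  constructor() {
    this.data = new Map();
  }
  set(key, value) {
    this.data.set(key, value);
  }
  get(key) {
    return this.data.get(key); // undefined when the key is missing
  }
  delete(key) {
    return this.data.delete(key); // true if the key existed
  }
}

const store = new KeyValueStore();

// The store sees only an opaque string; it cannot query inside the value.
store.set("user:1", JSON.stringify({ LastName: "Flintstone", FirstName: "Fred" }));

// Only the application knows the value is JSON and how to decode it.
const fred = JSON.parse(store.get("user:1"));

store.delete("user:1");
const afterDelete = store.get("user:1");
```

Finding every user whose last name is Flintstone would require the application to fetch and decode each value itself, which is the gap document stores fill.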
Document Stores
  • The store is a container for documents
  • Documents are made up of named fields
  • Fields may or may not have type definitions
    • e.g. XSDs for XML stores, vs. schema-less JSON stores
  • Can create “secondary indexes”
    • These provide the ability to query on any document field(s)
  • Operations:
    • Insert and delete documents
    • Update fields within documents
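To make the idea of a secondary index concrete, here is a small sketch of a document container that can index any named field. It is an illustration under simplifying assumptions: the class `DocumentStore` and its methods are hypothetical, only equality lookups are supported, and update and delete are left out to keep the example short.

```javascript
// Hypothetical document container with optional secondary indexes.
class DocumentStore {
  constructor() {
    this.docs = new Map();    // id -> document
    this.indexes = new Map(); // field name -> Map(field value -> Set of ids)
    this.nextId = 1;
  }

  // Build an index over an existing field and keep it for future inserts.
  ensureIndex(field) {
    const index = new Map();
    for (const [id, doc] of this.docs) this.addToIndex(index, doc[field], id);
    this.indexes.set(field, index);
  }

  addToIndex(index, value, id) {
    if (!index.has(value)) index.set(value, new Set());
    index.get(value).add(id);
  }

  insert(doc) {
    const id = this.nextId++;
    this.docs.set(id, doc);
    // Every existing index must learn about the new document.
    for (const [field, index] of this.indexes) this.addToIndex(index, doc[field], id);
    return id;
  }

  // Use the index when one exists; otherwise scan every document.
  findBy(field, value) {
    const index = this.indexes.get(field);
    if (index) {
      return [...(index.get(value) || [])].map((id) => this.docs.get(id));
    }
    return [...this.docs.values()].filter((doc) => doc[field] === value);
  }
}

const people = new DocumentStore();
people.insert({ LastName: "Flintstone", FirstName: "Fred" });
people.insert({ LastName: "Rubble", FirstName: "Barney" });
people.ensureIndex("LastName");
people.insert({ LastName: "Flintstone", FirstName: "Wilma" });

const flintstones = people.findBy("LastName", "Flintstone"); // indexed lookup
const barneys = people.findBy("FirstName", "Barney");        // full scan
```

Unlike a key-value store, the container here understands field names, which is what makes indexing on them possible.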
Column-Oriented Stores
  • Like a relational store, but flipped around: all data for a column is kept together
  • An index provides a means to get a column value for a record
  • Operations:
    • Get, insert, delete records; updating fields
    • Streaming column data in and out of Hadoop
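The "flipped around" layout can be illustrated by converting a handful of records into one array per column. This is a sketch only; the function names and the sample ages are invented for the example, and real column stores add compression, on-disk layout, and indexing that are not shown here.

```javascript
// Row-oriented sample data (values are made up for illustration).
const rows = [
  { id: 1, name: "Fred", age: 40 },
  { id: 2, name: "Wilma", age: 38 },
  { id: 3, name: "Barney", age: 39 },
];

// Column-oriented layout: one array per column; position i belongs to record i.
function toColumns(records) {
  const columns = {};
  records.forEach((record, i) => {
    for (const [field, value] of Object.entries(record)) {
      if (!columns[field]) columns[field] = new Array(records.length).fill(null);
      columns[field][i] = value;
    }
  });
  return columns;
}

// The record position acts as the index used to reassemble one record.
function recordAt(columns, i) {
  return Object.fromEntries(Object.keys(columns).map((field) => [field, columns[field][i]]));
}

const columns = toColumns(rows);

// Aggregating a single column touches only that column's data.
const totalAge = columns.age.reduce((sum, age) => sum + age, 0);

const second = recordAt(columns, 1);
```

The trade-off is visible even at this scale: summing one column is a single array pass, while rebuilding a whole record means visiting every column.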
Graph Databases
  • Stores vertex-to-vertex edges
  • Operations:
    • Getting and setting edges
  • Sometimes possible to annotate vertices or edges
  • Query languages support finding paths between vertices, subject to various constraints
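A path query over stored edges can be sketched with a breadth-first search. The `Graph` class below is hypothetical and assumes directed, unannotated edges; real graph databases expose far richer query languages, so treat this only as an illustration of "getting and setting edges" plus "finding a path".

```javascript
// Hypothetical in-memory store of directed vertex-to-vertex edges.
class Graph {
  constructor() {
    this.edges = new Map(); // vertex -> Set of vertices it points to
  }

  setEdge(from, to) {
    if (!this.edges.has(from)) this.edges.set(from, new Set());
    this.edges.get(from).add(to);
  }

  getEdges(from) {
    return [...(this.edges.get(from) || [])];
  }

  // Breadth-first search: returns a shortest path as an array of vertices,
  // or null when the goal cannot be reached from the start.
  findPath(start, goal) {
    const previous = new Map([[start, null]]);
    const queue = [start];
    while (queue.length > 0) {
      const vertex = queue.shift();
      if (vertex === goal) {
        const path = [];
        for (let v = goal; v !== null; v = previous.get(v)) path.unshift(v);
        return path;
      }
      for (const next of this.getEdges(vertex)) {
        if (!previous.has(next)) {
          previous.set(next, vertex);
          queue.push(next);
        }
      }
    }
    return null;
  }
}

const g = new Graph();
g.setEdge("a", "b");
g.setEdge("b", "c");
g.setEdge("a", "d");
g.setEdge("d", "c");
g.setEdge("c", "e");

const path = g.findPath("a", "e");    // follows edges forward
const reverse = g.findPath("e", "a"); // no edges lead back, so null
```

A constraint such as "only follow edges of a given type" would become an extra filter inside the loop over `getEdges`.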
Consistency Models
  • Relational databases support transactions
    • Can only see committed changes
    • Commit/abort span multiple changes
    • Read-only transaction flavors
      • Read committed, repeatable read, etc.
  • Classic assumption: “I’m querying the one-and-only database”
  • Scaling reads and writes introduce different problems
Replication - The 1st Breakdown of Consistency
Limitations of a Single Master
  • Replication can provide arbitrary read scalability
    • Subject to coping with read-consistency issues
  • Sooner or later, writing becomes a bottleneck
    • Physical limitations (seek time)
    • Throughput of a single I/O subsystem
Sharding
  • Partition the primary key space via hashing
  • Set up a duplicate system for each shard
  • The write-rate limitation now applies to each shard
  • Joins or aggregation across shards are problematic
  • Can the data be re-sharded on a live system?
  • Can shards be re-balanced on a live system?
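Hash partitioning of a key space can be sketched as follows. The hash function, the shard count of 4, and the helper names are all assumptions chosen for the example; each shard is modelled as a plain `Map` standing in for a separate database instance.

```javascript
// Simple deterministic string hash (a djb2-style variant), used only for illustration.
function hashKey(key) {
  let hash = 5381;
  for (let i = 0; i < key.length; i++) {
    hash = ((hash * 33) ^ key.charCodeAt(i)) >>> 0; // keep as unsigned 32-bit
  }
  return hash;
}

// Map a key to one of shardCount shards.
function shardFor(key, shardCount) {
  return hashKey(key) % shardCount;
}

const shardCount = 4;
const shards = Array.from({ length: shardCount }, () => new Map());

function put(key, value) {
  shards[shardFor(key, shardCount)].set(key, value);
}

function get(key) {
  return shards[shardFor(key, shardCount)].get(key);
}

const keys = ["fred", "wilma", "barney", "betty", "pebbles", "bammbamm"];
for (const key of keys) put(key, { name: key });

// Re-sharding: count how many keys would land on a different shard
// if a fifth shard were added and the same modulo scheme were kept.
const moved = keys.filter((key) => shardFor(key, 4) !== shardFor(key, 5)).length;
```

Each write touches exactly one shard, which is how the write-rate limit gets spread out. The `moved` count hints at why the last two questions on the slide are hard: with plain modulo hashing, changing the shard count can relocate many keys at once.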
Multi-Site Operation
  • Failure of a single-master system’s master
  • A new master can be chosen
  • But what if there’s a network partition?
  • Can the application continue in read-only mode?
Dynamo
  • Now a generic term for multi-master systems
  • Writes can occur to any node
  • The same record can be updated on different nodes by different clients
  • All writes are replicated everywhere
Dynamo – the 2nd breakdown of consistency
  • Collisions can occur
    • Who wins?
  • A collision resolution strategy is required
    • Vector clocks
      http://en.wikipedia.org/wiki/Vector_clock
  • Application access must be aware of this
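A vector clock lets a system tell whether one version of a record descends from another or whether the two were written concurrently, which is the collision case. The sketch below represents a clock as a plain object mapping a node id to a counter; the function names and node ids (`n1`, `n2`) are illustrative assumptions rather than any product's API.

```javascript
// Return a new clock with nodeId's counter advanced by one.
function increment(clock, nodeId) {
  return { ...clock, [nodeId]: (clock[nodeId] || 0) + 1 };
}

// Compare clock a with clock b:
// "before" (a happened before b), "after", "equal", or "concurrent".
function compare(a, b) {
  let aAhead = false;
  let bAhead = false;
  for (const node of new Set([...Object.keys(a), ...Object.keys(b)])) {
    const av = a[node] || 0;
    const bv = b[node] || 0;
    if (av > bv) aAhead = true;
    if (bv > av) bAhead = true;
  }
  if (aAhead && bAhead) return "concurrent";
  if (aAhead) return "after";
  if (bAhead) return "before";
  return "equal";
}

// Combine two clocks by taking the larger counter for every node.
function merge(a, b) {
  const merged = { ...a };
  for (const [node, count] of Object.entries(b)) {
    merged[node] = Math.max(merged[node] || 0, count);
  }
  return merged;
}

const base = increment({}, "n1");     // first write, accepted by node n1
const left = increment(base, "n1");   // one client updates via n1
const right = increment(base, "n2");  // another client updates via n2

const relation = compare(left, right); // neither descends from the other
const resolved = merge(left, right);   // clock for the reconciled version
```

The clocks only detect the collision. Deciding which value wins, or how to combine the two, is the resolution strategy the application still has to supply.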
The Commercial Landscape
Key Client Implementation Concerns
  • Monotonic reads
    • Can my reads go back in time?
  • Read-your-own-writes
    • If I issue a query immediately after an insert or update, will I see my changes?
  • Uninterrupted writes
    • Am I always guaranteed the ability to write?
  • Conflict Resolution
    • Do I need to have a conflict resolution strategy?
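The first two concerns can be illustrated with a toy single-master setup in which a replica lags behind the primary. The sketch shows one possible client-side approach, assumed here for illustration and not prescribed by the slides: the session remembers the highest version it has seen and reads from the replica only when the replica has caught up. All class names are hypothetical.

```javascript
// Toy primary: every write bumps a version number.
class Primary {
  constructor() {
    this.version = 0;
    this.data = new Map();
  }
  write(key, value) {
    this.version++;
    this.data.set(key, value);
    return this.version;
  }
}

// Toy replica: only changes when it explicitly syncs from the primary.
class Replica {
  constructor() {
    this.version = 0;
    this.data = new Map();
  }
  syncFrom(primary) {
    this.data = new Map(primary.data);
    this.version = primary.version;
  }
}

// A session tracks the newest version it has written or read.
class Session {
  constructor(primary, replica) {
    this.primary = primary;
    this.replica = replica;
    this.lastSeenVersion = 0;
  }
  write(key, value) {
    this.lastSeenVersion = this.primary.write(key, value);
  }
  read(key) {
    // Fall back to the primary when the replica is older than what we have seen.
    const caughtUp = this.replica.version >= this.lastSeenVersion;
    const source = caughtUp ? this.replica : this.primary;
    this.lastSeenVersion = Math.max(this.lastSeenVersion, source.version);
    return { value: source.data.get(key), from: caughtUp ? "replica" : "primary" };
  }
}

const primary = new Primary();
const replica = new Replica();
const session = new Session(primary, replica);

session.write("name", "Fred");
const beforeSync = session.read("name"); // replica is stale, so the primary answers
replica.syncFrom(primary);
const afterSync = session.read("name");  // replica has caught up and can answer
```

Without the version check, the first read would have gone to the stale replica and returned nothing, which is exactly the read-your-own-writes failure the slide asks about.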
Using a Single-Master System
  • What does the intermediate agent or system do for…
    • Monotonic reads?
    • Read-your-own-writes?
    • Uninterrupted writes?
    • Conflict Resolution?
Using a Multi-Master System
  • What does the intermediate agent or system do for…
    • Monotonic reads?
    • Read-your-own-writes?
    • Uninterrupted writes?
    • Conflict Resolution?
MongoDB
  • Where MongoDB fits in the non-relational world
  • MongoDB’s architecture and features
  • Some real-world users
MongoDB is a Document Store
  • MongoDB stores JSON objects as BSON
    { LastName: 'Flintstone', FirstName: 'Fred', … }
  • Secondary Indexes
    db.collection.ensureIndex({LastName : 1, FirstName : 1});
  • Simple QBE-like query syntax
    db.collection.find({LastName : 'Flintstone'});
    db.collection.find({LastName : { $gte : 'Flintstone' }});
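To show what "query by example" means without needing a running server, the sketch below evaluates the two query shapes from the slide against an in-memory array. It is a simulation written for this explanation, not how MongoDB itself evaluates queries; the functions `matches` and `find` and the sample documents are hypothetical, and only exact match plus the `$gt`, `$gte`, `$lt`, and `$lte` operators are handled.

```javascript
// Return true when doc satisfies every field condition in the example query.
function matches(doc, query) {
  return Object.entries(query).every(([field, condition]) => {
    const value = doc[field];
    if (condition !== null && typeof condition === "object") {
      // Operator form, e.g. { $gte: "Flintstone" }
      return Object.entries(condition).every(([op, operand]) => {
        switch (op) {
          case "$gte": return value >= operand;
          case "$gt": return value > operand;
          case "$lte": return value <= operand;
          case "$lt": return value < operand;
          default: throw new Error("unsupported operator: " + op);
        }
      });
    }
    // Plain form, e.g. { LastName: "Flintstone" }, means exact equality.
    return value === condition;
  });
}

function find(collection, query) {
  return collection.filter((doc) => matches(doc, query));
}

const collection = [
  { LastName: "Flintstone", FirstName: "Fred" },
  { LastName: "Flintstone", FirstName: "Wilma" },
  { LastName: "Rubble", FirstName: "Barney" },
  { LastName: "Dinosaur", FirstName: "Dino" },
];

const exact = find(collection, { LastName: "Flintstone" });
const atOrAfter = find(collection, { LastName: { $gte: "Flintstone" } });
```

The query is itself a document: its shape mirrors the documents it should match, which is what makes the syntax feel like filling in an example.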
MongoDB – Advanced Queries
  • Geo-spatial queries
    • Create a geo index
    • Find points near a given point, sorted by radial distance
      • Can be planar or spherical
    • Find points within a certain radial distance, within a bounding box, or a polygon
  • Built-in Map-Reduce
    • The caller provides map and reduce functions written in JavaScript
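Both ideas on this slide can be sketched in plain JavaScript. The code below is a simulation under stated assumptions, not the server's implementation: `near` does a planar distance sort with no index, and `mapReduce` is a tiny stand-in runner in which `emit` is passed to the map function as an argument. All function names and sample data are invented for the example.

```javascript
// Planar "near" query: points within maxDistance of center, closest first.
function near(points, center, maxDistance) {
  return points
    .map((p) => ({
      ...p,
      distance: Math.hypot(p.loc[0] - center[0], p.loc[1] - center[1]),
    }))
    .filter((p) => p.distance <= maxDistance)
    .sort((a, b) => a.distance - b.distance);
}

const places = [
  { name: "a", loc: [0, 0] },
  { name: "b", loc: [3, 4] },
  { name: "c", loc: [1, 1] },
  { name: "d", loc: [10, 10] },
];
const nearby = near(places, [0, 0], 6);

// Minimal map-reduce runner: map emits (key, value) pairs for each document,
// then reduce folds all values that share a key into one result.
function mapReduce(docs, map, reduce) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) map.call(doc, emit); // the document is bound to `this`
  return [...groups].map(([key, values]) => ({ _id: key, value: reduce(key, values) }));
}

const people = [
  { LastName: "Flintstone", FirstName: "Fred" },
  { LastName: "Flintstone", FirstName: "Wilma" },
  { LastName: "Rubble", FirstName: "Barney" },
];

// Count people per last name.
const counts = mapReduce(
  people,
  function (emit) { emit(this.LastName, 1); },
  (key, values) => values.reduce((sum, n) => sum + n, 0)
);
```

A geo index exists to avoid the full scan that `near` performs here, and a spherical variant would replace `Math.hypot` with a great-circle distance.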
MongoDB is a Single-Master System
  • A database is served by members of a “replica set”
  • The system elects a primary (master)
  • Failure of the master is detected, and a new master is elected
  • Application writes get an error if there is no quorum to elect a new master
  • Reads continue to be fulfilled
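The behaviour described above can be modelled with a toy replica set. This is a deliberately simplified sketch and not MongoDB's election protocol: the class `ReplicaSet` is hypothetical, the first reachable member is always chosen as primary, and writes are copied to every reachable member immediately.

```javascript
// Toy model of a replica set with majority-based primary election.
class ReplicaSet {
  constructor(memberNames) {
    this.members = memberNames.map((name) => ({ name, up: true, data: new Map() }));
    this.primary = null;
    this.elect();
  }

  // A primary exists only while a strict majority of all members is reachable.
  elect() {
    const reachable = this.members.filter((m) => m.up);
    const hasMajority = reachable.length > this.members.length / 2;
    this.primary = hasMajority ? reachable[0] : null;
  }

  // Mark a member as failed, then re-run the election.
  fail(name) {
    this.members.find((m) => m.name === name).up = false;
    this.elect();
  }

  // Writes need a primary; without one the application gets an error.
  write(key, value) {
    if (!this.primary) throw new Error("no primary: cannot accept writes");
    for (const m of this.members) if (m.up) m.data.set(key, value);
  }

  // Reads can be served by any reachable member, primary or not.
  read(key) {
    const member = this.members.find((m) => m.up);
    return member ? member.data.get(key) : undefined;
  }
}

const rs = new ReplicaSet(["a", "b", "c"]);
rs.write("k", "v1");

rs.fail("a");                       // primary fails; b and c still form a majority
const newPrimary = rs.primary.name;
rs.write("k", "v2");

rs.fail("b");                       // only c is left: no quorum, so no primary
let writeError = null;
try {
  rs.write("k", "v3");
} catch (e) {
  writeError = e.message;
}
const stillReadable = rs.read("k"); // the surviving member keeps serving reads
```

Refusing writes without a majority is what prevents two sides of a network partition from each electing their own master.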
MongoDB Replica Set
MongoDB Supports Sharding
  • A collection can be sharded
  • Each shard is served by its own replica set
  • New shards (each a replica set) can be added at any time
  • Shard key ranges are automatically balanced
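Range-based sharding with automatic balancing can be sketched as a table of key ranges plus a loop that moves ranges from the fullest shard to the emptiest. Everything here is an assumption made for illustration: the range boundaries, the shard names, the threshold of two, and the rule of moving the first range found. It is not the actual balancer algorithm.

```javascript
// Each entry covers shard keys in [min, max) and lives on one shard.
const ranges = [
  { min: "", max: "G", shard: "shardA" },
  { min: "G", max: "N", shard: "shardA" },
  { min: "N", max: "T", shard: "shardA" },
  { min: "T", max: "\uffff", shard: "shardB" },
];

// Route a shard key to the shard that owns its range.
function shardForKey(ranges, key) {
  const range = ranges.find((r) => key >= r.min && key < r.max);
  return range ? range.shard : null;
}

// Move one range from the most loaded shard to the least loaded one.
// Returns false when the shards are already within one range of each other.
function balanceOnce(ranges, shardNames) {
  const counts = Object.fromEntries(shardNames.map((s) => [s, 0]));
  for (const r of ranges) counts[r.shard]++;
  const sorted = [...shardNames].sort((a, b) => counts[b] - counts[a]);
  const most = sorted[0];
  const least = sorted[sorted.length - 1];
  if (counts[most] - counts[least] < 2) return false;
  ranges.find((r) => r.shard === most).shard = least;
  return true;
}

const before = shardForKey(ranges, "Flintstone");

// A new, empty shard joins the cluster; keep balancing until nothing moves.
const shardNames = ["shardA", "shardB", "shardC"];
let moves = 0;
while (balanceOnce(ranges, shardNames)) moves++;

const after = shardForKey(ranges, "Flintstone");
```

Because routing goes through the range table, moving a range only changes one table entry; clients keep using the same shard key and never need to know which shard currently holds it.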
MongoDB – Sharded Deployment
MongoDB Storage Management
  • Data is kept in memory-mapped files
  • Servers should have a lot of memory
  • Files are allocated as needed
  • Documents in a collection are kept on a list using a geographical addressing scheme
  • Indexes (B*-trees) point to documents using geographical addresses
MongoDB Server Management
  • Replica set members are aware of each other
  • A majority of votes is required to elect a new primary
  • Members can be assigned priorities to affect the election
    • e.g., an “invisible” replica can be created with zero priority for backup purposes
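The interaction between votes and priorities can be sketched as a single function. This is a simplified model assumed for illustration, not MongoDB's real election logic: every member has one vote, a strict majority must be reachable, zero-priority members vote but can never win, and the highest priority among the remaining candidates is chosen. The member names and priority values are made up.

```javascript
// Pick a primary from the reachable members, or return null if none can be elected.
function electPrimary(members) {
  const reachable = members.filter((m) => m.reachable);

  // Every member counts as one vote; a strict majority of all members is required.
  if (reachable.length * 2 <= members.length) return null;

  // Zero-priority members still vote, but are never candidates.
  const candidates = reachable.filter((m) => m.priority > 0);
  if (candidates.length === 0) return null;

  // Highest priority wins; on a tie the earlier member is kept.
  return candidates.reduce((best, m) => (m.priority > best.priority ? m : best)).name;
}

const healthy = [
  { name: "db1", priority: 2, reachable: true },
  { name: "db2", priority: 1, reachable: true },
  { name: "backup", priority: 0, reachable: true }, // never becomes primary
];

// Helper for the example: the same set with some members unreachable.
function withDown(members, downNames) {
  return members.map((m) => ({ ...m, reachable: !downNames.includes(m.name) }));
}

const normal = electPrimary(healthy);
const afterDb1Fails = electPrimary(withDown(healthy, ["db1"]));
const onlyBackupLeft = electPrimary(withDown(healthy, ["db1", "db2"]));
const backupAndDb1Down = electPrimary(withDown(healthy, ["db1", "backup"]));
```

Note the second case: the backup member's vote is what lets `db2` reach a majority, even though the backup itself can never be elected.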
MongoDB Access
  • Drivers are available in many languages
  • 10gen supported
    • C, C# (.Net), C++, Erlang, Haskell, Java, JavaScript, Perl, PHP, Python, Ruby, Scala
  • Community supported
    • Clojure, ColdFusion, F#, Go, Groovy, Lua, R
  • http://www.mongodb.org/display/DOCS/Overview+-+Writing+Drivers+and+Tools
MongoDB Availability
  • Source
    • https://github.com/mongodb/mongo
  • Server
    • License: AGPL
    • http://www.mongodb.org/downloads
  • Drivers
    • License: Apache
    • http://www.mongodb.org/display/DOCS/Drivers
MongoDB – Hosted Services
  • http://www.mongodb.org/display/DOCS/Hosting+Center
  • MongoHQ, Mongo Machine, MongoLab
  • RESTful access to collections
MongoDB Support
  • Paid Support
    • http://www.10gen.com/client-portal
    • 10gen Hosted Monitoring
    • Consulting, training
  • Free Support
    • http://groups.google.com/group/mongodb-user
    • http://stackoverflow.com/questions/tagged/mongodb
MongoDB Users
  • http://www.10gen.com/customers
  • http://www.10gen.com/presentations
  • craigslist: http://www.10gen.com/presentation/mongosf2011/craigslist
  • bit.ly: http://blip.tv/mongodb/bit-ly-user-history-auto-sharded-3723147
  • shutterfly: http://www.10gen.com/presentation/mongosv2010/shutterfly
MongoDB:  An Introduction - june-2011
Mini-demo/tutorial
  • http://try.mongodb.org/
