© All rights reserved: Moshe Kaplan | MongoDB for BillRun!
MongoDB for BillRun!
Copyrights © Moshe Kaplan
moshe.kaplan@brightaqua.com
MongoDB
For BillRun!
Moshe Kaplan
Scale Hacker
http://top-performance.blogspot.com
http://blogs.microsoft.co.il/vprnd
It's All About Scale
NOSQL. ANSWER A NEED
Introduction
The Consumer Revolution
http://topyaps.com/wp-content/uploads/2013/03/You-are-the-product.-You-feeling-something.jpg
At a Fraction of the Cost…
http://lifehacker.com/5697167/if-youre-not-paying-for-it-youre-the-product
Transportation
Moovit
The Medical Market Opportunities
MediSafe
Askem
Major Enablers:
Mobile, Cloud and IT Commoditization
The Prime Suspect
Assumptions…
Where did it Fail?
Get an Answer, Fast and Cheap
Where did it Fail?
I Just Want "Class Persistence Storage" and Schema Changes on Demand
Where did it Fail?
Always Be Available, Even with a Stale Answer
Where did it Fail?
Get Me a Fast and Good-Enough Answer
Where did it Fail?
Data Is Too Big and Storage Is $$$, but CPU and Network Cost Even More
http://www.powerbyte.com/Isilon.html
Software Providers
It is all great, but…
I Need to Meet Compliance
http://www.vision7.com/app_system/lib/image/content/PCI_compliance.jpg
It is all great, but…
I Need a Vendor
It is all great, but…
I Need Reporting
http://www.novell.com/communities/node/5851/get-ready-sentinel-61
It is all great, but…
I Need Transactions
http://www.novell.com/communities/node/5851/get-ready-sentinel-61
It is all great, but…
We Need Training for the Data Analysts
db.article.aggregate([
{ $group : {
_id : "$author", // GROUP BY author
docsPerAuthor : { $sum : 1 }, // SUM(1) = N
viewsPerAuthor : { $sum : "$pageViews" } // SUM(pageViews)
}}
]);
NOSQL MARKET
Introduction
When Should I Choose NoSQL?
• Eventually Consistent
• Document Store
• Key Value
http://guyharrison.squarespace.com/blog/tag/nosq
Key Value Store
• insert
• get
• multiget
• remove
• truncate
<Key, Value>
http://wiki.apache.org/cassandra/API
Redis
• Very simple protocol (SMTP-like)
• Amazing performance (60K ops/sec on a single-CPU machine)
• Persistence to disk
• Very little security
Column Family Stores:
Key Value Store (with benefits)
• insert
• get
• multiget
• remove
• truncate
http://wiki.apache.org/cassandra/API
Cassandra
• Simple protocol
• Very Good Performance
• You have indexes (but limited)
• The data model is a pain
• You need to design your data for your queries: "Table per Query"
Document Databases
var mydoc = {
_id: ObjectId("5099803df3f4948bd2f98391"),
name: { first: "Alan", last: "Turing" },
birth: new Date('Jun 23, 1912'),
death: new Date('Jun 07, 1954'),
contribs: [
"Turing machine",
"Turing test",
"Turingery"
],
views : NumberLong(1250000)
}
Database for Software Engineers
Class
Subclass
Document
Subdocument
MapReduce
http://blogs.microsoft.co.il/blogs/vprnd
HELLO. MY NAME IS MONGODB
Introduction
#5 Most Popular DB Engine
http://db-engines.com/en/ranking
Who is Using mongoDB?
Who is Behind mongoDB
Why MongoDB?
What         Why
JSON         End to end
No Schema    "No DBA", just serialize
Write        10K inserts/sec on a virtual machine
Read         Similar to MySQL
HA           10 min to set up a cluster
Sharding     Out of the box
GeoData      Great for that
No Schema    None: no downtime to create new columns
Buzz         The trend is with NoSQL
What Is MongoDB Made Of?
http://www.10gen.com/products/mongodb
Installation: Give Yourself 5 Minutes
• Add to /etc/yum.repos.d/10gen.repo:
[10gen]
name=10gen Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64
gpgcheck=0
enabled=1
• yum -y install mongo-10gen mongo-10gen-server
• The packages:
• mongo-10gen: tools
• mongo-10gen-server: mongod and mongos
The Ubuntu Way
• sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
• echo "deb http://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.0.list
• sudo apt-get -y update
• sudo apt-get install -y mongodb-org
Installation w/ Authentication
• /etc/mongod.conf
• > mongo
• > use admin
> db.createUser(
{
user: "siteUserAdmin",
pwd: "Pss0rdxxx",
roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
} )
Mastering a New Query Language
Connect to the Database
• Connect:
• > mongo
• Show current database:
• > db
• Show databases:
• > show databases;
• Show collections:
• > show collections; or show tables;
Databases Manipulation: Create & Drop
• Change database:
• > use <database>
• Create database:
• Just switch to it and create an object…
• Delete database:
• > use mydb;
• > db.dropDatabase();
Collections Manipulation
• Create Collection
> db.createCollection(collectionName)
Or just insert into it
• Delete Collection
> db.collectionName.drop()
SELECT: No SQL, just ORM…
• Select All
• db.things.find()
• WHERE
• db.posts.find({"comments.email" : "b@c.com"})
• Pattern Matching
• db.posts.find( {"title" : /mongo/i} )
• Sort
• db.posts.find().sort({email : 1, date : -1});
• Limit
• db.posts.find().limit(3)
NoSQL and Data Modeling
What is the Difference
Same Terminology
• Database → Database
• Table → Collection
• Row → Document
A Blog Case Study in MySQL
http://www.slideshare.net/nateabele/building-apps-with-mongodb
as a SW Engineer would like it to be…
http://www.slideshare.net/nateabele/building-apps-with-mongodb
Migration from RDBMS to NoSQL
How to do that?
Data Migration
• Map the table structure
• Export the data and Import It
• Add Indexes
http://igcse-geography-lancaster.wikispaces.com/1.2+MIGRATION
Selected Migration Tool
Usage Details
> Install Ruby
> gem install mongify
… Modify the code to your needs
… Create configuration files
> mongify translation db.config > translation.rb
> mongify process db.config translation.rb
Date Functions
• Year(), Month()… functions are included
• … but only in the JavaScript engine
• Solution: new fields!
• [original field]
• [original field]_[year part]
• [original field]_[month part]
• [original field]_[day part]
• [original field]_[hour part]
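The derived-field approach above can be sketched in plain JavaScript, run on each document before it is inserted. The helper name withDateParts and the sample field created are assumptions made for illustration; UTC getters are used so the result does not depend on the machine's time zone.

```javascript
// Hypothetical helper: copy a document and add <field>_year, _month, _day, _hour
function withDateParts(doc, field) {
  const d = doc[field];
  const out = Object.assign({}, doc);
  out[field + "_year"] = d.getUTCFullYear();
  out[field + "_month"] = d.getUTCMonth() + 1; // getUTCMonth() is zero-based
  out[field + "_day"] = d.getUTCDate();
  out[field + "_hour"] = d.getUTCHours();
  return out;
}

// Illustrative document with a single date field
const line = withDateParts({ created: new Date("2015-03-07T14:30:00Z") }, "created");
console.log(line.created_year, line.created_month, line.created_day, line.created_hour);
```

Queries can then filter or group on the precomputed fields (for example created_month) without calling date functions on the server.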
NO SCHEMA IS A GOOD THING BUT…
Schemaless
Default Values
• No Schema
• No Default Values
• App Challenge
• Timestamps…
No single source of truth
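Since the database will not supply defaults, one possible answer is a single helper in the application that every write path goes through. This is only a sketch: USER_DEFAULTS, applyDefaults and the field names are assumptions, not part of the original deck.

```javascript
// Hypothetical defaults for a "users" document; the field names are illustrative
const USER_DEFAULTS = { status: "A", credits: 0 };

// Fill in missing fields; values already present in doc win over the defaults
function applyDefaults(doc, defaults, now) {
  const out = Object.assign({}, defaults, doc);
  if (out.createdAt === undefined) {
    out.createdAt = now; // timestamp set by the application, not the database
  }
  return out;
}

const user = applyDefaults({ name: "moshe", credits: 5 }, USER_DEFAULTS, new Date(0));
console.log(user);
```

Keeping the defaults in one module gives the application a single source of truth for them.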
Casting and Type Safety
• No Schema
• No …
• App Challenge
Auto Numbers
• Start using _id
{
"_id" : 0,
"health" : 1,
"stateStr" : "PRIMARY",
"uptime" : 59917
}
• Counter tables
• Dedicated database
• 1:1 Mapping
• Counter++ using findAndModify
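The "Counter++ using findAndModify" idea can be sketched as a small builder for the findAndModify argument. The collection name counters, the field seq and the helper name are assumptions for illustration.

```javascript
// Hypothetical helper: build the argument for db.counters.findAndModify()
function nextSequenceSpec(counterName) {
  return {
    query: { _id: counterName },  // one counter document per sequence
    update: { $inc: { seq: 1 } }, // atomic increment on that document
    new: true,                    // return the document after the increment
    upsert: true                  // create the counter on first use
  };
}

const spec = nextSequenceSpec("invoice");
// In the mongo shell: var id = db.counters.findAndModify(spec).seq;
console.log(JSON.stringify(spec));
```

Because the increment and the read happen in one atomic operation on one document, two clients should never receive the same number.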
ORM Solution
Data Analysts
http://www.designersplayground.com/pr/internet-meme-list/data-analyst-2/
Data Analysts
• This is not SQL
• There are no joins
• No perfect tools
Tools: Pentaho, RockMongo, MongoVUE, RoboMongo
No Joins
• Do the join in the application
• Leverage the power of NoSQL
http://www.slideshare.net/nateabele/building-apps-with-mongodb
Limited Resultset
• 16MB maximum document size
• GridFS for larger content
Bottom Line
• Powerful tool
• Embrace the Challenge
• Schema-less limitations: counters, data types
• Tools for Data Scientists
• Data design
Billing Data Model
Design Model
• balances
• bills
• lines
• plans
• queue
• rates
• subscribers
• users
Specific Fields
db.users.find(
{ }, // empty filter: select all documents
{ user_id: 1, status: 1, _id: 0 }
)
1: show; 0: don't show
WHERE
• != "A": { $ne: "A" }
• > 25: { $gt: 25 }
• > 25 AND <= 50: { $gt: 25, $lte: 50 }
• LIKE 'bc%': /^bc/
• < 25 OR >= 50: { $or: [ { field: { $lt: 25 } }, { field: { $gte: 50 } } ] } ($or is a top-level operator, so each branch names the field)
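To make the operator semantics above concrete, here is a tiny in-memory matcher in plain JavaScript. It is an illustration only, not how MongoDB evaluates queries, and it covers just the operators listed on this slide; the function name matches and the sample field age are assumptions.

```javascript
// Comparison operators from the slide, as plain predicates
const OPS = {
  $ne: (v, x) => v !== x,
  $gt: (v, x) => v > x,
  $gte: (v, x) => v >= x,
  $lt: (v, x) => v < x,
  $lte: (v, x) => v <= x
};

// Returns true when doc satisfies every condition in filter
function matches(doc, filter) {
  return Object.keys(filter).every(key => {
    const cond = filter[key];
    if (key === "$or") return cond.some(sub => matches(doc, sub)); // any branch
    if (cond instanceof RegExp) return cond.test(doc[key]);        // LIKE
    if (cond !== null && typeof cond === "object") {
      // several operators on one field are ANDed together
      return Object.keys(cond).every(op => OPS[op](doc[key], cond[op]));
    }
    return doc[key] === cond; // plain equality
  });
}

const people = [{ age: 10 }, { age: 30 }, { age: 50 }, { age: 70 }];
const inRange = people.filter(p => matches(p, { age: { $gt: 25, $lte: 50 } }));
const outside = people.filter(p =>
  matches(p, { $or: [{ age: { $lt: 25 } }, { age: { $gte: 50 } }] }));
console.log(inRange, outside);
```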
Join
• Wrong Place…
• Or Map Reduce
GROUP BY
db.article.aggregate([
{ $group : {
_id : { author : "$author", name : "$name" }, // GROUP BY author, name
docsPerAuthor : { $sum : 1 }, // SUM(1) = N
viewsPerAuthor : { $sum : "$pageViews" } // SUM(pageViews)
}}
]);
GROUP BY
db.Movie.aggregate([
// WHERE
{$match: {SeriesType : "F", MovieID : {$in : arrMovies}}},
// Keep some fields
{$project: {MovieID: "$MovieID", SeriesType: "$SeriesType", Genres: "$Genres"}},
// Genres is an array
{$unwind : "$Genres" },
// Counting and sorting
{$group : { _id : "$Genres" , count : { $sum : 1 } } },
{$sort : { count: -1 }}
]);
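What the $match, $unwind, $group and $sort stages compute can be reproduced on an in-memory array, which may help when reading the pipeline. The sample movies and the function name genreCounts are invented for illustration; this is not how the server executes the pipeline.

```javascript
// Invented sample data shaped like the Movie documents in the pipeline
const movies = [
  { MovieID: 1, SeriesType: "F", Genres: ["Drama", "Comedy"] },
  { MovieID: 2, SeriesType: "F", Genres: ["Drama"] },
  { MovieID: 3, SeriesType: "S", Genres: ["Drama"] }
];
const arrMovies = [1, 2, 3];

function genreCounts(docs, ids) {
  const counts = new Map();
  docs
    .filter(m => m.SeriesType === "F" && ids.includes(m.MovieID)) // $match
    .forEach(m => m.Genres.forEach(g => {                         // $unwind
      counts.set(g, (counts.get(g) || 0) + 1);                    // $group with $sum: 1
    }));
  return Array.from(counts, ([genre, count]) => ({ _id: genre, count: count }))
    .sort((a, b) => b.count - a.count);                           // $sort: { count: -1 }
}

const result = genreCounts(movies, arrMovies);
console.log(result);
```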
Aggregation Framework Operators
Operator    Description
$project    Adds/removes fields
$match      WHERE
$redact     Changes a document based on its content/structure
$limit      First N documents
$skip       Skips N documents
$unwind     Turns an array into multiple documents
$group      Group
$sort       Sort
$geoNear    Geospatial
$out        Writes output to a collection
UPDATE
db.posts.update(
{"comments.email": "b@c.com"},
{$set : {"comments.email": "d@c.com"}} // if comments is an array, use "comments.$.email" to update the matched element
)
SET age = age + 3
db.users.update(
{ status: "A" },
{ $inc: { age: 3 } },
{ multi: true }
)
INSERT
j = { name : "mongo" }
k = { x : 3 }
db.things.insert( j )
db.things.insert( k )
DELETE
db.users.remove(
{ status: "D" }
)
Atomic Transactions: Single Row
• Every operation on a single document is atomic
• Two-phase commit implementation is up to you
Atomic Transactions: $isolated
• Multiple documents at once
db.foo.update(
{ status : "A" , $isolated : 1 },
{ $inc : { count : 1 } },
{ multi: true }
)
• Disclaimers:
• Sharding is not supported
• Not all-or-nothing (no rollback on failure)
Atomic Transactions: findAndModify
t = db.transactions.findAndModify({
query: {
state: "initial"
},
update: {
$set: {
state: "pending"
},
$currentDate: { lastModified: true }
},
new: true
})
Atomic Transactions: Bottom Line
• If it is about complex transactions:
• Simplify the case
• Or consider staying with an RDBMS
Bulk Operations
• Failure and order:
• db.collection.initializeOrderedBulkOp()
• db.collection.initializeUnorderedBulkOp()
• 1000 ops/bulk:
var bulk = db.items.initializeUnorderedBulkOp();
bulk.insert( { item: "abc123", defaultQty: 100, status: "A", points: 100 } );
bulk.insert( { item: "ijk123", defaultQty: 200, status: "A", points: 200 } );
bulk.insert( { item: "mop123", defaultQty: 0, status: "P", points: 0 } );
bulk.execute();
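If the "1000 ops/bulk" note is read as a per-batch limit (an assumption; the slide does not spell it out), the application can split its work into batches before queuing them. The helper name chunk and the sample documents are illustrative.

```javascript
// Split an array into consecutive batches of at most `size` items
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// 2500 illustrative documents, split into batches of 1000
const docs = Array.from({ length: 2500 }, (_, i) => ({ item: "sku" + i }));
const batches = chunk(docs, 1000);
console.log(batches.map(b => b.length));

// In the mongo shell, each batch would then go through its own bulk operation:
// batches.forEach(function (b) {
//   var bulk = db.items.initializeUnorderedBulkOp();
//   b.forEach(function (d) { bulk.insert(d); });
//   bulk.execute();
// });
```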
Project Setup
• Create a new project
• Get the Maven configuration for the MongoDB Java Driver
• http://mongodb.github.io/mongo-java-driver/
Bulk Ops in Java
// Assumes: import java.util.*; import org.bson.Document;
// and that table is a MongoCollection<Document>
List<Document> l = new ArrayList<>();
/**** Insert ****/
// build the documents to insert
for (int i = 1; i < 1000000; ++i) {
Document document = new Document()
.append("name", "Moshe Kaplan")
.append("age", 36 + i)
.append("createdDate", new Date());
l.add(document);
}
// send all documents in one call
table.insertMany(l);
Aggregation Framework in Java
List<String> continentList = Arrays.asList("Africa", "Europe", "Asia");
DBObject match = new BasicDBObject("$match",
new BasicDBObject("continent.name", new BasicDBObject("$in", continentList)));
DBObject projectFields = new BasicDBObject("continent.name", 1);
projectFields.put("area", 1);
projectFields.put("_id", 0);
DBObject project = new BasicDBObject("$project", projectFields );
DBObject groupFields = new BasicDBObject( "_id", "$continent.name");
groupFields.put("average", new BasicDBObject( "$avg", "$area"));
DBObject group = new BasicDBObject("$group", groupFields);
List agList = new ArrayList();
agList.add(match);
agList.add(project);
agList.add(group);
MongoCursor<Document> cursor = countries.aggregate(agList).iterator();
while (cursor.hasNext()) {
System.out.println(cursor.next());
}
Performance Tuning
Make a Change
MONGODB TUNING
The Journal
journalCommitInterval = 300:
Write to disk: 2ms <= t <= 300ms
Default 100ms; increase to 300ms to save resources
[Figure: Memory and Disk; (1) Journal, (2) Data]
RAM Optimization:
dataSize + indexSize < RAM
[Figure: memory layout with OS, Data, Index, and Journal]
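The rule of thumb above can be checked against the numbers reported by db.stats(), which includes dataSize and indexSize in bytes. A minimal sketch; the function name and the sample figures are invented.

```javascript
// True when data plus indexes fit in the given amount of RAM (all values in bytes)
function workingSetFitsInRam(stats, ramBytes) {
  return stats.dataSize + stats.indexSize < ramBytes;
}

const GB = 1024 * 1024 * 1024;
// Invented figures standing in for the output of db.stats()
const sampleStats = { dataSize: 10 * GB, indexSize: 3 * GB };
console.log(workingSetFitsInRam(sampleStats, 16 * GB));
console.log(workingSetFitsInRam(sampleStats, 8 * GB));
```

The check ignores memory needed by the OS and the journal, so some headroom should be left on top of it.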
PROFILING AND SLOW LOG
Profiling Configuration
• Enable:
• mongod --profile=1 --slowms=15
• db.setProfilingLevel([level], [slowms])
• How much:
• 0 (none) → 1 (slow queries only) → 2 (all)
• Default slow threshold: 100ms
• Where:
• system.profile collection in the profiled database
Profiling Results Analysis
• Last 5 operations that took 1ms or more: show profile
• Without commands:
db.system.profile.find( { op: { $ne : 'command' } } ).pretty()
• Specific namespace (database.collection):
db.system.profile.find( { ns : 'mydb.test' } ).pretty()
• Slower than 5ms:
db.system.profile.find( { millis : { $gt : 5 } } ).pretty()
• Between dates:
db.system.profile.find({ts : {
$gt : new ISODate("2012-12-09T03:00:00Z") ,
$lt : new ISODate("2012-12-09T03:40:00Z")
}}).pretty()
Explain
> db.courses.find().explain();
{ "cursor" : "BasicCursor",
"isMultiKey" : false,
"n" : 11, "nscannedObjects" : 11, "nscanned" : 11,
"nscannedObjectsAllPlans" : 11, "nscannedAllPlans" : 11,
"scanAndOrder" : false, "indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {},
"server" : "primary.domain.com:27017"
}
INDEXES
Index Management
• Regular Index
• db.users.createIndex( { user_id: 1 } )
• db.users.ensureIndex( { user_id: 1 } )
• Multiple + DESC Index
• db.users.ensureIndex( { user_id: 1, age: -1 } )
• Sub Document Index
• db.users.ensureIndex( { "address.zipcode": 1 } )
• Unique Index
• db.users.ensureIndex( { "address.zipcode": 1 } , { unique : true } )
• List Indexes
• db.users.getIndexes()
• Drop Indexes
• db.users.dropIndex("indexName")
Known Index Issues
• The bound (range) filter should come last (in the index as well)
• Bitmap indexes do not really work
• You should design your indexes carefully
Dex: The Index Analyzer
• Installation:
• sudo apt-get -y install python-pip
• sudo pip install dex
• Running:
• dex [mongodb_uri] (-f <logfile_path> | -p) [<options>]
• dex -w -p -n "testdb.*" mongodb://127.0.0.1/testdb -f /var/log/mongodb/mongod.log
mtools: Visualize and Analyze Logs
Capped Collections
• Fixed-size collections
• Behave like circular buffers
• High-throughput operations
• Insertion order is guaranteed
db.createCollection("mycoll", {capped: true, size:100000})
db.cappedCollection.find().sort( { $natural: -1 } )
• Use cases:
• Logs
• Cache
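The circular-buffer behavior can be imitated in a few lines of plain JavaScript. This is a simplified model: it caps by document count, whereas the size option above is in bytes, and the class name CappedBuffer is invented.

```javascript
// Simplified model of a capped collection, limited by document count
class CappedBuffer {
  constructor(maxDocs) {
    this.maxDocs = maxDocs;
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.maxDocs) {
      this.docs.shift(); // the oldest document makes room for the newest
    }
  }

  // direction 1: insertion order; direction -1: newest first (like $natural: -1)
  naturalOrder(direction) {
    const copy = this.docs.slice();
    return direction === -1 ? copy.reverse() : copy;
  }
}

const log = new CappedBuffer(3);
[1, 2, 3, 4, 5].forEach(n => log.insert({ n: n }));
console.log(log.naturalOrder(1), log.naturalOrder(-1));
```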
TTL
• Remove old data automatically
• db.log_events.createIndex(
{ "createdAt": 1 }, { expireAfterSeconds: 3600 }
)
• db.log_events.insert( {
"createdAt": new Date('July 22, 2013 14:00:00'),
"logEvent": 2,
"logMessage": "Success!"
} )
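The expiry rule behind a TTL index is simple arithmetic: a document becomes eligible for removal once the indexed date plus expireAfterSeconds has passed. A sketch of that rule in plain JavaScript, with an invented function name; on the server the actual deletion is done by a background task, so removal is not instantaneous.

```javascript
// True when doc[field] + expireAfterSeconds is at or before `now`
function isExpired(doc, field, expireAfterSeconds, now) {
  const expiresAtMs = doc[field].getTime() + expireAfterSeconds * 1000;
  return now.getTime() >= expiresAtMs;
}

const event = { createdAt: new Date("2013-07-22T14:00:00Z"), logEvent: 2 };
const after30Min = new Date("2013-07-22T14:30:00Z");
const after61Min = new Date("2013-07-22T15:01:00Z");
console.log(isExpired(event, "createdAt", 3600, after30Min));
console.log(isExpired(event, "createdAt", 3600, after61Min));
```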
ENVIRONMENT TUNING
OS Tuning: Readahead and Transparent Huge Pages
• # For SSD only
• blockdev --setra 16 /dev/sdb
• blockdev --setra 16 /dev/dm-2
• # For all cluster mongod & mongos
• for i in /sys/kernel/mm/*transparent_hugepage/enabled; do echo never > $i; done
• for i in /sys/kernel/mm/*transparent_hugepage/defrag; do echo never > $i; done
STATS &
SCHEMA DESIGN
Sparse Matrix? I don’t Think so
• mongostat
• > db.stats();
• > db.collectionname.stats();
• Fragmentation if storageSize/size > 2
• > db.collectionname.runCommand("compact")
• Padding (wrong design) if paddingFactor > 2
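The fragmentation rule above can be applied directly to the storageSize and size values returned by the collection stats. A minimal sketch; the function name and the sample figures are invented, and the threshold of 2 comes from the slide.

```javascript
// Ratio of allocated storage to actual data; above 2 suggests fragmentation
function needsCompaction(collStats) {
  return collStats.storageSize / collStats.size > 2;
}

// Invented figures standing in for db.collectionname.stats() output (bytes)
const healthy = { size: 800000, storageSize: 1000000 };
const fragmented = { size: 300000, storageSize: 1000000 };
console.log(needsCompaction(healthy), needsCompaction(fragmented));
```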
High Availability
Going Real Time
(Do Not) Master/Slave
Replica Set
• In mongod.conf:
• # Replication Options
• replSet=myReplSet
• > rs.initiate()
• > rs.conf()
• > rs.add("host:port")
• > rs.reconfig()
Arbiter
• rs.addArb("host:port")
• Also:
• Low Priority
• Hidden
• (Weighted) Voting
Show Status: rs.status();
{ "set" : "myReplSet", "date" : ISODate("2013-02-05T10:23:28Z"),
"myState" : 1,
"members" : [
{
"_id" : 0, "name" : "primary.example.com:27017",
"health" : 1, "state" : 1,
"stateStr" : "PRIMARY", "uptime" : 164545,
"optime" : Timestamp(1359901753000, 1),
"optimeDate" : ISODate("2013-02-03T14:29:13Z"), "self" : true
},
{
"_id" : 1, "name"
…
Replica Set Recovery
• Create a new mongod
• Either install a plain vanilla instance
• Or duplicate an existing mongod (better)
• Connect to the system
• Use the previous machine's IP
• Or change the configuration to remove the old member and add the new one
Sharding and Scale out:
Make a big Change
Map Reduce and Aggregation
Secondary Read Enabling
The Strategy : Sharding
MongoDB Implementation
Step 1: Create a Config ReplicaSet
• mkdir /data/configdb
• mongod --configsvr --dbpath /data/configdb --port 27019
Step 2: Install Mongos
• mongos --configdb config01:27019,config02:27019,config03:27019
Step 3: Add Shards
• Connect to a mongos
• Add Shard
• sh.addShard( "rs1/mongodb0.example.net:27017" )
• sh.addShard( "mongodb0.example.net:27017" )
Step 4: Enable Sharding
• sh.enableSharding("<database>")
Step 5: Shard a Collection
• sh.shardCollection("<database>.<collection>", shard-key-pattern)
• sh.shardCollection("records.people", { "zipcode": 1, "name": 1 } )
• Keys:
• High cardinality to enable splitting
• Use a commonly queried field
• Use compound indexes for sharding
BACKUP AND MONITORING
First Option – Single Server
Logical Backup
• Method: mongodump
• Pros: Low costs
• Cons:
• Downtime: Long
• Duration: Long (slow backup, since logical data needs to be extracted)
• Performance impact: High (slows the disks and may stall heavily used machines)
• Data consistency: Intact
• Differential: Supported
• Sharding: Supported
Physical Backup
• Method: Point-in-time snapshot (using LVM tools) or disk image/copy (using AWS or Azure "external" tools)
• Pros: Low costs
• Cons:
• Downtime: OS and/or infrastructure dependent
• Duration: Short (faster backup, since only data blocks are copied)
• Performance impact: Unknown (depends on OS and/or infrastructure)
• Data consistency: Unknown state
• Differential: Infrastructure dependent
• Sharding: Unsupported
Sharding: a type of database partitioning that separates very large databases into smaller, faster, more easily managed parts called data shards. The word "shard" means a small part of a whole.
SECOND Option – REPLICA SET
Logical Backup Physical Backup
Method mongodump Stop slave and copy its disk
Pros • Downtime: None (backup is performed using Slave server –
Master server is always up);
• Duration: Not significant (backup is performed using Slave
server);
• Performance impact: None (backup is performed using Slave
server – Master server is not impacted);
• Data consistency: Intact;
• Differential: Supported;
• Sharding: Supported;
• Downtime: None
• Duration: Not significant
Cons Very high costs – requires two additional servers. A slave
server of the same type and size as the master server; and a
small arbiter server (used as a secondary verification for
Master server availability tests and “voting”).
• Costs: Requires a dedicated server per replica set
132
Third Option – MongoDB MMS
• Part of MongoDB Enterprise Edition, or available as a cloud service
• Cloud service pricing:
  • $50/month/node
  • $2.5/GB/month for backup
  • A valid go-to-market channel for MongoDB to upsell
• MMS features:
  • Point-in-time recovery
  • Daily snapshots
  • Detailed monitoring
  • Alerts
How to Enable Incremental Backup
• In backup:
  • Use the --oplog flag when running mongodump
  • Every hour, dump the oplog collection in the local database (local.oplog.rs on a replica set)
• In recovery:
  • mongorestore --oplogReplay
  • Use applyOps to apply the hourly oplog dumps
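The idea behind oplog-based incremental backup can be sketched in plain JavaScript: start from a full dump, then replay the logged operations in timestamp order up to the moment you want to recover to. This is a simplified model for illustration only, not MongoDB's implementation. The entry fields (ts, op, ns, o, o2) mirror the shape of oplog entries, but ts is a plain number here, updates are limited to $set, and the namespace billing.lines and all sample values are hypothetical.

```javascript
// Simplified model of oplog replay (illustration only, not MongoDB's code).
// op: "i" = insert, "u" = update, "d" = delete.
function replayOplog(snapshot, oplog, untilTs) {
  // Start from a copy of the full dump, keyed by _id.
  const docs = new Map(snapshot.map((d) => [d._id, { ...d }]));
  // Operations must be applied in timestamp order.
  const ordered = [...oplog].sort((a, b) => a.ts - b.ts);
  for (const entry of ordered) {
    if (entry.ts > untilTs) break; // point-in-time cut-off
    if (entry.op === "i") {
      docs.set(entry.o._id, { ...entry.o });
    } else if (entry.op === "u") {
      // o2 identifies the document, o carries the modification.
      const current = docs.get(entry.o2._id);
      if (current) Object.assign(current, entry.o.$set);
    } else if (entry.op === "d") {
      docs.delete(entry.o._id);
    }
  }
  return [...docs.values()];
}

// Hypothetical full dump and one hour of logged operations.
const fullDump = [{ _id: 1, status: "A" }];
const hourlyOplog = [
  { ts: 101, op: "i", ns: "billing.lines", o: { _id: 2, status: "A" } },
  { ts: 102, op: "u", ns: "billing.lines", o2: { _id: 1 }, o: { $set: { status: "D" } } },
  { ts: 103, op: "d", ns: "billing.lines", o: { _id: 2 } },
];

// Recover the state as of ts 102, just before the delete.
console.log(replayOplog(fullDump, hourlyOplog, 102));
```

Because each hourly dump only holds the operations logged since the previous one, the backups stay small, and a restore needs the last full dump plus every oplog dump taken after it.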
mongostat
mongotop
db.serverStatus()
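As a rough sketch of how the db.serverStatus() output might be used, the helper below picks a few health indicators out of the returned document. The field names read here (uptime, connections, opcounters, mem.resident, storageEngine.name) are part of the serverStatus output; the function name and the sample document are hypothetical, and which indicators matter is an assumption that depends on your workload.

```javascript
// Pick a few health indicators out of a db.serverStatus() document.
// Note: opcounters are cumulative since the mongod process started.
function summarizeServerStatus(status) {
  const conn = status.connections;
  const ops = status.opcounters;
  return {
    uptimeHours: status.uptime / 3600, // uptime is reported in seconds
    connectionUse: conn.current / (conn.current + conn.available),
    writes: ops.insert + ops.update + ops.delete,
    reads: ops.query + ops.getmore,
    residentMB: status.mem.resident, // reported in megabytes
    storageEngine: status.storageEngine ? status.storageEngine.name : "unknown",
  };
}

// Hypothetical sample; in the mongo shell, pass db.serverStatus() instead.
const sampleStatus = {
  uptime: 7200,
  connections: { current: 50, available: 150 },
  opcounters: { insert: 1000, query: 4000, update: 300, delete: 20, getmore: 80, command: 900 },
  mem: { resident: 512 },
  storageEngine: { name: "wiredTiger" },
};

console.log(summarizeServerStatus(sampleStatus));
```

Sampling this summary at a fixed interval and comparing consecutive values gives per-interval rates, which is roughly the view mongostat presents.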
db.stats() and db.collection.stats()
rs.status()
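One common use of rs.status() is checking how far each secondary lags behind the primary, which matters when backups are taken from a secondary. The sketch below assumes the members array with its name, stateStr, and optimeDate fields as returned by rs.status(); the function name, host names, and dates are hypothetical.

```javascript
// Estimate replication lag per secondary from an rs.status() document.
function replicationLagSeconds(rsStatus) {
  const primary = rsStatus.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) return null; // no primary elected, so lag is undefined
  return rsStatus.members
    .filter((m) => m.stateStr === "SECONDARY") // arbiters hold no data
    .map((m) => ({
      name: m.name,
      // Subtracting Date objects yields milliseconds.
      lagSeconds: (primary.optimeDate - m.optimeDate) / 1000,
    }));
}

// Hypothetical sample; in the mongo shell, pass rs.status() instead.
const sampleRsStatus = {
  set: "arp0",
  members: [
    { name: "db1.example.net:27017", stateStr: "PRIMARY", optimeDate: new Date("2015-06-01T10:00:30Z") },
    { name: "db2.example.net:27017", stateStr: "SECONDARY", optimeDate: new Date("2015-06-01T10:00:25Z") },
    { name: "arb.example.net:27017", stateStr: "ARBITER" },
  ],
};

console.log(replicationLagSeconds(sampleRsStatus));
```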
STORAGE ENGINES
MMAPv1
MongoDB 3.0 and WiredTiger
• MongoDB 3.0 supports a new storage engine, WiredTiger:
  • Disk compression
  • Handles write-heavy workloads
  • Document-level locking
  • File per collection
• Server-wide selection:
  • In the YAML configuration file (config.yaml)
  • Or launch with --storageEngine wiredTiger
MongoDB Pluggable Architecture
Engines Comparison
YAML Based Configuration
storage:
  dbPath: "/var/lib/mongodbwt"
  directoryPerDB: true
  engine: "wiredTiger"
  wiredTiger:
    engineConfig:
      cacheSizeGB: 16
      journalCompressor: zlib
      directoryForIndexes: true
    collectionConfig:
      blockCompressor: zlib
    indexConfig:
      prefixCompression: true
systemLog:
  destination: file
  path: "/var/log/mongodb/mongod.log"
  logAppend: true
  timeStampFormat: iso8601-local
processManagement:
  fork: true
  pidFilePath: "/var/run/mongodb.pid"
#security:
#  keyFile: "/etc/mongo.key"
#  authorization: "enabled"
replication:
  replSetName: "arp0"
SECURITY
Providing Permissions
• use admin
db.createUser( {
  user: "siteUserAdmin", pwd: "password",
  roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
} )
• use records
db.createUser( {
  user: "recordsUserAdmin", pwd: "password",
  roles: [ { role: "userAdmin", db: "records" } ]
} )
Roles
read
readWrite
dbAdmin
dbOwner
userAdmin
clusterAdmin, clusterManager, …
backup, restore
readAnyDatabase, readWriteAnyDatabase
root
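As an illustration of how these built-in roles map to databases, the sketch below builds the document that db.createUser() expects. It relies on the fact that the cluster, backup/restore, "AnyDatabase", and root roles are defined on the admin database, while read, readWrite, dbAdmin, dbOwner, and userAdmin apply to the database they are granted on. The helper name buildUserSpec, the user name, and the password are hypothetical; the records database comes from the earlier slide.

```javascript
// Built-in roles that are defined on the admin database.
const ADMIN_ONLY_ROLES = new Set([
  "clusterAdmin", "clusterManager", "clusterMonitor", "hostManager",
  "backup", "restore",
  "readAnyDatabase", "readWriteAnyDatabase",
  "userAdminAnyDatabase", "dbAdminAnyDatabase", "root",
]);
// Built-in roles granted per database.
const DATABASE_ROLES = new Set(["read", "readWrite", "dbAdmin", "dbOwner", "userAdmin"]);

// Build a db.createUser() document (hypothetical helper).
// Role names are case-sensitive: "read" is valid, "Read" is not.
function buildUserSpec(user, pwd, roleNames, dbName) {
  const roles = roleNames.map((role) => {
    if (ADMIN_ONLY_ROLES.has(role)) return { role, db: "admin" };
    if (DATABASE_ROLES.has(role)) return { role, db: dbName };
    throw new Error("Unknown built-in role: " + role);
  });
  return { user, pwd, roles };
}

const backupUser = buildUserSpec("backupOperator", "change-me", ["backup", "readWrite"], "records");
console.log(JSON.stringify(backupUser, null, 2));
// In the mongo shell: use admin, then db.createUser(backupUser)
```

Granting only the roles a user needs, such as backup for the account that runs mongodump, keeps the blast radius small if the credentials leak.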
Granular Actions
use admin
db.createRole(
  {
    role: "manageOpRole",
    privileges: [
      { resource: { cluster: true }, actions: [ "killop", "inprog" ] },
      { resource: { db: "", collection: "" }, actions: [ "killCursors" ] }
    ],
    roles: []
  }
)
Thank You !
Moshe Kaplan
moshe.kaplan@brightaqua.com
054-2291978

MongoDB from Basics to Scale

  • 1. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun!© All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! MongoDBFor BillRun! Copyrights © Moshe Kaplan [email protected]
  • 2. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun!© All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! MongoDB For BillRun! Moshe Kaplan Scale Hacker http://top-performance.blogspot.com http://blogs.microsoft.co.il/vprnd
  • 3. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! It’s all About 3 Scale
  • 4. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! NOSQL. ANSWER A NEED Introduction 4
  • 5. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! 5
  • 6. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! The Consumer Revolution 6 http://topyaps.com/wp-content/uploads/2013/03/You-are-the- product.-You-feeling-something.jpg
  • 7. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! At the fraction of the cost… 7
  • 8. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! 8 http://lifehacker.com/5697167/if-youre-not-paying-for-it- youre-the-product
  • 9. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Transportation 9
  • 10. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Moovit 10
  • 11. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! The Medical Market Opportunities 11
  • 12. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! MediSafe 12
  • 13. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! 13
  • 14. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Askem 14
  • 15. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Major Enablers: Mobile, Cloud and IT Commoditization 15
  • 16. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! The Prime Suspect 16
  • 17. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! 17 Assumptions…
  • 18. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Where did it Fail? Get an Answer, Fast and Cheap
  • 19. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Where did it Fail? I Just Want “Class Persistency Storage” and Changing Schema on Demand
  • 20. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Where did it Fail? Be Always Available, Even w/ an Old Answer
  • 21. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Where did it Fail? Get Me Fast and Good Enough Answer
  • 22. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Where did it Fail? Data is Too Big, and Storage is $$$ But CPU and Network are Even More http://www.powerbyte.com/Isilon.html
  • 23. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Software Providers 23
  • 24. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! It is all great, but… I Need to Meet Compliance http://www.vision7.com/app_system/lib/image/content/PCI_compliance.jpg
  • 25. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! It is all great, but… I Need a Vendor
  • 26. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! It is all great, but… I Need Reporting http://www.novell.com/communities/node/5851/get-ready-sentinel-61
  • 27. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! It is all great, but… I Need Transactions http://www.novell.com/communities/node/5851/get-ready-sentinel-61
  • 28. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! It is all great, but… We Need Training for the Data Analysts db.article.aggregate( { $group : { _id : "$author", docsPerAuthor : { $sum : 1 }, viewsPerAuthor : { $sum : "$pageViews" } }} ); < SUM(pageViews) < SUM(1) = N < GROUP BY author
  • 29. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! NOSQL MARKET Introduction 29
  • 30. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! When Should I Choose NoSQL? • Eventually Consistent • Document Store • Key Value 30 http://guyharrison.squarespace.com/blog/tag/nosq
  • 31. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Key Value Store • insert • get • multiget • remove • truncate 31 <Key, Value> http://wiki.apache.org/cassandra/API
  • 32. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Redis • Very simple protocol (SMTP like) • Amazing Performance (60Kqps ops on 1 CPU machine) • Persistency to disk • Very little security
  • 33. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Column Family Stores: Key Value Store (with benefits) • insert • get • multiget • remove • truncate 33 http://wiki.apache.org/cassandra/API
  • 34. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Cassandra • Simple protocol • Very Good Performance • You have indexes (but limited) • Data Model is a pain • You need to design you data for queries: “Table per Query”
  • 35. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Document Databases var mydoc = { _id: ObjectId("5099803df3f4948bd2f98391"), name: { first: "Alan", last: "Turing" }, birth: new Date('Jun 23, 1912'), death: new Date('Jun 07, 1954'), contribs: [ "Turing machine", "Turing test", "Turingery" ], views : NumberLong(1250000) } 35
  • 36. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Database for Software Engineers Class Subclass Document Subdocument
  • 37. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! 37 MapReduce http://blogs.microsoft.co.il/blogs/vprnd
  • 38. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! HELLO. MY NAME IS MONGODB Introduction 38
  • 39. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! 39 #5 Most Popular DB Engine http://db-engines.com/en/ranking
  • 40. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Who is Using mongoDB?
  • 41. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Who is Behind mongoDB
  • 42. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Why MongoDB? What? Why? JSON End to End No Schema “No DBA”, Just Serialize Write 10K Inserts/sec on virtual machine Read Similar to MySQL HA 10 min to setup a cluster Sharding Out of the Box GeoData Great for that No Schema None: no downtime to create new columns Buzz Trend is with NoSQL 42
  • 43. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! What mongoDB is Made of? 43 http://www.10gen.com/products/mongodb
  • 44. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Installation: Give Yourself 5min • Add to /etc/yum.repos.d/10gen.repo • [10gen] • name=10gen Repository • baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64 • gpgcheck=0 • enabled=1 • yum –y install mongo-10gen mongo-10gen-server • The Packages: • mongo-10gen: tools • mongo-10gen-server: mongod and mongos
  • 45. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! The Ubuntu Way • sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10 echo "deb http://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.0.list sudo apt-get -y update sudo apt-get install -y mongodb-org
  • 46. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Installation w/ Authentication • /etc/mongod.conf • > mongo • use admin db.createUser( { user: "siteUserAdmin", pwd: “Pss0rdxxx", roles: [ { role: "userAdminAnyDatabase", db: "admin" } ] } )
  • 47. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Mastering a New Query Language
  • 48. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Connect to the Database • Connect: • > mongo • Show current database: • >> db • Show Databases • >> show databases; • Show Collections • >> show collections; or show tables;
  • 49. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Databases Manipulation: Create & Drop • Change Database: • >> use <database> • Create Database • Just switch and create an object… • Delete Database • > use mydb; • > db.dropDatabase();
  • 50. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Collections Manipulation • Create Collcation >db.createCollection(collectionName) • Delete Collection > db.collectionName.drop() Or just insert to it
  • 51. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! SELECT: No SQL, just ORM… • Select All • db.things.find() • WHERE • db.posts.find({“comments.email” : ”[email protected]”}) • Pattern Matching • db.posts.find( {“title” : /mongo/i} ) • Sort • db.posts.find().sort({email : 1, date : -1}); • Limit • db.posts.find().limit(3)
  • 52. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! NoSQL and Data Modeling What is the Difference
  • 53. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Database for Software Engineers Class Subclass Document Subdocument
  • 54. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Same Terminology • Database  Database • Table  Collection • Row  Document
  • 55. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! A Blog Case Study in MySQL http://www.slideshare.net/nateabele/building-apps-with-mongodb
  • 56. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! as a SW Engineer would like it to be… http://www.slideshare.net/nateabele/building-apps-with-mongodb
  • 57. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Migration from RDBMS to NoSQL How to do that?
  • 58. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Data Migration • Map the table structure • Export the data and Import It • Add Indexes 58 http://igcse-geography-lancaster.wikispaces.com/1.2+MIGRATION
  • 59. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Selected Migration Tool 59
  • 60. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Usage Details> Install ruby > gem install mongify … Modify the code to your needs … Create configuration files > mongify translation db.config > translation.rb > mongify process db.config translation.rb 60
  • 61. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Date Functions • Year(), Month()… function included • … buy only in the JavaScript engine • Solution: New fields! • [original field] • [original field]_[year part] • [original field]_[month part] • [original field]_[day part] • [original field]_[hour part] 61
  • 62. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! NO SCHEMA IS A GOOD THING BUT… Schemaless 62
  • 63. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Default Values • No Schema • No Default Values • App Challenge • Timestamps… No single source of truth 63
  • 64. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Casting and Type Safety • No Schema • No … • App Challenge 64
  • 65. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Auto Numbers • Start using _id { "_id" : 0, "health" : 1, "stateStr" : "PRIMARY", "uptime" : 59917 } • Counter tables • Dedicated database • 1:1 Mapping • Counter++ using findAndModify 65
  • 66. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! ORM Solution 66
  • 67. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Data Analysts 67 http://www.designersplayground.com/pr/internet-meme-list/data-analyst-2/
  • 68. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Data Analysts • This is not SQL • There are no joins • No perfect tools 68 Pentaho RockMongoMongoVUE RoboMongo
  • 69. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! No Joins • Do in the application • Leverage the power of NoSQL 69 http://www.slideshare.net/nateabele/building-apps-with-mongodb
  • 70. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Limited Resultset 70 • 16MB document size • GridFS
  • 71. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Bottom Line • Powerful tool • Embrace the Challenge • Schema-less limitations: counters, data types • Tools for Data Scientists • Data design 71
  • 72. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Billing Data Model
  • 73. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Design Model • balances • bills • lines • plans • queue • rates • subscribers • users
  • 74. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Mastering a New Query Language
  • 75. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Connect to the Database • Connect: • > mongo • Show current database: • >> db • Show Databases • >> show databases; • Show Collections • >> show collections; or show tables;
  • 76. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Databases Manipulation: Create & Drop • Change Database: • >> use <database> • Create Database • Just switch and create an object… • Delete Database • > use mydb; • > db.dropDatabase();
  • 77. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Collections Manipulation • Create Collcation >db.createCollection(collectionName) • Delete Collection > db.collectionName.drop() Or just insert to it
  • 78. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! SELECT: No SQL, just ORM… • Select All • db.things.find() • WHERE • db.posts.find({“comments.email” : ”[email protected]”}) • Pattern Matching • db.posts.find( {“title” : /mongo/i} ) • Sort • db.posts.find().sort({email : 1, date : -1}); • Limit • db.posts.find().limit(3)
  • 79. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Specific fields Select All db.users.find( { }, { user_id: 1, status: 1, _id: 0 } ) 1: Show; 0: don’t show
  • 80. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! WHERE • != “A” { $ne: "A" } • > 25 { $gt: 25 } • > 25 AND <= 50 { $gt: 25, $lte: 50 } • Like ‘bc%’ /^bc/ • < 25 OR >= 50 { $or : [ { $lt: 25 }, { $gte : 50 } ] }
  • 81. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Join • Wrong Place… • Or Map Reduce
  • 82. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! 82  db.article.aggregate(  { $group : {  _id : { author : "$author“, name : “$name” },  docsPerAuthor : { $sum : 1 },  viewsPerAuthor : { $sum : "$pageViews" }  }}  ); GROUP BY < GROUP BY author, name < SUM(pageViews) < SUM(1) = N
  • 83. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! 83 db.Movie.aggregate([ {$match: {SeriesType : "F", MovieID : {$in : arrMovies}} }, {$project: {MovieID: "$MovieID", SeriesType: "$SeriesType", Genres: "$Genres"} }, {$unwind : "$Genres" }, {$group : { _id : "$Genres" , count : { $sum : 1 } } }, {$sort : { count: -1 }} GROUP BY WHERE Keep some fields Genres is an array Counting and sorting
  • 84. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! Aggregation Framework Operators Operator Description $project Adding/Removing fields $match WHERE $redact Changes document based on Doc content/structure $limit First N documents $skip Skips N docs $unwind Turns array into a multiple documents $group Group $sort Sort $geoNear Geo spatial $out Write Output to collection
  • 85. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! 85 db.posts.update( {“comments.email”: ”[email protected]”}, {$set : {“comments.email”: ”[email protected]”}} } SET age = age + 3 • db.users.update( • { status: "A" } , • { $inc: { age: 3 } }, • { multi: true } • ) UPDATE
  • 86. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! 86 j = { name : "mongo" } k = { x : 3 } db.things.insert( j ) db.things.insert( k ) INSERT
  • 87. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! 87 db.users.remove( { status: "D" } ) DELETE
  • 88. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! 88 Every operation on a document is atomic Two Phase Commit implementation is up to you Atomic Transactions: Single Row
  • 89. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun! 89  Multiple documents at once db.foo.update( { status : "A" , $isolated : 1 }, { $inc : { count : 1 } }, { multi: true } )  Disclaimers: • Sharding is not supported • Not all or nothing (no roll back on failure) Atomic Transactions: $isolated
  • 90. Atomic Transactions: findAndModify
      t = db.transactions.findAndModify({
        query: { state: "initial" },
        update: { $set: { state: "pending" }, $currentDate: { lastModified: true } },
        new: true
      })
  • 91. Atomic Transactions: Bottom Line
      • If you need complex transactions, either simplify the use case or consider staying with an RDBMS
  • 92. Bulk Operations
      • Failure and order:
        • db.collection.initializeOrderedBulkOp()
        • db.collection.initializeUnorderedBulkOp()
      • 1000 ops/bulk:
        var bulk = db.items.initializeUnorderedBulkOp();
        bulk.insert( { item: "abc123", defaultQty: 100, status: "A", points: 100 } );
        bulk.insert( { item: "ijk123", defaultQty: 200, status: "A", points: 200 } );
        bulk.insert( { item: "mop123", defaultQty: 0, status: "P", points: 0 } );
        bulk.execute();
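Given the 1000 ops/bulk figure above, one way to prepare a large load is to split the documents into batches of at most 1000 before queuing them. The following is a minimal sketch in plain JavaScript; the helper `toBatches` and the generated items are assumptions for illustration, and the shell calls appear only in comments.

```javascript
// Hypothetical helper: split an array into batches of at most batchSize items.
function toBatches(docs, batchSize = 1000) {
  const batches = [];
  for (let i = 0; i < docs.length; i += batchSize) {
    batches.push(docs.slice(i, i + batchSize));
  }
  return batches;
}

// Invented sample load of 2500 documents.
const items = Array.from({ length: 2500 }, (_, i) => ({
  item: "item" + i,
  defaultQty: 100,
  status: "A",
  points: i,
}));

const batches = toBatches(items);
const batchSizes = batches.map((b) => b.length);
console.log(batchSizes);

// In the mongo shell, each batch could then be queued and executed, e.g.:
//   var bulk = db.items.initializeUnorderedBulkOp();
//   batch.forEach(function (doc) { bulk.insert(doc); });
//   bulk.execute();
```

An unordered bulk is a reasonable fit when the inserts are independent, since a failure on one document does not stop the rest.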
  • 93. Project Setup
      • Create a new project
      • Get the Maven configuration for the MongoDB Java Driver
        • http://mongodb.github.io/mongo-java-driver/
  • 94. Bulk Ops in Java
      List<Document> documents = new ArrayList<>();
      // Build the documents in memory, then send them in a single insertMany call
      for (int i = 1; i < 1000000; ++i) {
        Document document = new Document()
            .append("name", "Moshe Kaplan")
            .append("age", 36 + i)
            .append("createdDate", new Date());
        documents.add(document);
      }
      // table is assumed to be a MongoCollection<Document>
      table.insertMany(documents);
  • 95. Aggregation Framework in Java
      List<String> continentList = Arrays.asList("Africa", "Europe", "Asia");
      // $match: keep only the listed continents
      BasicDBObject match = new BasicDBObject("$match",
          new BasicDBObject("continent.name", new BasicDBObject("$in", continentList)));
      // $project: keep continent.name and area, drop _id
      BasicDBObject projectFields = new BasicDBObject("continent.name", 1);
      projectFields.put("area", 1);
      projectFields.put("_id", 0);
      BasicDBObject project = new BasicDBObject("$project", projectFields);
      // $group: average area per continent
      BasicDBObject groupFields = new BasicDBObject("_id", "$continent.name");
      groupFields.put("average", new BasicDBObject("$avg", "$area"));
      BasicDBObject group = new BasicDBObject("$group", groupFields);
      List<BasicDBObject> agList = Arrays.asList(match, project, group);
      // countries is assumed to be a MongoCollection<Document>
      MongoCursor<Document> cursor = countries.aggregate(agList).iterator();
      while (cursor.hasNext()) {
        System.out.println(cursor.next());
      }
  • 96. Performance Tuning: Make a Change
  • 97. MONGODB TUNING
  • 98. The Journal
      • journalCommitInterval = 300: write to disk every t, where 2ms <= t <= 300ms
      • Default is 100ms; increase to 300ms to save resources
      • [Diagram: Memory; Disk holding the Journal (1) and the Data files (2)]
  • 99. RAM Optimization: dataSize + indexSize < RAM
      • [Diagram: memory layout showing OS, Data, Index, Journal]
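The rule dataSize + indexSize < RAM can be checked against the `dataSize` and `indexSize` fields that `db.stats()` reports. This is a minimal sketch under stated assumptions: the helper `fitsInRam` is hypothetical, and the sample figures are invented rather than taken from a real server.

```javascript
// Hypothetical helper: compare data + index size against available RAM.
// stats is expected to carry dataSize and indexSize in bytes, as db.stats() does.
function fitsInRam(stats, ramBytes) {
  const workingSet = stats.dataSize + stats.indexSize;
  return { workingSet, fits: workingSet < ramBytes };
}

const GiB = 1024 ** 3;

// Invented figures: 6 GiB of data and 1.5 GiB of indexes.
const sampleStats = { dataSize: 6 * GiB, indexSize: 1.5 * GiB };

const onLargeHost = fitsInRam(sampleStats, 8 * GiB);
const onSmallHost = fitsInRam(sampleStats, 4 * GiB);
console.log(onLargeHost.fits, onSmallHost.fits);
```

In practice the OS and other processes also need memory, so the RAM figure passed in should be what is realistically available to mongod rather than the machine total.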
  • 100. PROFILING AND SLOW LOG
  • 101. Profiling Configuration
      • Enable:
        • mongod --profile=1 --slowms=15
        • db.setProfilingLevel([level], [time])
      • How much:
        • 0 (none), 1 (slow queries only), 2 (all)
        • Default slow threshold: 100ms
      • Where:
        • system.profile collection of the profiled database
  • 102. Profiling Results Analysis
      • Last 5 >1ms: show profile
      • w/o commands: db.system.profile.find( { op: { $ne : 'command' } } ).pretty()
      • Specific collection: db.system.profile.find( { ns : 'mydb.test' } ).pretty()
      • Slower than: db.system.profile.find( { millis : { $gt : 5 } } ).pretty()
      • Between dates: db.system.profile.find( { ts : { $gt : new ISODate("2012-12-09T03:00:00Z"), $lt : new ISODate("2012-12-09T03:40:00Z") } } ).pretty()
  • 103. Explain
      > db.courses.find().explain();
      {
        "cursor" : "BasicCursor",
        "isMultiKey" : false,
        "n" : 11,
        "nscannedObjects" : 11,
        "nscanned" : 11,
        "nscannedObjectsAllPlans" : 11,
        "nscannedAllPlans" : 11,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 0,
        "indexBounds" : {},
        "server" : "primary.domain.com:27017"
      }
  • 104. INDEXES
  • 105. Index Management
      • Regular Index
        • db.users.createIndex( { user_id: 1 } )
        • db.users.ensureIndex( { user_id: 1 } )
      • Multiple + DESC Index
        • db.users.ensureIndex( { user_id: 1, age: -1 } )
      • Sub Document Index (dotted field names must be quoted)
        • db.users.ensureIndex( { "address.zipcode": 1 } )
      • Unique Index
        • db.users.ensureIndex( { "address.zipcode": 1 }, { unique : true } )
      • List Indexes
        • db.users.getIndexes()
      • Drop Indexes
        • db.users.dropIndex("indexName")
  • 106. Known Index Issues
      • The bound (range) filter should come last, both in the query and in the index
      • Bitmap indexes do not really work
      • You should design your indexes carefully
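Reading "bound filter" as a range condition (an assumption on my part), the first point amounts to listing equality fields before range fields in a compound index. Here is a minimal sketch that builds such a key specification; the helper `buildIndexSpec` and the field names are hypothetical.

```javascript
// Hypothetical helper: build a compound index key spec with equality fields
// first and range (bound) fields last. All keys are ascending in this sketch.
function buildIndexSpec(equalityFields, rangeFields) {
  const spec = {};
  for (const field of equalityFields) spec[field] = 1;
  for (const field of rangeFields) spec[field] = 1;
  return spec;
}

// Query shape assumed for illustration:
//   { status: "A", user_id: 42, age: { $gt: 30 } }
const indexSpec = buildIndexSpec(["status", "user_id"], ["age"]);
const indexKeyOrder = Object.keys(indexSpec);
console.log(indexKeyOrder);

// In the mongo shell the spec could then be passed to createIndex, e.g.:
//   db.users.createIndex(indexSpec)
```

JavaScript preserves insertion order for these string keys, which matters here because key order defines the compound index.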
  • 107. Dex: The Index Analyzer
      • Installation:
        • sudo apt-get -y install python-pip
        • sudo pip install dex
      • Running:
        • dex [mongodb_uri] (-f <logfile_path> | -p) [<options>]
        • dex -w -p -n "testdb.*" mongodb://127.0.0.1/testdb -f /var/log/mongodb/mongod.log
  • 108. mtools: Visualize and Analyze Logs
  • 109. Capped Collections
      • Fixed-size collections
      • Behave like circular buffers
      • High-throughput operations
      • Order guarantee
        db.createCollection("mycoll", {capped: true, size: 100000})
        db.cappedCollection.find().sort( { $natural: -1 } )
      • Case studies:
        • Logs
        • Cache
  • 110. TTL
      • Remove old data automatically
      • db.log_events.createIndex( { "createdAt": 1 }, { expireAfterSeconds: 3600 } )
      • db.log_events.insert( { "createdAt": new Date(), "logEvent": 2, "logMessage": "Success!" } )
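With a TTL index on `createdAt` and `expireAfterSeconds: 3600`, a document becomes eligible for removal one hour after its `createdAt` value. The sketch below only illustrates that arithmetic in plain JavaScript; the helper `expiresAt` and the sample timestamp are assumptions, and actual deletion happens in a background task, so it can lag behind the computed time.

```javascript
// Hypothetical helper: earliest time a document becomes eligible for TTL removal.
function expiresAt(createdAt, expireAfterSeconds) {
  return new Date(createdAt.getTime() + expireAfterSeconds * 1000);
}

// Invented sample timestamp.
const createdAt = new Date("2013-07-22T14:00:00Z");
const expiry = expiresAt(createdAt, 3600);
console.log(expiry.toISOString()); // 2013-07-22T15:00:00.000Z
```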
  • 111. ENVIRONMENT TUNING
  • 112. Read-Ahead and Transparent Huge Pages
      • # For SSD only
      • blockdev --setra 16 /dev/sdb
      • blockdev --setra 16 /dev/dm-2
      • # For all cluster mongod & mongos
      • for i in /sys/kernel/mm/*transparent_hugepage/enabled; do echo never > $i; done
      • for i in /sys/kernel/mm/*transparent_hugepage/defrag; do echo never > $i; done
  • 113. STATS & SCHEMA DESIGN
  • 114. Sparse Matrix? I Don't Think So
      • mongostat
      • > db.stats();
      • > db.collectionname.stats();
      • Fragmentation if storageSize/size > 2
        • db.collectionname.runCommand("compact")
      • Padding (wrong design) if paddingFactor > 2
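The two thresholds above can be applied mechanically to the output of `db.collectionname.stats()`. This is a minimal sketch, assuming a stats object that carries `size`, `storageSize` and `paddingFactor`; the helper `diagnoseCollection`, its messages and the sample numbers are hypothetical.

```javascript
// Hypothetical helper: apply the slide's two rules of thumb to a stats object.
function diagnoseCollection(stats) {
  const findings = [];
  // Rule 1: storageSize/size > 2 suggests fragmentation.
  if (stats.size > 0 && stats.storageSize / stats.size > 2) {
    findings.push("fragmented: consider compact");
  }
  // Rule 2: paddingFactor > 2 suggests a schema design problem.
  if (stats.paddingFactor > 2) {
    findings.push("high padding: review schema design");
  }
  return findings;
}

// Invented sample stats objects.
const fragmented = diagnoseCollection({ size: 100, storageSize: 250, paddingFactor: 1 });
const healthy = diagnoseCollection({ size: 100, storageSize: 120, paddingFactor: 1 });
console.log(fragmented, healthy);
```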
  • 115. High Availability: Going Real Time
  • 116. (Do Not) Master/Slave
  • 117. Replica Set
      • In mongo.conf
        • # Replication Options
        • replSet=myReplSet
      • > rs.initiate()
      • > rs.conf()
      • > rs.add("host:port")
      • > rs.reconfig()
  • 118. Arbiter
      • rs.addArb("host:port")
      • Also:
        • Low Priority
        • Hidden
        • (Weighted) Voting
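The member types listed above map to fields of the replica set configuration document (`arbiterOnly`, `hidden`, `priority`). Here is a minimal sketch that assembles such a document in plain JavaScript; the helper `buildReplSetConfig` and the host names are hypothetical, and the result is the kind of object that could be passed to `rs.initiate()` in the shell.

```javascript
// Hypothetical helper: build a replica set configuration document.
// Hidden members are given priority 0, since they cannot become primary.
function buildReplSetConfig(name, members) {
  return {
    _id: name,
    members: members.map((m, i) => {
      const member = { _id: i, host: m.host };
      if (m.arbiter) member.arbiterOnly = true;
      if (m.hidden) {
        member.hidden = true;
        member.priority = 0;
      } else if (m.priority !== undefined) {
        member.priority = m.priority;
      }
      return member;
    }),
  };
}

// Invented host names for illustration.
const replConfig = buildReplSetConfig("myReplSet", [
  { host: "db1.example.net:27017" },
  { host: "db2.example.net:27017", hidden: true },
  { host: "arb.example.net:27017", arbiter: true },
]);
console.log(JSON.stringify(replConfig, null, 2));
```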
  • 119. Show Status: rs.status();
      {
        "set" : "myReplSet",
        "date" : ISODate("2013-02-05T10:23:28Z"),
        "myState" : 1,
        "members" : [
          {
            "_id" : 0, "name" : "primary.example.com:27017",
            "health" : 1, "state" : 1,
            "stateStr" : "PRIMARY", "uptime" : 164545,
            "optime" : Timestamp(1359901753000, 1),
            "optimeDate" : ISODate("2013-02-03T14:29:13Z"), "self" : true
          },
          {
            "_id" : 1, "name"
  • 120. Replica Set Recovery
      • Create a new mongod
        • Either install a plain vanilla instance
        • Or duplicate an existing mongod (better)
      • Connect it to the system
        • Use the previous machine's IP
        • Or change the configuration to remove the old member and add the new one
  • 121. Sharding and Scale Out: Make a Big Change (Map Reduce and Aggregation)
  • 122. Secondary Read Enabling
  • 123. The Strategy: Sharding
  • 124. MongoDB Implementation
  • 125. Step 1: Create a Config ReplicaSet
      • mkdir /data/configdb
      • mongod --configsvr --dbpath /data/configdb --port 27019
  • 126. Step 2: Install Mongos
      • mongos --configdb config01:27019,config02:27019,config03:27019
  • 127. Step 3: Add Shards
      • Connect to a mongos
      • Add a shard (a replica set, or a standalone mongod):
        • sh.addShard( "rs1/mongodb0.example.net:27017" )
        • sh.addShard( "mongodb0.example.net:27017" )
  • 128. Step 4: Enable Sharding
      • sh.enableSharding("<database>")
  • 129. Step 5: Shard a Collection
      • sh.shardCollection("<database>.<collection>", shard-key-pattern)
      • sh.shardCollection("records.people", { "zipcode": 1, "name": 1 } )
      • Keys:
        • High cardinality, to enable chunk splits
        • Use a commonly queried field
        • Use compound indexes for sharding
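To see why a compound key helps with cardinality, one can count the distinct key values in a sample of documents. This is a minimal sketch in plain JavaScript; the helper `keyCardinality` and the sample people are invented for illustration, and a real evaluation would sample the actual collection.

```javascript
// Hypothetical helper: number of distinct values a candidate shard key
// takes over a sample of documents.
function keyCardinality(docs, fields) {
  const seen = new Set(docs.map((d) => JSON.stringify(fields.map((f) => d[f]))));
  return seen.size;
}

// Invented sample of the records.people collection.
const people = [
  { zipcode: "10001", name: "Ann" },
  { zipcode: "10001", name: "Bob" },
  { zipcode: "10002", name: "Ann" },
  { zipcode: "10002", name: "Ann" },
];

const zipOnly = keyCardinality(people, ["zipcode"]);
const zipAndName = keyCardinality(people, ["zipcode", "name"]);
console.log(zipOnly, zipAndName);
```

The compound key distinguishes more documents than zipcode alone, which gives the balancer more places to split chunks.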
  • 130. BACKUP AND MONITORING
  • 131. First Option: Single Server
      • Logical Backup
        • Method: mongodump
        • Pros: Low costs
        • Cons:
          • Downtime: Long
          • Duration: Long (slow backup, since logical data needs to be extracted)
          • Performance impact: High (slows the disks and may stall heavily used machines)
          • Data consistency: Intact
          • Differential: Supported
          • Sharding: Supported
      • Physical Backup
        • Method: Point-in-time snapshot (using LVM tools) or disk image/copy (using AWS or Azure "external" tools)
        • Pros: Low costs
        • Cons:
          • Downtime: Depends on the OS and/or infrastructure
          • Duration: Short (faster backup, since only data blocks are copied)
          • Performance impact: Unknown (depends on the OS and/or infrastructure)
          • Data consistency: Unknown state
          • Differential: Depends on the infrastructure
          • Sharding: Unsupported
      • Sharding is a type of database partitioning that separates very large databases into smaller, faster, more easily managed parts called data shards. The word "shard" means a small part of a whole.
  • 132. Second Option: Replica Set
      • Logical Backup
        • Method: mongodump
        • Pros:
          • Downtime: None (backup is performed using the Slave server; the Master server is always up)
          • Duration: Not significant (backup is performed using the Slave server)
          • Performance impact: None (backup is performed using the Slave server; the Master server is not impacted)
          • Data consistency: Intact
          • Differential: Supported
          • Sharding: Supported
        • Cons: Very high costs. Requires two additional servers: a slave server of the same type and size as the master server, and a small arbiter server (used as a secondary verification for Master server availability tests and "voting").
      • Physical Backup
        • Method: Stop the slave and copy its disk
        • Pros:
          • Downtime: None
          • Duration: Not significant
        • Cons:
          • Costs: Requires a dedicated server per replica set
  • 133. Third Option: MongoDB MMS
      • Part of the MongoDB Enterprise Edition, or available as a Cloud Service
      • The Cloud Service offer:
        • $50/month/node
        • $2.5/GB/month for backup
        • A valid go-to-market upsell path for MongoDB
      • MMS Features
        • Point-in-time recovery
        • Daily snapshots
        • Detailed monitoring
        • Alerts
  • 134. How to Enable Incremental Backup
      • In backup
        • Use the --oplog flag when running mongodump
        • Dump the local.oplog collection every hour
      • In recovery
        • mongorestore --oplogReplay
        • Use applyOps to replay the hourly dumps
  • 135. mongostat
  • 136. mongotop
  • 137. db.serverStatus()
  • 138. db.stats() and db.collection.stats()
  • 139. rs.status()
  • 140. STORAGE ENGINES
  • 141. MMAPv1
  • 142. MongoDB 3.0 and WiredTiger
      • MongoDB version 3.0 supports a new storage engine (WiredTiger):
        • Disk compression
        • Heavy write workloads
        • Document-level locking
        • File per collection
      • Server-wide selection:
        • config.yaml
        • Launch with --storageEngine wiredTiger
  • 143. MongoDB Pluggable Architecture
  • 144. Engines Comparison
  • 145. YAML Based Configuration
      storage:
        dbPath: "/var/lib/mongodbwt"
        directoryPerDB: true
        engine: "wiredTiger"
        wiredTiger:
          engineConfig:
            cacheSizeGB: 16
            journalCompressor: zlib
            directoryForIndexes: true
          collectionConfig:
            blockCompressor: zlib
          indexConfig:
            prefixCompression: true
      systemLog:
        destination: file
        path: "/var/log/mongodb/mongod.log"
        logAppend: true
        timeStampFormat: iso8601-local
      processManagement:
        fork: true
        pidFilePath: "/var/run/mongodb.pid"
      #security:
      #  keyFile: "/etc/mongo.key"
      #  authorization: "enabled"
      replication:
        replSetName: "arp0"
  • 146. SECURITY
  • 147. Providing Permissions
      • use admin
        db.createUser( {
          user: "siteUserAdmin",
          pwd: "password",
          roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
        } )
      • use records
        db.createUser( {
          user: "recordsUserAdmin",
          pwd: "password",
          roles: [ { role: "userAdmin", db: "records" } ]
        } )
  • 148. Roles
      • read
      • readWrite
      • dbAdmin
      • dbOwner
      • userAdmin
      • clusterAdmin, clusterManager, …
      • backup, restore
      • readAnyDatabase, readWriteAnyDatabase
      • root
  • 149. Granular Actions
      use admin
      db.createRole( {
        role: "manageOpRole",
        privileges: [
          { resource: { cluster: true }, actions: [ "killop", "inprog" ] },
          { resource: { db: "", collection: "" }, actions: [ "killCursors" ] }
        ],
        roles: []
      } )
  • 150. Thank You! Moshe Kaplan [email protected] 054-2291978