MongoDB
How to model and extract your data
whoami
Francesco Lo Franco
Software developer
Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
@__kekko
it.linkedin.com/in/francescolofranco/
What is MongoDB?
MongoDB is an open source database that uses a document-oriented data model.
MongoDB Data Model
MongoDB uses a JSON-like
representation of its data
(BSON)
BSON > JSON
● custom types (Date, ObjectId...)
● faster
● lightweight
collections
documents
key-value pairs
MongoDB Data Model example (BLOG POST):
{
  "_id": ObjectId("508d27069cc1ae293b36928d"),
  "title": "This is the title",
  "tags": [
    "chocolate",
    "milk"
  ],
  "created_date": ISODate("2012-10-28T12:41:39.110Z"),
  "author_id": ObjectId("508d280e9cc1ae293b36928e"),
  "comments": [
    {
      "content": "This is the body of comment",
      "author_id": ObjectId("508d34"),
      "tag": "coffee"
    },
    {
      "content": "This is the body of comment",
      "author_id": ObjectId("508d35")
    }
  ]
}
REFERENCING
vs
EMBEDDING
One to few
> db.employee.findOne()
{
  name: 'Kate Monster',
  ssn: '123-456-7890',
  addresses: [
    { street: 'Lombard Street, 26', zip_code: '22545' },
    { street: 'Abbey Road, 99', zip_code: '33568' }
  ]
}
Disadvantages:
- It’s hard to access the embedded
details as stand-alone entities
example:
“Show all addresses with a certain zip code”
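A minimal plain-JavaScript sketch of why that query is awkward. No database is involved: the `employees` array stands in for the `employee` collection, the second employee is made up, and `addressesWithZip` is a hypothetical helper. Every parent document has to be visited and its embedded array unpacked.

```javascript
// In-memory stand-in for the `employee` collection.
// 'Kate Monster' comes from the slide; 'John Doe' is invented for illustration.
const employees = [
  { name: 'Kate Monster', addresses: [
    { street: 'Lombard Street, 26', zip_code: '22545' },
    { street: 'Abbey Road, 99', zip_code: '33568' } ] },
  { name: 'John Doe', addresses: [
    { street: 'Baker Street, 221', zip_code: '22545' } ] }
];

// Addresses are not stand-alone entities: walk every employee,
// unpack its embedded array and keep only the matching entries.
function addressesWithZip(docs, zip) {
  return docs.flatMap(doc =>
    doc.addresses
      .filter(address => address.zip_code === zip)
      .map(address => ({ owner: doc.name, ...address })));
}

console.log(addressesWithZip(employees, '22545'));
```

With referenced addresses the same question would be a plain lookup on one collection.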
Advantages:
- One query to get them all
- embedded + value object =
One to many
> db.parts.findOne()
{
  _id: ObjectId('AAAAF17CD2AAAAAAF17CD2AA'),
  partno: '123-aff-456',
  name: '#4 grommet',
  qty: 94,
  cost: 0.94,
  price: 3.99
}
> db.products.findOne()
{
  name: 'smoke shifter',
  manufacturer: 'Acme Corp',
  catalog_number: 1234,
  parts: [
    ObjectId('AAAAF17CD2AAAAAAF17CD2AA'),
    ObjectId('F17CD2AAAAAAF17CD2AAAAAA'),
    ObjectId('D2AAAAAAF17CD2AAAAAAF17C'),
    // etc
  ]
}
Disadvantages:
- two queries (an application-level join) are needed to
“find all parts that compose a product”
> product = db.products.findOne({
    catalog_number: 1234
  });
> product_parts = db.parts.find({
    _id: { $in : product.parts }
  }).toArray();
(a possible remedy: DENORMALIZATION)
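The two-step lookup can be mimicked without a database. This is only a sketch: the arrays stand in for the `products` and `parts` collections, plain strings replace ObjectId values, and the extra part names are invented.

```javascript
// Hypothetical in-memory stand-ins for the two collections.
const parts = [
  { _id: 'p1', name: '#4 grommet' },
  { _id: 'p2', name: 'left-handed wrench' },
  { _id: 'p3', name: 'unrelated part' }
];
const products = [
  { catalog_number: 1234, name: 'smoke shifter', parts: ['p1', 'p2'] }
];

// First "query": fetch the product (mirrors findOne).
const product = products.find(p => p.catalog_number === 1234);

// Second "query": fetch the referenced parts (mirrors { _id: { $in: ... } }).
const wantedIds = new Set(product.parts);
const productParts = parts.filter(part => wantedIds.has(part._id));

console.log(productParts.map(part => part.name));
```

The join happens in application code, which is the price paid for keeping parts as independent documents.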
Advantages:
- Easy to search and update an individual
referenced document (a single part)
- N-to-N relationships come for free, with no join table
parts: [
  ObjectId('AAAAF17CD2AAAAAAF17CD2AA'),
  ObjectId('F17CD2AAAAAAF17CD2AAAAAA'),
  ObjectId('D2AAAAAAF17CD2AAAAAAF17C')
]
One to squillions
(Logging)
- maximum document size = 16 MB
- the limit can be reached even if the referencing
array contains only ObjectId values
(~ 1,300,000 references)
Parent Referencing
> db.hosts.findOne()
{
  _id: ObjectId('AAAB'),
  name: 'goofy.example.com',
  ipaddr: '127.66.66.66'
}
> db.logmsg.findOne()
{
  time: ISODate("2014-03-28T09:42:41.382Z"),
  message: 'cpu is on fire!',
  host: ObjectId('AAAB')
}
Disadvantages:
- again two queries are needed, for example to
“find the most recent 5K messages for a host”
> host = db.hosts.findOne({
    ipaddr : '127.66.66.66'
  });
> last_5k_msg = db.logmsg.find({
    host: host._id})
  .sort({time : -1})
  .limit(5000)
  .toArray()
(a possible remedy: DENORMALIZATION)
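A small in-memory sketch of the same lookup with parent referencing. The arrays stand in for `hosts` and `logmsg`, string ids replace ObjectId values, the extra log lines are invented, and `recentMessages` is a hypothetical helper.

```javascript
const hosts = [
  { _id: 'h1', name: 'goofy.example.com', ipaddr: '127.66.66.66' }
];
const logmsg = [
  { time: new Date('2014-03-28T09:42:41.382Z'), message: 'cpu is on fire!', host: 'h1' },
  { time: new Date('2014-03-28T09:45:00.000Z'), message: 'cpu is still on fire!', host: 'h1' },
  { time: new Date('2014-03-28T09:44:00.000Z'), message: 'disk full', host: 'h2' }
];

function recentMessages(ipaddr, limit) {
  // Query 1: resolve the host to get its _id.
  const host = hosts.find(h => h.ipaddr === ipaddr);
  // Query 2: match on the parent reference, newest first, then cut.
  return logmsg
    .filter(msg => msg.host === host._id)
    .sort((a, b) => b.time - a.time)
    .slice(0, limit);
}

console.log(recentMessages('127.66.66.66', 5000).map(msg => msg.message));
```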
DENORMALIZATION
NORMALIZATION
To be denormalized
> db.products.findOne()
{
  name: 'smoke shifter',
  manufacturer: 'Acme Corp',
  catalog_number: 1234,
  parts: [
    ObjectId('AAAA'),
    ObjectId('F17C'),
    ObjectId('D2AA'),
    // etc
  ]
}
Denormalized (partial + one side)
> db.products.findOne()
{
  name: 'smoke shifter',
  manufacturer: 'Acme Corp',
  catalog_number: 1234,
  parts: [
    { id: ObjectId('AAAA'), name: 'part1' },
    { id: ObjectId('F17C'), name: 'part2' },
    { id: ObjectId('D2AA'), name: 'part3' },
    // etc
  ]
}
Advantages:
- A single query returns the product together with its part names
Disadvantages:
- Updates become more expensive
- Atomic and isolated updates
cannot be guaranteed
MongoDB
is not
A.C.I.D. compliant
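To make the update cost concrete, here is a hedged in-memory sketch. The two arrays, the second product and the `renamePart` helper are all hypothetical; the point is only that one logical change fans out into several independent writes.

```javascript
// Part names are copied into every product that uses the part.
const partsCollection = [{ _id: 'AAAA', name: 'part1' }];
const productsCollection = [
  { name: 'smoke shifter', parts: [{ id: 'AAAA', name: 'part1' }] },
  { name: 'fog machine', parts: [{ id: 'AAAA', name: 'part1' }] }
];

// Renaming a part takes one write on the part plus one write per product
// holding a copy. The writes are separate, so a crash between them
// would leave stale copies behind.
function renamePart(partId, newName) {
  let writes = 0;
  for (const part of partsCollection) {
    if (part._id === partId) { part.name = newName; writes++; }
  }
  for (const product of productsCollection) {
    const copies = product.parts.filter(p => p.id === partId);
    copies.forEach(p => { p.name = newName; });
    if (copies.length > 0) writes++;
  }
  return writes;
}

console.log(renamePart('AAAA', 'part1-v2')); // number of documents written
```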
MongoDB supports only
single-document
transactions
(multi-document transactions arrived later, in MongoDB 4.0)
So, how can I have an
(almost)
ACID Mongo?
1. Two Phase Commit (A+C)
2. $isolated operator (I)
3. enable journaling (D)
Two Phase Commit (A+C)
If we make a multi-update, a
system failure between the 2
separate updates can lead to an
unrecoverable inconsistency
Create a transaction document
tracking all the needed data
Two Phase Commit Example
Uses a bridge "transaction"
document to retry or roll back
operations left incomplete by a
system failure
TODO: transfer $100 from A to B
Account A:
  total: 1000,
  on_going_transactions: []
Account B:
  total: 500,
  on_going_transactions: []
Transaction document
  from: "A",
  to: "B",
  amount: 100,
  status: "initial",
  datetime: new Date()
Step 1: Update the transaction
  _id: "zzzz",
  from: "A",
  to: "B",
  amount: 100,
  status: "pending",
  datetime: new Date()
Step 2: Update Account A
  $inc total: -100
  $push on_going_transactions:
  {transaction where _id = "zzzz"}
Step 3: Update Account B
  $inc total: +100
  $push on_going_transactions:
  {transaction where _id = "zzzz"}
Step 4: Update the transaction
  _id: "zzzz",
  from: "A",
  to: "B",
  amount: 100,
  status: "applied",
  datetime: new Date()
Step 5: Update Account A
  $pull on_going_transactions:
  {transaction where _id = "zzzz"}
Step 6: Update Account B
  $pull on_going_transactions:
  {transaction where _id = "zzzz"}
Step 7: Update the transaction
  _id: "zzzz",
  from: "A",
  to: "B",
  amount: 100,
  status: "done",
  datetime: new Date()
Two Phase Commit
This pattern emulates
SQL transaction
management, achieving
Atomicity + Consistency
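The seven steps can be sketched in plain JavaScript. This is an in-memory illustration under stated assumptions, not the MongoDB implementation: `accounts` and `transactions` are ordinary objects, `applyTo`, `release` and `transfer` are hypothetical helpers, and only the transaction `_id` is pushed into `on_going_transactions`.

```javascript
// In-memory stand-ins for the accounts and transactions collections.
const accounts = {
  A: { total: 1000, on_going_transactions: [] },
  B: { total: 500, on_going_transactions: [] }
};
const transactions = {};

// Apply the amount only if this transaction is not already recorded on the
// account, so re-running a step after a crash does not apply it twice.
function applyTo(account, txId, delta) {
  if (account.on_going_transactions.includes(txId)) return;
  account.total += delta;
  account.on_going_transactions.push(txId);
}

// Remove the transaction marker once the transfer has been applied.
function release(account, txId) {
  account.on_going_transactions =
    account.on_going_transactions.filter(id => id !== txId);
}

function transfer(txId, from, to, amount) {
  const tx = { _id: txId, from, to, amount, status: 'initial', datetime: new Date() };
  transactions[txId] = tx;
  tx.status = 'pending';                   // step 1
  applyTo(accounts[from], txId, -amount);  // step 2
  applyTo(accounts[to], txId, amount);     // step 3
  tx.status = 'applied';                   // step 4
  release(accounts[from], txId);           // step 5
  release(accounts[to], txId);             // step 6
  tx.status = 'done';                      // step 7
  return tx;
}

transfer('zzzz', 'A', 'B', 100);
console.log(accounts.A.total, accounts.B.total, transactions.zzzz.status); // 900 600 done
```

Because every step is recorded in the transaction's `status`, a recovery job could look for transactions stuck in `pending` or `applied` and resume from the matching step.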
$isolated operator (I)
“You can ensure that no client sees
the changes until the operation
completes or errors out.”
db.car.update(
  { color : "RED" , $isolated : 1 },
  { $inc : { count : 1 } },
  { multi: true }
)
Journaling (D)
Journaling logs all writes
(every 100ms) so they can be
recovered after a system failure
(crash)
After a clean shutdown,
the journal files are
erased
Aggregation Framework
(finally)
def: “Aggregations are
operations that process data
records and return
computed results.”
Aggregation Framework
1) C.R.U.D.
2) single purpose
aggregation operators
3) pipeline
4) map reduce
Aggregation Framework
CRUD Operators:
- insert()
- find() / findOne()
- update()
- remove()
Single purpose aggregation operators (SPAO)
a) count
b) distinct
c) group
count
{ a: 1, b: 0 }
{ a: 1, b: 1 }
{ a: 1, b: 4 }
{ a: 2, b: 2 }
> db.records.count( { a: 1 } )
3
distinct
{ name: "jim", age: 0 }
{ name: "kim", age: 1 }
{ name: "dim", age: 4 }
{ name: "sim", age: 2 }
> db.foe.distinct("age")
[0, 1, 4, 2]
group
{ age: 12, count: 4 }
{ age: 12, count: 2 }
{ age: 14, count: 3 }
{ age: 14, count: 4 }
{ age: 16, count: 6 }
{ age: 18, count: 8 }
db.records.group({
  key: { age: 1 },
  cond: { age: { $lt: 16 } },
  reduce: function(cur, result) {
    result.count += cur.count;
  },
  initial: { count: 0 }
})
Result:
[
  { age: 12, count: 6 },
  { age: 14, count: 7 }
]
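The three single purpose operators can be mimicked in plain JavaScript, which may help to see what each one computes. The arrays reuse the sample documents from the slides; the `group` function is a hypothetical stand-in written for this sketch, not the server's implementation.

```javascript
// count: number of documents matching { a: 1 }
const records = [{ a: 1, b: 0 }, { a: 1, b: 1 }, { a: 1, b: 4 }, { a: 2, b: 2 }];
const countA1 = records.filter(doc => doc.a === 1).length;

// distinct: unique values of one field, in order of first appearance
const foe = [
  { name: 'jim', age: 0 }, { name: 'kim', age: 1 },
  { name: 'dim', age: 4 }, { name: 'sim', age: 2 }
];
const distinctAges = [...new Set(foe.map(doc => doc.age))];

// group: filter with cond, bucket by key, fold each bucket with reduce
const ageRecords = [
  { age: 12, count: 4 }, { age: 12, count: 2 }, { age: 14, count: 3 },
  { age: 14, count: 4 }, { age: 16, count: 6 }, { age: 18, count: 8 }
];
function group(docs, keyField, cond, reduce, initial) {
  const buckets = new Map();
  for (const doc of docs.filter(cond)) {
    const key = doc[keyField];
    if (!buckets.has(key)) buckets.set(key, { [keyField]: key, ...initial });
    reduce(doc, buckets.get(key));
  }
  return [...buckets.values()];
}
const grouped = group(
  ageRecords, 'age',
  doc => doc.age < 16,                              // cond: { age: { $lt: 16 } }
  (cur, result) => { result.count += cur.count; },  // reduce
  { count: 0 }                                      // initial
);

console.log(countA1, distinctAges, grouped);
```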
Pipeline
“Documents enter a multi-stage pipeline that
transforms the documents
into an aggregated result”
Pipeline
initial_doc → $group → result1 → $match → ... → resultN → $project → final
Pipeline Example
> db.logs.findOne()
{
  _id: ObjectId('a23ad345frt4'),
  os: 'android',
  token_id: 'ds2f43s4df',
  at: ISODate("2012-10-28T12:41:39.110Z"),
  event: 'something just happened'
}
“We need logs grouped by os, with a
count of how many occurred on each
single day, sorted by time”
Expected result:
os: 'android',
date: {
  'year': 2012,
  'month': 10,
  'day': 28
},
count: 125
$collection->aggregate(
    array(
        // stage 1: keep the os and split the timestamp into year/month/day
        array('$project' => array(
            'os' => 1,
            'days' => array(
                'year' => array('$year' => '$at'),
                'month' => array('$month' => '$at'),
                'day' => array('$dayOfMonth' => '$at')
            )
        )),
        // stage 2: count the logs for each (os, day) pair
        array('$group' => array(
            '_id' => array(
                'os' => '$os',
                'date' => '$days',
            ),
            'count' => array('$sum' => 1)
        )),
        // stage 3: sort by day, ascending
        array('$sort' => array(
            '_id.date' => 1
        ))
    )
);
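What the three stages do to the documents can be traced with a small JavaScript simulation. It does not call MongoDB: the `logs` array holds invented sample entries, and each stage is re-implemented by hand under the assumption that dates are read in UTC.

```javascript
const logs = [
  { os: 'android', at: new Date('2012-10-28T12:41:39.110Z') },
  { os: 'android', at: new Date('2012-10-28T18:00:00.000Z') },
  { os: 'ios', at: new Date('2012-10-27T08:00:00.000Z') }
];

// $project: keep `os` and derive the calendar day from `at`
const projected = logs.map(doc => ({
  os: doc.os,
  days: {
    year: doc.at.getUTCFullYear(),
    month: doc.at.getUTCMonth() + 1, // JavaScript months start at 0
    day: doc.at.getUTCDate()
  }
}));

// $group: one bucket per (os, day) pair, counting its documents
const buckets = new Map();
for (const doc of projected) {
  const id = { os: doc.os, date: doc.days };
  const key = JSON.stringify(id);
  if (!buckets.has(key)) buckets.set(key, { _id: id, count: 0 });
  buckets.get(key).count += 1;
}

// $sort: ascending by date
const asNumber = d => d.year * 10000 + d.month * 100 + d.day;
const pipelineResult = [...buckets.values()]
  .sort((x, y) => asNumber(x._id.date) - asNumber(y._id.date));

console.log(JSON.stringify(pipelineResult, null, 2));
```

Note that the grouping keys end up under `_id`, so `os` and `date` are nested one level deeper than in the expected result sketched on the slide.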
Pipeline Optimization
…
{ $limit: 100 },
{ $skip: 5 },
{ $limit: 10 },
{ $skip: 2 }
...
Step 1: a $limit that follows a $skip moves before it
and grows by the skipped amount (10 + 5 = 15)
…
{ $limit: 100 },
{ $limit: 15 },
{ $skip: 5 },
{ $skip: 2 }
...
Step 2: adjacent $limit stages collapse to the smaller value,
adjacent $skip stages are added together
…
{ $limit: 15 },
{ $skip: 7 }
...
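That the rewritten pipeline returns the same documents can be checked with a tiny interpreter. This is a sketch, not the server's optimizer: `run` is a hypothetical helper that understands only `$limit` and `$skip`, and the input is simply the numbers 0 to 199.

```javascript
// Minimal interpreter for pipelines made only of $limit and $skip stages.
function run(pipeline, docs) {
  return pipeline.reduce((current, stage) => {
    if ('$limit' in stage) return current.slice(0, stage.$limit);
    if ('$skip' in stage) return current.slice(stage.$skip);
    throw new Error('unsupported stage');
  }, docs);
}

const input = Array.from({ length: 200 }, (_, i) => i);
const original = [{ $limit: 100 }, { $skip: 5 }, { $limit: 10 }, { $skip: 2 }];
const optimized = [{ $limit: 15 }, { $skip: 7 }];

console.log(run(original, input));  // [ 7, 8, 9, 10, 11, 12, 13, 14 ]
console.log(run(optimized, input)); // same documents, two stages instead of four
```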
Map Reduce
“Map reduce is a data
processing paradigm for
condensing large volumes of
data into useful aggregated
results”
Map Reduce Example
> db.orders.find()
{ sku: "01A", qty: 8, total: 88 },
{ sku: "01A", qty: 7, total: 79 },
{ sku: "02B", qty: 9, total: 27 },
{ sku: "03C", qty: 8, total: 24 },
{ sku: "03C", qty: 3, total: 12 }
“Calculate the average price at which we sell
our products, grouped by sku
code, with total quantity and
total income, starting from
1/1/2015”
db.orders.mapReduce(
  mapFunction,
  reduceFunction,
  {
    out: { merge: "reduced_orders" },
    query: {
      date: { $gt: new Date('01/01/2015') }
    },
    finalize: finalizeFunction
  }
)
var mapFunction =
  function() {
    // `this` is the order being processed
    var key = this.sku;
    var value = {
      tot: this.total,
      qty: this.qty
    };
    emit(key, value);
  };
Result:
{ "01A": [{tot: 88, qty: 8}, {tot: 79, qty: 7}] },
{ "02B": {tot: 27, qty: 9} },
{ "03C": [{tot: 24, qty: 8}, {tot: 12, qty: 3}] }
var reduceFunction =
  function(key, values) {
    // sum quantity and income over all values emitted for this key
    var reducedVal = { qty: 0, tot: 0 };
    for (var i = 0; i < values.length; i++) {
      reducedVal.qty += values[i].qty;
      reducedVal.tot += values[i].tot;
    }
    return reducedVal;
  };
Result:
{ "01A": {tot: 167, qty: 15} },
{ "02B": {tot: 27, qty: 9} },
{ "03C": {tot: 36, qty: 11} }
var finalizeFunction =
  function(key, reducedVal) {
    reducedVal.avg =
      reducedVal.tot / reducedVal.qty;
    return reducedVal;
  };
Result:
{ "01A": {tot: 167, qty: 15, avg: 11.13} },
{ "02B": {tot: 27, qty: 9, avg: 3} },
{ "03C": {tot: 36, qty: 11, avg: 3.27} }
> db.reduced_orders.find()
{ _id: "01A", value: {tot: 167, qty: 15, avg: 11.13} },
{ _id: "02B", value: {tot: 27, qty: 9, avg: 3} },
{ _id: "03C", value: {tot: 36, qty: 11, avg: 3.27} }
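The whole map, reduce and finalize flow can be replayed in plain JavaScript. This is a simplified sketch: `mapReduce` here is a hypothetical local function, `emit` is passed to the map function as an argument instead of being a global, and the date filter is left out because the sample orders carry no `date` field.

```javascript
const orders = [
  { sku: '01A', qty: 8, total: 88 },
  { sku: '01A', qty: 7, total: 79 },
  { sku: '02B', qty: 9, total: 27 },
  { sku: '03C', qty: 8, total: 24 },
  { sku: '03C', qty: 3, total: 12 }
];

// Local stand-in for the server-side machinery.
function mapReduce(docs, map, reduce, finalize) {
  const emitted = new Map();
  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  docs.forEach(doc => map.call(doc, emit)); // `this` is the current document
  return [...emitted].map(([key, values]) => {
    // a key with a single emitted value is not passed through reduce
    const reduced = values.length > 1 ? reduce(key, values) : values[0];
    return { _id: key, value: finalize(key, reduced) };
  });
}

const reducedOrders = mapReduce(
  orders,
  function (emit) { emit(this.sku, { tot: this.total, qty: this.qty }); },
  (key, values) => values.reduce(
    (acc, v) => ({ qty: acc.qty + v.qty, tot: acc.tot + v.tot }),
    { qty: 0, tot: 0 }),
  (key, reducedVal) => ({ ...reducedVal, avg: reducedVal.tot / reducedVal.qty })
);

console.log(reducedOrders);
```

The averages come out unrounded here (167 / 15 and 36 / 11), whereas the slides show them rounded to two decimals.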
thanks
References:
➔ http://docs.mongodb.org/manual
➔ http://blog.mongodb.org/post/87200945828/
➔ http://thejackalofjavascript.com/mapreduce-in-mongodb/
Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework

More Related Content

What's hot (19)

PDF
Paytm integration in swift
InnovationM
 
PPT
AJAX
Gouthaman V
 
PDF
JavaCro'15 - GWT integration with Vaadin - Peter Lehto
HUJAK - Hrvatska udruga Java korisnika / Croatian Java User Association
 
PPTX
MongDB Mobile: Bringing the Power of MongoDB to Your Device
Matt Lord
 
PPT
Aug Xml Net Forum Dynamics Integration
MariAnne Woehrle
 
PPTX
APIs, APIs Everywhere!
BIWUG
 
PPT
Ken 20150306 心得分享
LearningTech
 
PDF
Jsonix - Talking to OGC Web Services in JSON
orless
 
PPTX
MongoDB Mobile: Bringing the Power of MongoDB to Your Device
MongoDB
 
PPTX
[MongoDB.local Bengaluru 2018] Using Change Streams to Keep Up With Your Data
MongoDB
 
PPTX
MongoDB.local DC 2018: Ch-Ch-Ch-Ch-Changes: Taking Your MongoDB Stitch Applic...
MongoDB
 
PPTX
13 networking, mobile services, and authentication
WindowsPhoneRocks
 
PDF
Mate
devaraj ns
 
PPT
JSON Rules Language
giurca
 
PPTX
MongoDB.local Atlanta: MongoDB Mobile: Bringing the Power of MongoDB to Your ...
MongoDB
 
PDF
Do something in 5 with gas 9-copy between databases with oauth2
Bruce McPherson
 
PDF
Gdg dev fest hybrid apps your own mini-cordova
Ayman Mahfouz
 
PPTX
WP7 HUB_Consuming Data Services
MICTT Palma
 
PPTX
Android data binding
Sergi Martínez
 
Paytm integration in swift
InnovationM
 
JavaCro'15 - GWT integration with Vaadin - Peter Lehto
HUJAK - Hrvatska udruga Java korisnika / Croatian Java User Association
 
MongDB Mobile: Bringing the Power of MongoDB to Your Device
Matt Lord
 
Aug Xml Net Forum Dynamics Integration
MariAnne Woehrle
 
APIs, APIs Everywhere!
BIWUG
 
Ken 20150306 心得分享
LearningTech
 
Jsonix - Talking to OGC Web Services in JSON
orless
 
MongoDB Mobile: Bringing the Power of MongoDB to Your Device
MongoDB
 
[MongoDB.local Bengaluru 2018] Using Change Streams to Keep Up With Your Data
MongoDB
 
MongoDB.local DC 2018: Ch-Ch-Ch-Ch-Changes: Taking Your MongoDB Stitch Applic...
MongoDB
 
13 networking, mobile services, and authentication
WindowsPhoneRocks
 
JSON Rules Language
giurca
 
MongoDB.local Atlanta: MongoDB Mobile: Bringing the Power of MongoDB to Your ...
MongoDB
 
Do something in 5 with gas 9-copy between databases with oauth2
Bruce McPherson
 
Gdg dev fest hybrid apps your own mini-cordova
Ayman Mahfouz
 
WP7 HUB_Consuming Data Services
MICTT Palma
 
Android data binding
Sergi Martínez
 

Similar to MongoDB - How to model and extract your data (20)

PPTX
MongoDB_ppt.pptx
1AP18CS037ShirishKul
 
PDF
MongoDB.pdf
KuldeepKumar778733
 
ODP
MongoDB - A Document NoSQL Database
Ruben Inoto Soto
 
PDF
Building your first app with MongoDB
Norberto Leite
 
PDF
From SQL to MongoDB
Nuxeo
 
PDF
MongoDB FabLab León
Juan Antonio Roy Couto
 
PDF
MongoDB for Coder Training (Coding Serbia 2013)
Uwe Printz
 
PDF
2012 mongo db_bangalore_roadmap_new
MongoDB
 
PPT
MongoDB Pros and Cons
johnrjenson
 
PPTX
Python mongo db-training-europython-2011
Andreas Jung
 
PDF
MongoDB Meetup
Maxime Beugnet
 
KEY
Mongodb intro
christkv
 
PDF
Using MongoDB and Python
Mike Bright
 
PDF
2016 feb-23 pyugre-py_mongo
Michael Bright
 
PDF
Mongo DB schema design patterns
joergreichert
 
PPT
9. Document Oriented Databases
Fabio Fumarola
 
PPTX
Introduction to MongoDB
Raghunath A
 
PPTX
MongoDB 3.0
Victoria Malaya
 
PDF
Quick overview on mongo db
Eman Mohamed
 
PDF
Mongo db for C# Developers
Simon Elliston Ball
 
MongoDB_ppt.pptx
1AP18CS037ShirishKul
 
MongoDB.pdf
KuldeepKumar778733
 
MongoDB - A Document NoSQL Database
Ruben Inoto Soto
 
Building your first app with MongoDB
Norberto Leite
 
From SQL to MongoDB
Nuxeo
 
MongoDB FabLab León
Juan Antonio Roy Couto
 
MongoDB for Coder Training (Coding Serbia 2013)
Uwe Printz
 
2012 mongo db_bangalore_roadmap_new
MongoDB
 
MongoDB Pros and Cons
johnrjenson
 
Python mongo db-training-europython-2011
Andreas Jung
 
MongoDB Meetup
Maxime Beugnet
 
Mongodb intro
christkv
 
Using MongoDB and Python
Mike Bright
 
2016 feb-23 pyugre-py_mongo
Michael Bright
 
Mongo DB schema design patterns
joergreichert
 
9. Document Oriented Databases
Fabio Fumarola
 
Introduction to MongoDB
Raghunath A
 
MongoDB 3.0
Victoria Malaya
 
Quick overview on mongo db
Eman Mohamed
 
Mongo db for C# Developers
Simon Elliston Ball
 
Ad

Recently uploaded (20)

PDF
SAP Firmaya İade ABAB Kodları - ABAB ile yazılmıl hazır kod örneği
Salih Küçük
 
PPTX
Tally_Basic_Operations_Presentation.pptx
AditiBansal54083
 
PPTX
Foundations of Marketo Engage - Powering Campaigns with Marketo Personalization
bbedford2
 
PDF
MiniTool Partition Wizard Free Crack + Full Free Download 2025
bashirkhan333g
 
PPTX
Homogeneity of Variance Test Options IBM SPSS Statistics Version 31.pptx
Version 1 Analytics
 
PDF
Alexander Marshalov - How to use AI Assistants with your Monitoring system Q2...
VictoriaMetrics
 
PDF
Top Agile Project Management Tools for Teams in 2025
Orangescrum
 
PPTX
Milwaukee Marketo User Group - Summer Road Trip: Mapping and Personalizing Yo...
bbedford2
 
PDF
MiniTool Partition Wizard 12.8 Crack License Key LATEST
hashhshs786
 
PDF
How to Hire AI Developers_ Step-by-Step Guide in 2025.pdf
DianApps Technologies
 
PDF
Automate Cybersecurity Tasks with Python
VICTOR MAESTRE RAMIREZ
 
PDF
Digger Solo: Semantic search and maps for your local files
seanpedersen96
 
PDF
Build It, Buy It, or Already Got It? Make Smarter Martech Decisions
bbedford2
 
PPTX
Home Care Tools: Benefits, features and more
Third Rock Techkno
 
PDF
IDM Crack with Internet Download Manager 6.42 Build 43 with Patch Latest 2025
bashirkhan333g
 
PDF
4K Video Downloader Plus Pro Crack for MacOS New Download 2025
bashirkhan333g
 
PDF
Generic or Specific? Making sensible software design decisions
Bert Jan Schrijver
 
PDF
유니티에서 Burst Compiler+ThreadedJobs+SIMD 적용사례
Seongdae Kim
 
PPTX
Help for Correlations in IBM SPSS Statistics.pptx
Version 1 Analytics
 
PDF
SciPy 2025 - Packaging a Scientific Python Project
Henry Schreiner
 
SAP Firmaya İade ABAB Kodları - ABAB ile yazılmıl hazır kod örneği
Salih Küçük
 
Tally_Basic_Operations_Presentation.pptx
AditiBansal54083
 
Foundations of Marketo Engage - Powering Campaigns with Marketo Personalization
bbedford2
 
MiniTool Partition Wizard Free Crack + Full Free Download 2025
bashirkhan333g
 
Homogeneity of Variance Test Options IBM SPSS Statistics Version 31.pptx
Version 1 Analytics
 
Alexander Marshalov - How to use AI Assistants with your Monitoring system Q2...
VictoriaMetrics
 
Top Agile Project Management Tools for Teams in 2025
Orangescrum
 
Milwaukee Marketo User Group - Summer Road Trip: Mapping and Personalizing Yo...
bbedford2
 
MiniTool Partition Wizard 12.8 Crack License Key LATEST
hashhshs786
 
How to Hire AI Developers_ Step-by-Step Guide in 2025.pdf
DianApps Technologies
 
Automate Cybersecurity Tasks with Python
VICTOR MAESTRE RAMIREZ
 
Digger Solo: Semantic search and maps for your local files
seanpedersen96
 
Build It, Buy It, or Already Got It? Make Smarter Martech Decisions
bbedford2
 
Home Care Tools: Benefits, features and more
Third Rock Techkno
 
IDM Crack with Internet Download Manager 6.42 Build 43 with Patch Latest 2025
bashirkhan333g
 
4K Video Downloader Plus Pro Crack for MacOS New Download 2025
bashirkhan333g
 
Generic or Specific? Making sensible software design decisions
Bert Jan Schrijver
 
유니티에서 Burst Compiler+ThreadedJobs+SIMD 적용사례
Seongdae Kim
 
Help for Correlations in IBM SPSS Statistics.pptx
Version 1 Analytics
 
SciPy 2025 - Packaging a Scientific Python Project
Henry Schreiner
 
Ad

MongoDB - How to model and extract your data

  • 1. MongoDB How to model and extract your data
  • 2. whoami Francesco Lo Franco Software developer Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework @__kekko it.linkedin.com/in/francescolofranco/
  • 3. What is MongoDB? Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 4. MongoDB is an open source database that uses a document-oriented data model. Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 5. MongoDB Data Model Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 6. MongoDB uses a Json-like representation of his data (Bson) Bson > Json ● custom types (Date, ObjectID...) ● faster ● lightweight Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 7. Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework collections documents key-value pairs
  • 8. MongoDB Data Model example (BLOG POST): { "_id": ObjectId("508d27069cc1ae293b36928d"), "title": "This is the title", "tags": [ "chocolate", "milk" ], "created_date": ISODate("2012-10-28T12:41:39.110Z"), "author_id": ObjectId("508d280e9cc1ae293b36928e"), "comments": [ { "content": "This is the body of comment", "author_id": ObjectId("508d34"), "tag": "coffee"}, { "content": "This is the body of comment", "author_id": ObjectId("508d35")} ] }
  • 9. MongoDB Data Model example (BLOG POST): { "_id": ObjectId("508d27069cc1ae293b36928d"), "title": "This is the title", "tags": [ "chocolate", "milk" ], "created_date": ISODate("2012-10-28T12:41:39.110Z"), "author_id": ObjectId("508d280e9cc1ae293b36928e"), "comments": [ { "content": "This is the body of comment", "author_id": ObjectId("508d34"), "tag": "coffee"}, { "content": "This is the body of comment", "author_id": ObjectId("508d35")} ] }
  • 10. MongoDB Data Model example (BLOG POST): { "_id": ObjectId("508d27069cc1ae293b36928d"), "title": "This is the title", "tags": [ "chocolate", "milk" ], "created_date": ISODate("2012-10-28T12:41:39.110Z"), "author_id": ObjectId("508d280e9cc1ae293b36928e"), "comments": [ { "content": "This is the body of comment", "author_id": ObjectId("508d34"), "tag": "coffee"}, { "content": "This is the body of comment", "author_id": ObjectId("508d35")} ] }
  • 11. REFERENCING vs EMBEDDING Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 12. One to few > db.employee.findOne() { name: 'Kate Monster', ssn: '123-456-7890', addresses: [{ street: 'Lombard Street, 26', zip_code: '22545' }, { street: 'Abbey Road, 99', zip_code: '33568' }] } Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 13. Disadvantages: - It’s really hard accessing the embedded details as stand-alone entities example: “Show all addresses with a certain zip code” Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 14. Advantages: - One query to get them all - embedded + value object = Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 15. One to many > db.parts.findOne() { _id: ObjectID('AAAAF17CD2AAAAAAF17CD2'), partno: '123-aff-456', name: '#4 grommet', qty: 94, cost: 0.94, price: 3.99 } Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 16. One to many > db.products.findOne() { name: 'smoke shifter', manufacturer: 'Acme Corp', catalog_number: 1234, parts: [ ObjectID('AAAAF17CD2AAAAAAF17CD2AA'), ObjectID('F17CD2AAAAAAF17CD2AAAAAA'), ObjectID('D2AAAAAAF17CD2AAAAAAF17C'), // etc ] } Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 17. Disadvantages: “find all parts that compose a product” > product = db.products.findOne({ catalog_number: 1234 }); > product_parts = db.parts.find({ _id: { $in : product.parts } } ).toArray() ; Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework DENORMALIZATION
  • 18. Advantages: - Easy to search and update an individual referenced document (a single part) - free N-to-N schema without join table Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework parts: [ ObjectID('AAAAF17CD2AAAAAAF17CD2AA'), ObjectID('F17CD2AAAAAAF17CD2AAAAAA'), ObjectID('D2AAAAAAF17CD2AAAAAAF17C') ]
  • 19. One to squillions (Logging) - document limit size = 16M - can be reached even if the referencing array contains only the objectId field (~ 1,300,000 references) Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 20. Parent Referencing > db.hosts.findOne() { _id: ObjectID('AAAB'), name: 'goofy.example.com', ipaddr: '127.66.66.66' } > db.logmsg.findOne() { time: ISODate("2014-03-28T09:42:41.382Z"), message: 'cpu is on fire!', host: ObjectID('AAAB') } Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 21. Disadvantages: “find the most recent 5K messages for a host” > host = db.hosts.findOne({ ipaddr : '127.66.66.66' }); > last_5k_msg = db.logmsg.find({ host: host._id}) .sort({time : -1}) .limit(5000) .toArray() Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework DENORMALIZATION
  • 22. DENORMALIZATION Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework NORMALIZATION
  • 23. To be denormalized > db.products.findOne() { name: 'smoke shifter', manufacturer: 'Acme Corp', catalog_number: 1234, parts: [ ObjectID('AAAA'), ObjectID('F17C'), ObjectID('D2AA'), // etc ] } Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 24. Denormalized (partial + one side) > db.products.findOne() { name: 'smoke shifter', manufacturer: 'Acme Corp', catalog_number: 1234, parts: [ { id: ObjectID('AAAA'), name: 'part1'}, { id: ObjectID('F17C'), name: 'part2'}, { id: ObjectID('D2AA'), name: 'part3'}, // etc ] } Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 25. Advantages: - Easy query to get product part name Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 26. Disadvantages: - Updates become more expensive - Cannot assure atomic and isolated updates MongoDB it’s not A.C.I.D. compliant Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 27. MongoDB supports only single document level transaction Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 28. So, how can I have an (almost) ACID Mongo? Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 29. 1. Two Phase Commit (A+C) 2. $isolate operator (I) 3. enable journaling (D) Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 30. Two Phase Commit (A+C) If we make a multi-update, a system failure between the 2 separate updates can bring to unrecoverable inconsistency Create a transaction document tracking all the needed data Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 31. Two Phase Commit Example Uses a bridge “transaction” document for retrying/rollback operations not completed due to a system failure Francesco Lo Franco - @__kekko | MongoDB Aggregation Framework
  • 32. Two Phase Commit Example TODO: transfer $100 from A to B. Account A: { total: 1000, on_going_transactions: [] }; Account B: { total: 500, on_going_transactions: [] }
  • 33. Two Phase Commit Example Transaction document: { from: "A", to: "B", amount: 100, status: "initial", datetime: new Date() }
  • 34. Two Phase Commit Example Step 1: Update the transaction: { _id: "zzzz", from: "A", to: "B", amount: 100, status: "pending", datetime: new Date() }
  • 35. Two Phase Commit Example Step 2: Update Account A: decrement total by 100; push the transaction with _id = "zzzz" onto on_going_transactions
  • 36. Two Phase Commit Example Step 3: Update Account B: increment total by 100; push the transaction with _id = "zzzz" onto on_going_transactions
  • 37. Two Phase Commit Example Step 4: Update the transaction: { _id: "zzzz", from: "A", to: "B", amount: 100, status: "applied", datetime: new Date() }
  • 38. Two Phase Commit Example Step 5: Update Account A: pull the transaction with _id = "zzzz" from on_going_transactions
  • 39. Two Phase Commit Example Step 6: Update Account B: pull the transaction with _id = "zzzz" from on_going_transactions
  • 40. Two Phase Commit Example Step 7: Update the transaction: { _id: "zzzz", from: "A", to: "B", amount: 100, status: "done", datetime: new Date() }
  • 41. Two Phase Commit This pattern emulates SQL transaction management, achieving Atomicity + Consistency
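The seven steps above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: the `accounts` and `transactions` objects stand in for MongoDB collections, and the helper names (`applyTo`, `release`, `runTransfer`, `crashAfterStep`) are hypothetical, not part of any MongoDB API. The point it tries to show is why the `status` field and the `on_going_transactions` list make it safe to re-run the procedure after a crash.

```javascript
// In-memory stand-ins for the accounts and transactions collections.
// A real implementation would issue update() calls against MongoDB.
const accounts = {
  A: { total: 1000, on_going_transactions: [] },
  B: { total: 500, on_going_transactions: [] },
};
const transactions = {
  zzzz: { from: "A", to: "B", amount: 100, status: "initial" },
};

// Apply the amount only if this transaction is not already listed on the
// account, so that a retry never applies the same transfer twice.
function applyTo(accountId, txId, delta) {
  const account = accounts[accountId];
  if (!account.on_going_transactions.includes(txId)) {
    account.total += delta;
    account.on_going_transactions.push(txId);
  }
}

// Remove the transaction from the account's pending list (a no-op if absent).
function release(accountId, txId) {
  const account = accounts[accountId];
  account.on_going_transactions = account.on_going_transactions.filter(
    (id) => id !== txId
  );
}

// Runs (or resumes) the seven steps. Every step is guarded by the current
// status, which is what makes a re-run after a failure harmless.
// crashAfterStep is a hypothetical knob that simulates a system failure.
function runTransfer(txId, crashAfterStep = Infinity) {
  const tx = transactions[txId];
  const steps = [
    () => { if (tx.status === "initial") tx.status = "pending"; },              // step 1
    () => { if (tx.status === "pending") applyTo(tx.from, txId, -tx.amount); }, // step 2
    () => { if (tx.status === "pending") applyTo(tx.to, txId, tx.amount); },    // step 3
    () => { if (tx.status === "pending") tx.status = "applied"; },              // step 4
    () => { if (tx.status === "applied") release(tx.from, txId); },             // step 5
    () => { if (tx.status === "applied") release(tx.to, txId); },               // step 6
    () => { if (tx.status === "applied") tx.status = "done"; },                 // step 7
  ];
  steps.forEach((step, index) => {
    if (index >= crashAfterStep) throw new Error("simulated crash");
    step();
  });
}

try {
  runTransfer("zzzz", 2); // fails after step 2: A is debited, B is not yet credited
} catch (err) {
  runTransfer("zzzz");    // recovery: simply run the procedure again
}

console.log(accounts.A.total, accounts.B.total, transactions.zzzz.status);
```

After the simulated crash, Account A has been debited while Account B is untouched, which is exactly the inconsistency the pattern guards against; the second run finishes the remaining steps without debiting A again.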
  • 42. $isolated operator (I) “You can ensure that no client sees the changes until the operation completes or errors out.” db.car.update( { color : "RED" , $isolated : 1 }, { $inc : { count : 1 } }, { multi: true } )
  • 43. Journaling (D) Journaling logs all writes (every 100 ms) so that they can be recovered after a system failure (crash). After a clean shutdown, the journal files are erased.
  • 44. Aggregation Framework (finally) def: “Aggregations are operations that process data records and return computed results.”
  • 45. Aggregation Framework 1) C.R.U.D. 2) single purpose aggregation operators 3) pipeline 4) map reduce
  • 46. Aggregation Framework CRUD Operators: - insert() - find() / findOne() - update() - remove()
  • 47. Aggregation Framework 1) C.R.U.D. 2) single purpose aggregation operators 3) pipeline 4) map reduce
  • 48. SPAO a) count b) distinct c) group
  • 49. count { a: 1, b: 0 } { a: 1, b: 1 } { a: 1, b: 4 } { a: 2, b: 2 } db.records.count( { a: 1 } ) = 3
  • 50. distinct { name: "jim", age: 0 } { name: "kim", age: 1 } { name: "dim", age: 4 } { name: "sim", age: 2 } db.foe.distinct("age") = [0, 1, 4, 2]
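As a rough illustration of what `count` and `distinct` compute, the two slides above can be reproduced in plain JavaScript over in-memory arrays. The `count` and `distinct` functions below are hypothetical stand-ins written for this sketch (they only handle simple equality queries), not the MongoDB implementations.

```javascript
// The documents from the two slides, held in plain arrays.
const records = [
  { a: 1, b: 0 },
  { a: 1, b: 1 },
  { a: 1, b: 4 },
  { a: 2, b: 2 },
];
const foe = [
  { name: "jim", age: 0 },
  { name: "kim", age: 1 },
  { name: "dim", age: 4 },
  { name: "sim", age: 2 },
];

// Counts the documents whose fields equal every field in the query.
function count(docs, query) {
  return docs.filter((doc) =>
    Object.keys(query).every((field) => doc[field] === query[field])
  ).length;
}

// Collects the unique values of one field, in first-seen order.
function distinct(docs, field) {
  return [...new Set(docs.map((doc) => doc[field]))];
}

const countResult = count(records, { a: 1 });
const distinctResult = distinct(foe, "age");
console.log(countResult, distinctResult);
```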
  • 51. group Input documents: { age: 12, count: 4 } { age: 12, count: 2 } { age: 14, count: 3 } { age: 14, count: 4 } { age: 16, count: 6 } { age: 18, count: 8 }
  • 52. group db.records.group({ key: { age: 1 }, cond: { age: { $lt: 16 } }, reduce: function(cur, result) { result.count += cur.count; }, initial: { count: 0 } })
  • 53. group Result: [ { age: 12, count: 6 }, { age: 14, count: 7 } ]
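To make the roles of `key`, `cond`, `reduce` and `initial` more concrete, here is a minimal in-memory sketch of the same grouping. The `group` function below is a hypothetical re-implementation for illustration only, and `cond` is passed as a JavaScript predicate instead of the `{ age: { $lt: 16 } }` query document used on the slide.

```javascript
const records = [
  { age: 12, count: 4 },
  { age: 12, count: 2 },
  { age: 14, count: 3 },
  { age: 14, count: 4 },
  { age: 16, count: 6 },
  { age: 18, count: 8 },
];

// Hypothetical in-memory equivalent of the group operation:
// filter with cond, bucket by the key fields, fold each bucket with reduce.
function group(docs, { key, cond, reduce, initial }) {
  const groups = new Map();
  for (const doc of docs.filter(cond)) {
    // Extract only the fields named in `key` to identify the bucket.
    const keyValues = {};
    for (const field of Object.keys(key)) keyValues[field] = doc[field];
    const id = JSON.stringify(keyValues);
    // Each new bucket starts from a fresh copy of `initial`.
    if (!groups.has(id)) groups.set(id, { ...keyValues, ...initial });
    reduce(doc, groups.get(id));
  }
  return [...groups.values()];
}

const result = group(records, {
  key: { age: 1 },
  cond: (doc) => doc.age < 16, // stands in for { age: { $lt: 16 } }
  reduce: (cur, acc) => { acc.count += cur.count; },
  initial: { count: 0 },
});

console.log(result);
```

The documents with age 16 and 18 are dropped by `cond`, and the remaining counts are summed per age: 4 + 2 for age 12 and 3 + 4 for age 14.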
  • 54. Aggregation Framework 1) C.R.U.D. 2) single purpose aggregation operators 3) pipeline 4) map reduce
  • 55. Pipeline “Documents enter a multi-stage pipeline that transforms the documents into aggregated results”
  • 56. Pipeline initial_doc → $group → result1 → $match → ... → resultN → $project → final
  • 57. Pipeline Example > db.logs.findOne() { _id: ObjectId('a23ad345frt4'), os: 'android', token_id: 'ds2f43s4df', at: ISODate("2012-10-28T12:41:39.110Z"), event: "something just happened" } “We need the logs grouped by os, counted per single-day interval, and sorted by time”
  • 58. Pipeline Example Expected result: { os: 'android', date: { 'year': 2012, 'month': 10, 'day': 28 }, count: 125 }
  • 59. Pipeline Example $collection->aggregate( array( array('$project' => array( 'os' => 1, 'days' => array( 'year' => array('$year' => '$at'), 'month' => array('$month' => '$at'), 'day' => array('$dayOfMonth' => '$at') ) )), ...
  • 60. Pipeline Example ... array( '$group' => array( '_id' => array( 'os' => '$os', 'date' => '$days' ), 'count' => array('$sum' => 1) ) ), ...
  • 61. Pipeline Example ... array( '$sort' => array( '_id.date' => 1 ) ) ) );
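The three stages above ($project, $group, $sort) can be imitated with plain JavaScript functions to see how documents flow from one stage to the next. This is a sketch under stated assumptions: the four log entries are made up for the example, and the `project`, `group` and `sort` functions are hypothetical stand-ins for the corresponding pipeline stages, not MongoDB code.

```javascript
// Made-up sample logs (only the fields the pipeline uses).
const logs = [
  { os: "android", at: new Date("2012-10-28T12:41:39.110Z") },
  { os: "android", at: new Date("2012-10-28T18:05:00.000Z") },
  { os: "ios", at: new Date("2012-10-28T09:00:00.000Z") },
  { os: "android", at: new Date("2012-10-27T23:10:00.000Z") },
];

// $project: keep os and split the timestamp into year / month / day (UTC).
const project = (docs) =>
  docs.map((doc) => ({
    os: doc.os,
    days: {
      year: doc.at.getUTCFullYear(),
      month: doc.at.getUTCMonth() + 1, // JavaScript months are zero-based
      day: doc.at.getUTCDate(),
    },
  }));

// $group: one output document per (os, date) pair, counting its members.
const group = (docs) => {
  const groups = new Map();
  for (const doc of docs) {
    const id = { os: doc.os, date: doc.days };
    const mapKey = JSON.stringify(id);
    if (!groups.has(mapKey)) groups.set(mapKey, { _id: id, count: 0 });
    groups.get(mapKey).count += 1;
  }
  return [...groups.values()];
};

// $sort: ascending by _id.date, compared as year, then month, then day.
const sort = (docs) =>
  [...docs].sort((x, y) => {
    const a = x._id.date;
    const b = y._id.date;
    return a.year - b.year || a.month - b.month || a.day - b.day;
  });

// Feed the output of each stage into the next one, as a pipeline does.
const result = [project, group, sort].reduce((docs, stage) => stage(docs), logs);
console.log(JSON.stringify(result, null, 2));
```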
  • 62. Pipeline Optimization (original pipeline) … { $limit: 100 }, { $skip: 5 }, { $limit: 10 }, { $skip: 2 } ...
  • 63. Pipeline Optimization (stages reordered) … { $limit: 100 }, { $limit: 15 }, { $skip: 5 }, { $skip: 2 } ...
  • 64. Pipeline Optimization (stages coalesced) … { $limit: 15 }, { $skip: 7 } ...
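A quick way to convince yourself that the original and the coalesced pipelines are equivalent is to run both over the same input. The sketch below does this in plain JavaScript; the `run` helper is a hypothetical mini-interpreter that only understands `$limit` and `$skip`, and the 200 numbered "documents" are made up for the check.

```javascript
// 200 made-up documents, represented simply by their position 1..200.
const docs = Array.from({ length: 200 }, (_, i) => i + 1);

// Hypothetical mini-interpreter: applies $limit and $skip stages in order.
function run(pipeline, input) {
  return pipeline.reduce((current, stage) => {
    if ("$limit" in stage) return current.slice(0, stage.$limit);
    if ("$skip" in stage) return current.slice(stage.$skip);
    throw new Error("unsupported stage");
  }, input);
}

const original = run(
  [{ $limit: 100 }, { $skip: 5 }, { $limit: 10 }, { $skip: 2 }],
  docs
);
const optimized = run([{ $limit: 15 }, { $skip: 7 }], docs);

console.log(original);  // documents 8 through 15
console.log(optimized); // the same eight documents
```

Moving a `$limit` in front of a `$skip` means raising the limit by the skipped amount (10 + 5 = 15), adjacent limits collapse to the smaller one (min(100, 15) = 15), and adjacent skips add up (5 + 2 = 7).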
  • 65. Aggregation Framework 1) C.R.U.D. 2) single purpose aggregation operators 3) pipeline 4) map reduce
  • 66. Map Reduce “Map reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results”
  • 67. Map Reduce Example > db.orders.find() { sku: "01A", qty: 8, total: 88 }, { sku: "01A", qty: 7, total: 79 }, { sku: "02B", qty: 9, total: 27 }, { sku: "03C", qty: 8, total: 24 }, { sku: "03C", qty: 3, total: 12 }
  • 68. Map Reduce Example “Calculate the average price at which we sell our products, grouped by SKU code, with total quantity and total income, starting from 1/1/2015”
  • 69. Map Reduce Example db.orders.mapReduce( mapFunction, reduceFunction, { out: { merge: "reduced_orders" }, query: { date: { $gt: new Date('01/01/2015') } }, finalize: finalizeFunction } )
  • 70. Map Reduce Example var mapFunction = function() { var key = this.sku; var value = { tot: this.total, qty: this.qty }; emit(key, value); }; Result (values emitted per key): { 01A: [{tot: 88, qty: 8}, {tot: 79, qty: 7}] }, { 02B: {tot: 27, qty: 9} }, { 03C: [{tot: 24, qty: 8}, {tot: 12, qty: 3}] }
  • 71. Map Reduce Example db.orders.mapReduce( mapFunction, reduceFunction, { out: { merge: "reduced_orders" }, query: { date: { $gt: new Date('01/01/2015') } }, finalize: finalizeFunction } )
  • 72. Map Reduce Example var reduceFunction = function(key, values) { var reducedVal = { qty: 0, tot: 0 }; for (var i = 0; i < values.length; i++) { reducedVal.qty += values[i].qty; reducedVal.tot += values[i].tot; } return reducedVal; };
  • 73. Map Reduce Example Result: { 01A: {tot: 167, qty: 15} }, { 02B: {tot: 27, qty: 9} }, { 03C: {tot: 36, qty: 11} }
  • 74. Map Reduce Example db.orders.mapReduce( mapFunction, reduceFunction, { out: { merge: "reduced_orders" }, query: { date: { $gt: new Date('01/01/2015') } }, finalize: finalizeFunction } )
  • 75. Map Reduce Example var finalizeFunction = function(key, reducedVal) { reducedVal.avg = reducedVal.tot/reducedVal.qty; return reducedVal; };
  • 76. Map Reduce Example Result: {01A: {tot: 167, qty: 15, avg: 11.13} }, {02B: {tot: 27, qty: 9, avg: 3} }, {03C: {tot: 36, qty: 11, avg: 3.27} }
  • 77. Map Reduce Example db.orders.mapReduce( mapFunction, reduceFunction, { out: { merge: "reduced_orders" }, query: { date: { $gt: new Date('01/01/2015') } }, finalize: finalizeFunction } )
  • 78. Map Reduce Example > db.reduced_orders.find() {01A: {tot: 167, qty: 15, avg: 11.13} }, {02B: {tot: 27, qty: 9, avg: 3} }, {03C: {tot: 36, qty: 11, avg: 3.27} }
  • 79. Thanks
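The map, reduce and finalize phases walked through above can be tied together in one small in-memory sketch. It is an illustration only: the `mapReduce` harness below is hypothetical, it passes `emit` as an argument (in the mongo shell `emit` is provided by the server), and it leaves out the `query` and `out` options because the sample orders carry no date field.

```javascript
// The orders from the example slide.
const orders = [
  { sku: "01A", qty: 8, total: 88 },
  { sku: "01A", qty: 7, total: 79 },
  { sku: "02B", qty: 9, total: 27 },
  { sku: "03C", qty: 8, total: 24 },
  { sku: "03C", qty: 3, total: 12 },
];

// Map: emit one { tot, qty } value per order, keyed by sku.
// `this` is the current document, as in the mongo shell.
const mapFunction = function (emit) {
  emit(this.sku, { tot: this.total, qty: this.qty });
};

// Reduce: sum the quantities and totals emitted for one key.
const reduceFunction = function (key, values) {
  const reducedVal = { qty: 0, tot: 0 };
  for (let i = 0; i < values.length; i++) {
    reducedVal.qty += values[i].qty;
    reducedVal.tot += values[i].tot;
  }
  return reducedVal;
};

// Finalize: derive the average price from the reduced value.
const finalizeFunction = function (key, reducedVal) {
  reducedVal.avg = reducedVal.tot / reducedVal.qty;
  return reducedVal;
};

// Hypothetical harness: collect emitted values per key, reduce, finalize.
function mapReduce(docs, map, reduce, finalize) {
  const emitted = new Map();
  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  docs.forEach((doc) => map.call(doc, emit));

  const output = {};
  for (const [key, values] of emitted) {
    // A key with a single emitted value skips the reduce call.
    const reduced = values.length > 1 ? reduce(key, values) : values[0];
    output[key] = finalize(key, reduced);
  }
  return output;
}

const reducedOrders = mapReduce(orders, mapFunction, reduceFunction, finalizeFunction);
console.log(reducedOrders);
```

The averages come out unrounded here (167 / 15 and 36 / 11); the slides show them rounded to two decimals.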
  • 80. References: ➔ http://docs.mongodb.org/manual ➔ http://blog.mongodb.org/post/87200945828/ ➔ http://thejackalofjavascript.com/mapreduce-in-mongodb/