SlideShare a Scribd company logo
@
#MDBlocal
How MongoDB 4.2 Pipeline
Powers Queries, Updates and Views
Asya Kamsky, Principal MongoDB Knowitall
asya999
AGGREGATION POWER++
SAN FRANCISCO
PREVIOUSLY ...
... 2017
#MDBW17
Analytics with MongoDB Aggregation Framework
@asya999 by Asya Kamsky,
Lead MongoDB Maven
PIPELINE POWER
STORE
RETRIEVE
#MDBLocal
ps ax | grep mongod | head 1
*nix command line pipe
PIPELINE
#MDBLocal
$match $group | $sort|
Input stream {} {} {} {} Result {} {} ...
PIPELINE
MongoDB document pipeline
DATA PIPELINE
STAGES
Stage 1 Stage 2 Stage 3 Stage 4
{} {} {} {}
{} {} {} {}
DATA PIPELINE
{} {} {} {}
{"$stage":{ ... }}
START
Collection
View
Special stage
STAGES
{title: "The Great Gatsby",
language: "English",
subjects: "Long Island"}
{title: "The Great Gatsby",
language: "English",
subjects: "New York"}
{title: "The Great Gatsby",
language: "English",
subjects: "1920s"}
{title: "The Great Gatsby",
language: "English",
subjects: [
"Long Island",
"New York",
"1920s"] },
{"$match":{"language":"English"}}
$match
{ _id:"Long Island",
count: 1 },
$group
{ _id: "New York",
count: 2 },
$unwind
{ _id: "1920s",
count: 1 },
$sort $skip$limit $project
{"$unwind":"$subjects"}
{"$group":{"_id":"$subjects", "count":{"$sum:1}}
{ _id: "Harlem",
count: 1 },
{ _id: "Long Island",
count: 1 },
{ _id: "New York",
count: 2 },
{ _id: "1920s",
count: 1 },
{title: "Open City",
language: "English",
subjects: [
"New York"
"Harlem" ] }
{ title: "The Great Gatsby",
language: "English",
subjects: [
"Long Island",
"New York",
"1920s"] },
{ title: "War and Peace",
language: "Russian",
subjects: [
"Russia",
"War of 1812",
"Napoleon"] },
{ title: "Open City",
language: "English",
subjects: [
"New York",
"Harlem" ] },
{title: "Open City",
language: "English",
subjects: "New York"}
{title: "Open City",
language: "English",
subjects: "Harlem"}
{ _id: "Harlem",
count: 1 },
{"$sort:{"count":-1} {"$limit":3}
{"$project":...}
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
$cursor
{title: "The Great Gatsby",
language: "English".
subjects: "Long Island"}
{title: "The Great Gatsby",
language: "English",
subjects: "New York"}
{title: "The Great Gatsby",
language: "English",
subjects: "1920s"}
{title: "The Great Gatsby",
language: "English",
subjects: [
"Long Island",
"New York",
"1920s"] },
{"$match":{"language":"English"}}
$match
{ _id:"Long Island",
count: 1 },
$group
{ _id: "New York",
count: 2 },
$unwind
{ _id: "1920s",
count: 1 },
$sort $skip$limit $project
{"$unwind":"$subjects"}
{"$group":{"_id":"$subjects", "count":{"$sum:1}}
{ _id: "Harlem",
count: 1 },
{ _id: "Long Island",
count: 1 },
{ _id: "New York",
count: 2 },
{ _id: "1920s",
count: 1 },
{title: "Open City",
language: "English",
subjects: [
"New York"
"Harlem" ] }
{ title: "The Great Gatsby",
language: "English",
subjects: [
"Long Island",
"New York",
"1920s"] },
{ title: "War and Peace",
language: "Russian",
subjects: [
"Russia",
"War of 1812",
"Napoleon"] },
{ title: "Open City",
language: "English",
subjects: [
"New York",
"Harlem" ] },
{title: "Open City",
language: "English",
subjects: "New York"}
{title: "Open City",
language: "English",
subjects: "Harlem"}
{ _id: "Harlem",
count: 1 },
{"$sort:{"count":-1} {"$limit":3}
{"$project":...}
$group $sort
1
#MDBLocal
INPUT STAGE RESULTSSTAGE
Each document is streamed through in RAM
STREAMING RESOURCE USE
#MDBLocal
INPUT STAGE RESULTSSTAGE
BLOCKING RESOURCE USE
Everything has to be kept in RAM (or spill)
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
$sort$match $group
start=ISODate("...")
end=ISODate("...")
{
user: "303900",
ipaddr: "71.56.112.56",
ts:ISODate("2017-05-08T...")
}
{$match:{ts:{$gte:start,$lt:end}}},
{$sort:{ts:1}},
{$group:{_id:"$user",ips:{$push:{ip:"$ipaddr", ts:"$ts"}},
diffIps:{$addToSet:"$ipaddr"}}},
{$match:{"diffIps.1":{$exists:true}}},
{$addFields:{diffs: {$filter:{
input:{$map:{
input: {$range:[0,{$subtract:[{$size:"$ips"},1]}]}, as:"i",
in:{$let:{vars:{ip1:{$arrayElemAt:["$ips","$$i"]},
ip2:{$arrayElemAt:["$ips",{$add:["$$i",1]}]}},
in:{
diff:{$cond:{
if:{$ne:["$$ip1.ip","$$ip2.ip"]},
then:{$divide:[{$subtract:["$$ip2.ts","$$ip1.ts"]},60000]},
else: 9999 }},
ip1:"$$ip1.ip", t1:"$$ip1.ts",
ip2:"$$ip2.ip", t2:"$$ip2.ts"
}}}}},
cond:{$lt:["$$this.diff",10]}
}}}},
{$match:{"diffs":{$ne:[]}}},
{$project:{_id:0, user:"$_id", suspectLogins:"$diffs"}}
{ "user" : "35237073",
"suspectLogins" : [
{"diff": 4.8333333333,
"ip1": "106.220.151.16",
"t1":"2017-05-08T06:58",
"ip2": "223.182.113.15"
"t2":"2017-05-08T07:03"
},
{"diff": 8.3,
"ip1": "223.182.113.15",
"t1":"2017-05-08T07:03",
"ip2": "49.206.217.26",
"t2":"2017-05-08T07:11"
}
]
} $match $addFields $match $project
https://github.com/asya999/mdbw17
5 minute review
https://github.com/asya999/mdbw17
#MDBW17
Analytics with MongoDB Aggregation Framework
@asya999 by Asya Kamsky,
Lead MongoDB Maven
PIPELINE POWER
https://github.com/asya999/mdbw17
PREVIOUSLY ...
... 2017
PREVIOUSLY ...
... 2017 ...
PREVIOUSLY ...
... 2017 ... 2018
#MDBLocal
THE FUTURE OF AGGREGATION
Better performance & optimizations
More stages & expressions
More options for output
Compass helper for aggregate
Unify different languages
#MDBLocal
THE FUTURE OF AGGREGATION
Better performance & optimizations
More stages & expressions
More options for output
Compass helper for aggregate
Unify different languages
#MDBLocal
THE FUTURE OF AGGREGATION
Better performance & optimizations
More stages & expressions
More options for output
Compass helper for aggregate
Unify different languages
#MDBLocal
THE FUTURE OF AGGREGATION
More options for output
Unify different languages
#MDBLocal
THE PRESENT OF AGGREGATION
More options for output
Unify different languages
#MDBLocal
Unify Different Languages
#MDBLocal
Unify Different Languages
{children: [
{name:"Max", dob:"1994-12-01", dep:true},
{name:"Sam", dob:"1997-09-28", dep:true},
{name:"Kim", dob:"2000-02-29", dep:true}
]}
AGGREGATION
#MDBLocal
Unify Different Languages
{children: [
{name:"Max", dob:"1994-12-01", dep:true},
{name:"Sam", dob:"1997-09-28", dep:true},
{name:"Kim", dob:"2000-02-29", dep:true}
]}
AGGREGATION
db.c.aggregate([
{$addFields:{
numChildren:{$size:"$children"},
numDependents:{$size:{
$filter:{
input:"$children.dep",
cond: "$$this"
}
}}
}},
...
])
#MDBLocal
Unify Different Languages
{children: [
{name:"Max", dob:"1994-12-01", dep:true},
{name:"Sam", dob:"1997-09-28", dep:true},
{name:"Kim", dob:"2000-02-29", dep:true}
]}
AGGREGATION
FIND
db.c.aggregate([
{$addFields:{
numChildren:{$size:"$children"},
numDependents:{$size:{
$filter:{
input:"$children.dep",
cond: "$$this"
}
}}
}},
...
])
#MDBLocal
Unify Different Languages
{children: [
{name:"Max", dob:"1994-12-01", dep:true},
{name:"Sam", dob:"1997-09-28", dep:true},
{name:"Kim", dob:"2000-02-29", dep:true}
]}
AGGREGATION
FIND
db.c.find (
{$expr:{
$lt:[
{$size:{$filter:{
input: "$children.dep",
cond: "$$this"
}}},
2
]
}}
)
#MDBLocal
Unify Different Languages
{children: [
{name:"Max", dob:"1994-12-01", dep:true},
{name:"Sam", dob:"1997-09-28", dep:true},
{name:"Kim", dob:"2000-02-29", dep:true}
]}
AGGREGATION
FIND
UPDATE
db.c.find (
{$expr:{
$lt:[
{$size:{$filter:{
input: "$children.dep",
cond: "$$this"
}}},
2
]
}}
)
#MDBLocal
Unify Different Languages
{children: [
{name:"Max", dob:"1994-12-01", dep:true},
{name:"Sam", dob:"1997-09-28", dep:true},
{name:"Kim", dob:"2000-02-29", dep:true}
]}
AGGREGATION
FIND
UPDATE
db.c.update(
{$expr:{
$anyElementTrue:{$map:{
input:"$children",
in: {$and:[
{$lt:["$$this.dob","1997-01-22"]},
"$$this.dep"
]}
}}
}},
{$set:{ audit:true }}
)
#MDBLocal
Update
db.coll.update(
<query>,
<update>,
<options>
)
#MDBLocal
Update
db.coll.update(
<query>,
<update>,
<options>
)
#MDBLocal
Update
db.coll.update(
<query>,
<update>,
<options>
)
<update>
#MDBLocal
Update
{
f1: <value>,
f2: <value>,
...
}
{
$set: { },
$inc: { },
$...
}
<update>
#MDBLocal
Update in 4.2
{ } OR [ ]
<update>
#MDBLocal
Update in 4.2
{ <same> } [ ]
<update>
#MDBLocal
Update in 4.2
{ <same> } [ <aggregation-pipeline> ]
<update>
Updates Using Aggregation
Pipeline
#MDBLocal
{ $addFields: { } }
{ $project: { } }
{ $replaceRoot: { } }
{ $set: { } }
{ $unset: [ ] }
{ $replaceWith: { } }
#MDBLocal
db.coll.update({_id:1},
{$inc:{a:1}},
{upsert:true})
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
{ _id: 1, a: 1 }
{ _id: 1, a: 11 }
{ _id: 1, a: 101 }
{ _id: 1, a: 1 }
"errmsg" : "Cannot apply to a value of
non-numeric type."
#MDBLocal
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
{ _id: 1, a: 1 }
{ _id: 1, a: 11 }
{ _id: 1, a: 101 }
{ _id: 1, a: 1 }
{ _id: 1, a: 1 }
db.coll.update({_id:1},
[ {$set:{a:{$sum:["$a",1]}}} ],
{upsert:true})
#MDBLocal
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
{ _id: 1, a: 1 }
{ _id: 1, a: 11 }
{ _id: 1, a: 101 }
{ _id: 1, a: 1 }
"errmsg" : "$add only supports
numeric or date types, not string"
db.coll.update({_id:1},
[ {$set:{a:{$add:["$a",1]}}} ],
{upsert:true})
#MDBLocal
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
{ _id:1, a: 21 }
{ _id: 1, a: 11 }
{ _id: 1, a: 101 }
{ _id:1, a: 21 }
db.coll.update({_id:1},
[ {$set:{a:{$ }} ],
{upsert:true})
#MDBLocal
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
{ _id:1, a: 21 }
{ _id: 1, a: 11 }
{ _id: 1, a: 101 }
{ _id:1, a: 21 }
db.coll.update({_id:1}, [ {$set:{a:{$cond:{
if: ,
then: , else: }}}}], {upsert:true})
#MDBLocal
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
{ _id:1, a: 21 }
{ _id: 1, a: 11 }
{ _id: 1, a: 101 }
{ _id:1, a: 21 }
db.coll.update({_id:1}, [ {$set:{a:{$cond:{
if: {$eq:[{$type:"$a"},"missing"]},
then: , else: }}}}], {upsert:true})
#MDBLocal
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
{ _id:1, a: 21 }
{ _id: 1, a: 11 }
{ _id: 1, a: 101 }
{ _id:1, a: 21 }
db.coll.update({_id:1}, [ {$set:{a:{$cond:{
if: {$eq:[{$type:"$a"},"missing"]},
then: 21, else: {$sum:["$a",1]} }}}}], {upsert:true})
#MDBLocal
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
{ _id:1, a: 21 }
{ _id: 1, a: 11 }
{ _id: 1, a: 100 }
{ _id:1, a: 21 }
db.coll.update({_id:1}, [ {$set:{a:{$cond:{
if: {$eq:[{$type:"$a"},"missing"]},
then: 21, else: {$sum:["$a",1]} }}}}], {upsert:true})
#MDBLocal
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
{ _id:1, a: 21 }
{ _id: 1, a: 11 }
{ _id: 1, a: 100 }
{ _id:1, a: 21 }
db.coll.update({_id:1}, [ {$set:{a:{$min:[100, {$cond:{
if: {$eq:[{$type:"$a"},"missing"]},
then: 21, else: {$sum:["$a",1]} }}]} }}], {upsert:true})
#MDBLocal
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
{ _id:1, a: 21 }
{ _id: 1, a: 11 }
{ _id: 1, a: 100 }
{ _id:1, a: 21 }
{ _id:1, a: 1 }
db.coll.update({_id:1}, [ {$set:{a:{$min:[100, {$cond:{
if: {$eq:[{$type:"$a"},"missing"]},
then: 21, else: {$sum:["$a",1]} }}]} }}], {upsert:true})
#MDBLocal
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
{ _id:1, a: 21 }
{ _id: 1, a: 11 }
{ _id: 1, a: 100 }
{ _id:1, a: 21 }
{ _id:1, a: 1 }
db.coll.update({_id:1}, [ {$set:{a:{$min:[100, {$cond:{
if: {$eq:[{$type:"$a"},"missing"]},
then: 21, else: {$sum:["$a",1]} }}]}, prev_a:"$a" }}],
{upsert:true})
#MDBLocal
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
{ _id:1, a: 21 }
{ _id: 1, a: 11, prev_a: 10 }
{ _id: 1, a: 100, prev_a: 100 }
{ _id:1, a: 21 }
{ _id:1, a: 1, prev_a: "10" }
db.coll.update({_id:1}, [ {$set:{a:{$min:[100, {$cond:{
if: {$eq:[{$type:"$a"},"missing"]},
then: 21, else: {$sum:["$a",1]} }}]}, prev_a:"$a" }}],
{upsert:true})
#MDBLocal
Set Defaults
#MDBLocal
Set Defaults
{_id: 1, a: 5, b: 12}
{_id: 2, a: 15, c: "abc"}
{_id: 3, b: 99, c: "xyz"}
If a or b are missing, set to 0, if c is missing -> "unset"
#MDBLocal
Set Defaults
{_id: 1, a: 5, b: 12}
{_id: 2, a: 15, c: "abc"}
{_id: 3, b: 99, c: "xyz"}
If a or b are missing, set to 0, if c is missing -> "unset"
db.coll.update({}, [
{$replaceWith:{
}}
], {multi:true})
#MDBLocal
Set Defaults
{_id: 1, a: 5, b: 12}
{_id: 2, a: 15, c: "abc"}
{_id: 3, b: 99, c: "xyz"}
If a or b are missing, set to 0, if c is missing -> "unset"
db.coll.update({}, [
{$replaceWith:{$mergeObjects:[
]}}
], {multi:true})
#MDBLocal
Set Defaults
{_id: 1, a: 5, b: 12}
{_id: 2, a: 15, c: "abc"}
{_id: 3, b: 99, c: "xyz"}
If a or b are missing, set to 0, if c is missing -> "unset"
db.coll.update({}, [
{$replaceWith:{$mergeObjects:[
{ a:0, b:0, c:"unset" },
"$$ROOT"
]}}
], {multi:true})
#MDBLocal
Set Defaults
{_id: 1, a: 5, b: 12}
{_id: 2, a: 15, c: "abc"}
{_id: 3, b: 99, c: "xyz"}
If a or b are missing, set to 0, if c is missing -> "unset"
db.coll.update({}, [
{$replaceWith:{$mergeObjects:[
{ a:0, b:0, c:"unset" },
"$$ROOT"
]}}
], {multi:true})
{_id: 1, a: 5, b: 12, c: "unset"}
{_id: 2, a: 15, b: 0, c: "abc"}
{_id: 3, a: 0, b: 99, c: "xyz"}
#MDBLocal
{ id: 1,
d: ISODate("2019-06-04T00:00:00"),
h: [
{ hour:"11", value: 296 },
{ hour:"12", value: 300 }
]}
id: X, d:Y, hour:Z, value: VAL
db.coll.update({id:X, d:Y},
[ {$set:{h:{$cond:{
if:
then:
else:
}}}}],
{upsert:true})
#MDBLocal
{ id: 1,
d: ISODate("2019-06-04T00:00:00"),
h: [
{ hour:"11", value: 296 },
{ hour:"12", value: 300 }
]}
id: X, d:Y, hour:Z, value: VAL
db.coll.update({id:X, d:Y},
[ {$set:{h:{$cond:{
if: {$in:[Z,"$h.hour"]},
then:{$map:{
input:"$h",
in: {$cond:{ if:{$ne:["$$this.hour",Z]}, then:"$$this",
else: {hour: Z, value: {$sum:[ "$$this.value", VAL]}}
}}}},
else:{$concatArrays:["$h",[{hour:Z,value:VAL}]]}
}}}}],
{upsert:true})
if:
then:
else:
#MDBLocal
if:
then:
else:
{ id: 1,
d: ISODate("2019-06-04T00:00:00"),
h: [
{ hour:"11", value: 296 },
{ hour:"12", value: 300 }
]}
id: X, d:Y, hour:Z, value: VAL
db.coll.update({id:X, d:Y}, [ {$set: {h:{$ifNull:["$h", [] ]}} },
{$set:{h:{$cond:{
if: {$in:[Z,"$h.hour"]},
then:{$map:{
input:"$h",
in: {$cond:{ if:{$ne:["$$this.hour",Z]}, then:"$$this",
else: {hour: Z, value: {$sum:[ "$$this.value", VAL]}}
}}}},
else:{$concatArrays:["$h",[{hour:Z,value:VAL}]]}
}}}}],
{upsert:true})
#MDBLocal
Recap:
Updates can be specified with aggregation pipeline
All fields from existing document can be accessed
Slightly slower, but a lot more powerful
#MDBLocal
THE FUTURE OF AGGREGATION
Better performance & optimizations
More stages & expressions
More options for output
Compass helper for aggregate
Unify different languages
#MDBLocal
THE FUTURE OF AGGREGATION
Better performance & optimizations
More stages & expressions
More options for output
Compass helper for aggregate
Unify different languages
#MDBLocal
THE FUTURE OF AGGREGATION
More options for output
#MDBLocal
More Options for Output
#MDBLocal
Prior to MongoDB 4.2
$out
coll
new_coll
$out
#MDBLocal
Prior to MongoDB 4.2
$out
coll
new_coll
$out
db.coll.aggregate( [ {pipeline}, ...
{$out: "new_coll"} ]);
#MDBLocal
Prior to MongoDB 4.2
$out
coll
new_coll
$out
db.coll.aggregate( [ {pipeline}, ...
{$out: "new_coll"} ]);
new_coll
○ must be unsharded
○ overwrites existing
New $merge stage
in aggregation pipeline
#MDBLocal
MongoDB 4.2
$merge
coll
coll2
$merge
#MDBLocal
MongoDB 4.2
$merge
db.coll.aggregate( [
{pipeline}, ...,
{$merge: { ... }
]);
coll
coll2
$merge
#MDBLocal
MongoDB 4.2
$merge
db.coll.aggregate( [
{pipeline}, ...,
{$merge: { ... }
]);
coll2
can exist
same or different 'db'
can be sharded
coll
coll2
$merge
#MDBLocal
coll
coll2
$merge
{ } { } { } { }
{ } { } { } { }
MongoDB 4.2
#MDBLocal
{
$merge: {
into: <target>
}
}
$merge syntax
#MDBLocal
{$merge: "collection2"}
$merge syntax
{
$merge: {
into: <target>
}
}
#MDBLocal
{$merge: {into: {db: "db2", coll: "collection2"}}
$merge syntax
{
$merge: {
into: <target>
}
}
#MDBLocal
{
$merge: {
into: <target>
}
}
$merge syntax
#MDBLocal
{
$merge: {
into: <target>,
on: <fields>
}
}
on: "_id"
on: [ "_id", "shardkey(s)" ]
must be unique
$merge syntax
#MDBLocal
{
$merge: {
into: <target>,
on: <fields>
}
}
$merge syntax
#MDBLocal
Actions
source target
#MDBLocal
Actions
nothing matched:
source target
#MDBLocal
Actions
nothing matched: usually insert
source target
#MDBLocal
Actions
nothing matched: usually insert
document matched:
source target
#MDBLocal
Actions
nothing matched: usually insert
document matched: overwrite? update? ???
source target
#MDBLocal
Actions
nothing matched: usually insert
document matched: update
source target
#MDBLocal
Actions
nothing matched: usually insert
document matched: update (merge)
source target
#MDBLocal
$merge syntax
{
$merge: {
into: <target>,
whenNotMatched:
whenMatched:
}
}
#MDBLocal
$merge syntax
{
$merge: {
into: <target>,
whenNotMatched:"insert",
whenMatched:
}
}
#MDBLocal
$merge syntax
{
$merge: {
into: <target>,
whenNotMatched:"insert",
whenMatched:"merge"
}
}
#MDBLocal
$merge syntax
{
$merge: {
into: <target>,
whenNotMatched:"insert"|"discard"|"fail",
whenMatched:"merge"
}
}
#MDBLocal
$merge syntax
{
$merge: {
into: <target>,
whenNotMatched:"insert"|"discard"|"fail",
whenMatched:"merge"|"replace"|"keepExisting"|"fail"|[...]
}
}
#MDBLocal
$merge syntax
{
$merge: {
into: <target>,
whenMatched:[...]
}
}
#MDBLocal
$merge syntax
{
$merge: {
into: <target>,
whenMatched:[<custom pipeline>]
}
}
#MDBLocal
$merge example
{
$merge: {
into: <target>,
whenMatched:[
{$addFields:{
}}
]
}
}
#MDBLocal
$merge example
{
$merge: {
into: <target>,
whenMatched:[
{$addFields:{
total:{$sum:["$total","$$new.total"]}
}}
]
}
}
#MDBLocal
$merge example
{
$merge: {
into: <target>,
whenMatched:[
{$set:{
total:{$sum:["$total","$$new.total"]}
}}
]
}
}
#MDBLocal
$merge example
{
$merge: {
into: <target>,
whenMatched:[
{$set:{
total:{$sum:["$total","$$new.total"]}
}}
]
}
}
#MDBLocal
$merge example
{
$merge: {
into: <target>,
whenMatched:[
{$set:{
total:{$sum:["$total","$$new.total"]}
}}
]
}
}
Incoming Target
{
_id: "37",
total: 64,
f1: "x"
}
{
_id: "37",
total: 245,
f1: "yyy"
}
Result:
{
}
#MDBLocal
$merge example
{
$merge: {
into: <target>,
whenMatched:[
{$set:{
total:{$sum:["$total","$$new.total"]}
}}
]
}
}
Incoming Target
{
_id: "37",
total: 64,
f1: "x"
}
{
_id: "37",
total: 245,
f1: "yyy"
}
Result:
{
_id: "37",
total: 309,
f1: "yyy"
}
#MDBLocal
$merge example 2
{
$merge: {
into: <target>,
whenMatched:[
{$replaceWith:{$mergeObjects:[
"$$new",
{total:{$sum:["$$new.total", "$total"]}}
]}}
]
}
}
#MDBLocal
$merge example 2
{
$merge: {
into: <target>,
whenMatched:[
{$replaceWith:{$mergeObjects:[
"$$new",
{total:{$sum:["$$new.total", "$total"]}}
]}}
]
}
}
Incoming Target
{
_id: "37",
total: 64,
f1: "x"
}
{
_id: "37",
total: 245,
f1: "yyy"
}
Result:
{
}
#MDBLocal
$merge example 2
{
$merge: {
into: <target>,
whenMatched:[
{$replaceWith:{$mergeObjects:[
"$$new",
{total:{$sum:["$$new.total", "$total"]}}
]}}
]
}
}
Incoming Target
{
_id: "37",
total: 64,
f1: "x"
}
{
_id: "37",
total: 245,
f1: "yyy"
}
Result:
{
_id: "37",
total: 309,
f1: "x"
}
#MDBLocal
$merge syntax
{
$merge: {
into: <target>,
whenMatched:[...]
}
}
#MDBLocal
$merge syntax
{
$merge: {
into: <target>,
let: { ... },
whenMatched:[ ...]
}
}
#MDBLocal
$merge syntax
{
$merge: {
into: <target>,
let: {new: "$$ROOT"},
whenMatched:[ ...]
}
}
EXAMPLES
APPEND from TEMP collection
#MDBLocal
temp
real
data
real
Using $merge to append loaded and
cleansed records loaded into db
#MDBLocal
aggregate 'temp' and append valid records to 'data'
db.temp.aggregate( [
{ ... } /* pipeline to massage and cleanse data in temp */,
{$merge:{
into: "data",
whenMatched: "fail"
}}
]);
#MDBLocal
aggregate 'temp' and append valid records to 'data'
db.temp.aggregate( [
{ ... } /* pipeline to massage and cleanse data in temp */,
{$merge:{
into: "data",
whenMatched: "fail"
}}
]);
Similar to SQL's INSERT INTO T1 SELECT * from T2
EXAMPLES
Maintain Single View
#MDBLocal
mflix
users
users
mfriendbook
users
sv
Using $merge to populate/update
user fields from other services
#MDBLocal
mflix
users
users
mfriendbook
users
sv
Using $merge to populate/update
user fields from other services
sv.users
{
_id: "user253",
dob: ISODate(...),
f1: "yyy"
}
#MDBLocal
$merge updates fields from mflix.users collection into
sv.users collection. Our "_id" field is unique username
mflix_pipeline = [
{ "$project" : {
"_id" : "$username",
"mflix" : "$$ROOT"
}},
{ "$merge" : {
"into" : {
"db": "sv",
"collection" : "users"
},
"whenNotMatched" : "discard"
}}
]
(in mflix)
sv.users
{
_id: "user253",
dob: ISODate(...),
f1: "yyy"
}
#MDBLocal
$merge updates fields from mflix.users collection into
sv.users collection. Our "_id" field is unique username
mflix_pipeline = [
{ "$project" : {
"_id" : "$username",
"mflix" : "$$ROOT"
}},
{ "$merge" : {
"into" : {
"db": "sv",
"collection" : "users"
},
"whenNotMatched" : "discard"
}}
]
(in mflix) db.users.aggregate(mflix_pipeline)
sv.users
{
_id: "user253",
dob: ISODate(...),
f1: "yyy",
mflix: { ... }
}
#MDBLocal
$merge updates fields from mfriendbook.users collection into
sv.users collection. Our "_id" field is unique username
mfriendbook_pipeline = [
{ "$project" : {
"_id" : "$username",
"mfriendbook" : "$$ROOT"
}},
{ "$merge" : {
"into" : {
"db": "sv",
"collection" : "users"
},
"whenNotMatched" : "discard"
}}
]
(in mfriendbook)
sv.users
{
_id: "user253",
dob: ISODate(...),
f1: "yyy",
mflix: { ... }
}
#MDBLocal
$merge updates fields from mfriendbook.users collection into
sv.users collection. Our "_id" field is unique username
mfriendbook_pipeline = [
{ "$project" : {
"_id" : "$username",
"mfriendbook" : "$$ROOT"
}},
{ "$merge" : {
"into" : {
"db": "sv",
"collection" : "users"
},
"whenNotMatched" : "discard"
}}
]
(in mfriendbook) db.users.aggregate(mfriendbook_pipeline)
sv.users
{
_id: "user253",
dob: ISODate(...),
f1: "yyy",
mflix: { ... },
mfriendbook: { ... }
}
EXAMPLES
Populate ROLLUPS into summary table
registrations
real
regsummary
real
Using $merge to incrementally
update periodic rollups in summary
#MDBLocal
$merge to create/update periodic
rollups in summary collection (for all days)
db.regsummary.createIndex({event:1, date:1}, {unique: true});
#MDBLocal
$merge to create/update periodic
rollups in summary collection (for all days)
db.regsummary.createIndex({event:1, date:1}, {unique: true});
db.registrations.aggregate([
{$match: {event_id: "MDBW19"}},
{$group:{
_id:{$dateToString:{date:"$date",format:"%Y-%m-%d"}},
count: {$sum:1}
}},
{$project: {_id:0,event:"MDBW19",date:"$_id",total:"$count"}},
{$merge: {
into: "regsummary",
on: ["event", "date"]
}}
])
#MDBLocal
$merge to create/update periodic
rollups in summary collection (for all days)
db.regsummary.createIndex({event:1, date:1}, {unique: true});
db.registrations.aggregate([
{$match: {event_id: "MDBW19"}},
{$group:{
_id:{$dateToString:{date:"$date",format:"%Y-%m-%d"}},
count: {$sum:1}
}},
{$project: {_id:0,event:"MDBW19",date:"$_id",total:"$count"}},
{$merge: {
into: "regsummary",
on: ["event", "date"]
}}
])
{ "event" : "MDBW19", "date" : "2019-05-19", "total" : 33 }
{ "event" : "MDBW19", "date" : "2019-05-20", "total" : 15 }
{ "event" : "MDBW19", "date" : "2019-05-21", "total" : 24 }
#MDBLocal
$merge to incrementally update periodic rollups in
summary collection (for single day)
#MDBLocal
$merge to incrementally update periodic rollups in
summary collection (for single day)
db.registrations.aggregate([
{$match: {
event_id: "MDBW19",
date:{$gte:ISODate("2019-05-22"),$lt:ISODate("2019-05-23")}
}},
{$count: "total"},
{$addFields: {event:"MDBW19", "date":"2019-05-22"}},
{$merge: {
into: "regsummary",
on: ["event", "date"]
}}
])
#MDBLocal
$merge to incrementally update periodic rollups in
summary collection (for single day)
db.registrations.aggregate([
{$match: {
event_id: "MDBW19",
date:{$gte:ISODate("2019-05-22"),$lt:ISODate("2019-05-23")}
}},
{$count: "total"},
{$addFields: {event:"MDBW19", "date":"2019-05-22"}},
{$merge: {
into: "regsummary",
on: ["event", "date"]
}}
])
{ "event" : "MDBW19", "date" : "2019-05-19", "total" : 33 }
{ "event" : "MDBW19", "date" : "2019-05-20", "total" : 15 }
{ "event" : "MDBW19", "date" : "2019-05-21", "total" : 24 }
{ "event" : "MDBW19", "date" : "2019-05-22", "total" : 34 }
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++

More Related Content

What's hot (20)

PPTX
Aggregation Framework
MongoDB
 
PPTX
How to leverage what's new in MongoDB 3.6
Maxime Beugnet
 
PDF
MongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDB
MongoDB
 
PPTX
Aggregation in MongoDB
Kishor Parkhe
 
PDF
MongoDB Aggregation Framework
Caserta
 
KEY
MongoDB Aggregation Framework
Tyler Brock
 
PDF
MongoDB Europe 2016 - Advanced MongoDB Aggregation Pipelines
MongoDB
 
PDF
MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way
MongoDB
 
PPTX
Webinar: Exploring the Aggregation Framework
MongoDB
 
PPTX
The Aggregation Framework
MongoDB
 
PPTX
Agg framework selectgroup feb2015 v2
MongoDB
 
PPTX
MongoDB World 2016 : Advanced Aggregation
Joe Drumgoole
 
PPTX
Powerful Analysis with the Aggregation Pipeline
MongoDB
 
PDF
Aggregation Framework MongoDB Days Munich
Norberto Leite
 
PPTX
The Aggregation Framework
MongoDB
 
PDF
Embedding a language into string interpolator
Michael Limansky
 
PDF
Mongodb Aggregation Pipeline
zahid-mian
 
PPTX
MongoDB Aggregation
Amit Ghosh
 
PDF
MongoDB .local Toronto 2019: Tips and Tricks for Effective Indexing
MongoDB
 
ODP
Aggregation Framework in MongoDB Overview Part-1
Anuj Jain
 
Aggregation Framework
MongoDB
 
How to leverage what's new in MongoDB 3.6
Maxime Beugnet
 
MongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDB
MongoDB
 
Aggregation in MongoDB
Kishor Parkhe
 
MongoDB Aggregation Framework
Caserta
 
MongoDB Aggregation Framework
Tyler Brock
 
MongoDB Europe 2016 - Advanced MongoDB Aggregation Pipelines
MongoDB
 
MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way
MongoDB
 
Webinar: Exploring the Aggregation Framework
MongoDB
 
The Aggregation Framework
MongoDB
 
Agg framework selectgroup feb2015 v2
MongoDB
 
MongoDB World 2016 : Advanced Aggregation
Joe Drumgoole
 
Powerful Analysis with the Aggregation Pipeline
MongoDB
 
Aggregation Framework MongoDB Days Munich
Norberto Leite
 
The Aggregation Framework
MongoDB
 
Embedding a language into string interpolator
Michael Limansky
 
Mongodb Aggregation Pipeline
zahid-mian
 
MongoDB Aggregation
Amit Ghosh
 
MongoDB .local Toronto 2019: Tips and Tricks for Effective Indexing
MongoDB
 
Aggregation Framework in MongoDB Overview Part-1
Anuj Jain
 

Similar to MongoDB .local San Francisco 2020: Aggregation Pipeline Power++ (20)

PDF
MongoDB .local Munich 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pip...
MongoDB
 
PPTX
[MongoDB.local Bengaluru 2018] Tutorial: Pipeline Power - Doing More with Mon...
MongoDB
 
PPTX
MongoDB 3.2 - Analytics
Massimo Brignoli
 
PPTX
Introduction to MongoDB at IGDTUW
Ankur Raina
 
KEY
NOSQL101, Or: How I Learned To Stop Worrying And Love The Mongo!
Daniel Cousineau
 
PPTX
Querying mongo db
Bogdan Sabău
 
PPTX
Introduction to MongoDB
Anton Fil
 
PDF
Mongo db
Toki Kanno
 
ODP
MongoDB San Francisco DrupalCon 2010
Karoly Negyesi
 
PDF
Aggregation Pipeline Power++: MongoDB 4.2 파이프 라인 쿼리, 업데이트 및 구체화된 뷰 소개 [MongoDB]
MongoDB
 
PPTX
Query for json databases
Binh Le
 
PPTX
Mongo db 101 dc group
John Ragan
 
PPTX
MongoDb In Action
fuchaoqun
 
PDF
Full metal mongo
Israel Gutiérrez
 
PDF
MongoDB
Hemant Kumar Tiwary
 
PDF
Querying Mongo Without Programming Using Funql
MongoDB
 
PPTX
mongodb-aggregation-may-2012
Chris Westin
 
PDF
Latinoware
kchodorow
 
PDF
Mongo indexes
Mehmet Çetin
 
PDF
Mongo DB schema design patterns
joergreichert
 
MongoDB .local Munich 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pip...
MongoDB
 
[MongoDB.local Bengaluru 2018] Tutorial: Pipeline Power - Doing More with Mon...
MongoDB
 
MongoDB 3.2 - Analytics
Massimo Brignoli
 
Introduction to MongoDB at IGDTUW
Ankur Raina
 
NOSQL101, Or: How I Learned To Stop Worrying And Love The Mongo!
Daniel Cousineau
 
Querying mongo db
Bogdan Sabău
 
Introduction to MongoDB
Anton Fil
 
Mongo db
Toki Kanno
 
MongoDB San Francisco DrupalCon 2010
Karoly Negyesi
 
Aggregation Pipeline Power++: MongoDB 4.2 파이프 라인 쿼리, 업데이트 및 구체화된 뷰 소개 [MongoDB]
MongoDB
 
Query for json databases
Binh Le
 
Mongo db 101 dc group
John Ragan
 
MongoDb In Action
fuchaoqun
 
Full metal mongo
Israel Gutiérrez
 
Querying Mongo Without Programming Using Funql
MongoDB
 
mongodb-aggregation-may-2012
Chris Westin
 
Latinoware
kchodorow
 
Mongo indexes
Mehmet Çetin
 
Mongo DB schema design patterns
joergreichert
 
Ad

More from MongoDB (20)

PDF
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB
 
PDF
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
PDF
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB
 
PDF
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB
 
PDF
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB
 
PDF
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB
 
PDF
MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
PDF
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB
 
PDF
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB
 
PDF
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB
 
PDF
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB
 
PDF
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB
 
PDF
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB
 
PDF
MongoDB .local Paris 2020: Les bonnes pratiques pour sécuriser MongoDB
MongoDB
 
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB
 
MongoDB .local Paris 2020: Les bonnes pratiques pour sécuriser MongoDB
MongoDB
 
Ad

Recently uploaded (20)

PDF
Empower Inclusion Through Accessible Java Applications
Ana-Maria Mihalceanu
 
PDF
The Rise of AI and IoT in Mobile App Tech.pdf
IMG Global Infotech
 
PDF
Smart Trailers 2025 Update with History and Overview
Paul Menig
 
PDF
Agentic AI lifecycle for Enterprise Hyper-Automation
Debmalya Biswas
 
PDF
CIFDAQ Token Spotlight for 9th July 2025
CIFDAQ
 
PDF
What Makes Contify’s News API Stand Out: Key Features at a Glance
Contify
 
PDF
Transcript: New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
PPTX
AI Penetration Testing Essentials: A Cybersecurity Guide for 2025
defencerabbit Team
 
PPTX
Building Search Using OpenSearch: Limitations and Workarounds
Sease
 
PDF
NewMind AI - Journal 100 Insights After The 100th Issue
NewMind AI
 
PDF
Building Real-Time Digital Twins with IBM Maximo & ArcGIS Indoors
Safe Software
 
PDF
Chris Elwell Woburn, MA - Passionate About IT Innovation
Chris Elwell Woburn, MA
 
PDF
New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
PDF
Bitcoin for Millennials podcast with Bram, Power Laws of Bitcoin
Stephen Perrenod
 
PDF
CIFDAQ Weekly Market Wrap for 11th July 2025
CIFDAQ
 
PDF
LLMs.txt: Easily Control How AI Crawls Your Site
Keploy
 
PPTX
Q2 FY26 Tableau User Group Leader Quarterly Call
lward7
 
PDF
How Startups Are Growing Faster with App Developers in Australia.pdf
India App Developer
 
PDF
DevBcn - Building 10x Organizations Using Modern Productivity Metrics
Justin Reock
 
PDF
"AI Transformation: Directions and Challenges", Pavlo Shaternik
Fwdays
 
Empower Inclusion Through Accessible Java Applications
Ana-Maria Mihalceanu
 
The Rise of AI and IoT in Mobile App Tech.pdf
IMG Global Infotech
 
Smart Trailers 2025 Update with History and Overview
Paul Menig
 
Agentic AI lifecycle for Enterprise Hyper-Automation
Debmalya Biswas
 
CIFDAQ Token Spotlight for 9th July 2025
CIFDAQ
 
What Makes Contify’s News API Stand Out: Key Features at a Glance
Contify
 
Transcript: New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
AI Penetration Testing Essentials: A Cybersecurity Guide for 2025
defencerabbit Team
 
Building Search Using OpenSearch: Limitations and Workarounds
Sease
 
NewMind AI - Journal 100 Insights After The 100th Issue
NewMind AI
 
Building Real-Time Digital Twins with IBM Maximo & ArcGIS Indoors
Safe Software
 
Chris Elwell Woburn, MA - Passionate About IT Innovation
Chris Elwell Woburn, MA
 
New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
Bitcoin for Millennials podcast with Bram, Power Laws of Bitcoin
Stephen Perrenod
 
CIFDAQ Weekly Market Wrap for 11th July 2025
CIFDAQ
 
LLMs.txt: Easily Control How AI Crawls Your Site
Keploy
 
Q2 FY26 Tableau User Group Leader Quarterly Call
lward7
 
How Startups Are Growing Faster with App Developers in Australia.pdf
India App Developer
 
DevBcn - Building 10x Organizations Using Modern Productivity Metrics
Justin Reock
 
"AI Transformation: Directions and Challenges", Pavlo Shaternik
Fwdays
 

MongoDB .local San Francisco 2020: Aggregation Pipeline Power++

  • 1. @ #MDBlocal How MongoDB 4.2 Pipeline Powers Queries, Updates and Views Asya Kamsky, Principal MongoDB Knowitall asya999 AGGREGATION POWER++ SAN FRANCISCO
  • 3. #MDBW17 Analytics with MongoDB Aggregation Framework @asya999 by Asya Kamsky, Lead MongoDB Maven PIPELINE POWER
  • 5. #MDBLocal ps ax | grep mongod | head 1 *nix command line pipe PIPELINE
  • 6. #MDBLocal $match $group | $sort| Input stream {} {} {} {} Result {} {} ... PIPELINE MongoDB document pipeline
  • 8. Stage 1 Stage 2 Stage 3 Stage 4 {} {} {} {} {} {} {} {} DATA PIPELINE {} {} {} {} {"$stage":{ ... }} START Collection View Special stage STAGES
  • 9. {title: "The Great Gatsby", language: "English", subjects: "Long Island"} {title: "The Great Gatsby", language: "English", subjects: "New York"} {title: "The Great Gatsby", language: "English", subjects: "1920s"} {title: "The Great Gatsby", language: "English", subjects: [ "Long Island", "New York", "1920s"] }, {"$match":{"language":"English"}} $match { _id:"Long Island", count: 1 }, $group { _id: "New York", count: 2 }, $unwind { _id: "1920s", count: 1 }, $sort $skip$limit $project {"$unwind":"$subjects"} {"$group":{"_id":"$subjects", "count":{"$sum:1}} { _id: "Harlem", count: 1 }, { _id: "Long Island", count: 1 }, { _id: "New York", count: 2 }, { _id: "1920s", count: 1 }, {title: "Open City", language: "English", subjects: [ "New York" "Harlem" ] } { title: "The Great Gatsby", language: "English", subjects: [ "Long Island", "New York", "1920s"] }, { title: "War and Peace", language: "Russian", subjects: [ "Russia", "War of 1812", "Napoleon"] }, { title: "Open City", language: "English", subjects: [ "New York", "Harlem" ] }, {title: "Open City", language: "English", subjects: "New York"} {title: "Open City", language: "English", subjects: "Harlem"} { _id: "Harlem", count: 1 }, {"$sort:{"count":-1} {"$limit":3} {"$project":...}
  • 12. {title: "The Great Gatsby", language: "English". subjects: "Long Island"} {title: "The Great Gatsby", language: "English", subjects: "New York"} {title: "The Great Gatsby", language: "English", subjects: "1920s"} {title: "The Great Gatsby", language: "English", subjects: [ "Long Island", "New York", "1920s"] }, {"$match":{"language":"English"}} $match { _id:"Long Island", count: 1 }, $group { _id: "New York", count: 2 }, $unwind { _id: "1920s", count: 1 }, $sort $skip$limit $project {"$unwind":"$subjects"} {"$group":{"_id":"$subjects", "count":{"$sum:1}} { _id: "Harlem", count: 1 }, { _id: "Long Island", count: 1 }, { _id: "New York", count: 2 }, { _id: "1920s", count: 1 }, {title: "Open City", language: "English", subjects: [ "New York" "Harlem" ] } { title: "The Great Gatsby", language: "English", subjects: [ "Long Island", "New York", "1920s"] }, { title: "War and Peace", language: "Russian", subjects: [ "Russia", "War of 1812", "Napoleon"] }, { title: "Open City", language: "English", subjects: [ "New York", "Harlem" ] }, {title: "Open City", language: "English", subjects: "New York"} {title: "Open City", language: "English", subjects: "Harlem"} { _id: "Harlem", count: 1 }, {"$sort:{"count":-1} {"$limit":3} {"$project":...} $group $sort 1
  • 13. #MDBLocal INPUT STAGE RESULTSSTAGE Each document is streamed through in RAM STREAMING RESOURCE USE
  • 14. #MDBLocal INPUT STAGE RESULTSSTAGE BLOCKING RESOURCE USE Everything has to be kept in RAM (or spill)
  • 20. $sort$match $group start=ISODate("...") end=ISODate("...") { user: "303900", ipaddr: "71.56.112.56", ts:ISODate("2017-05-08T...") } {$match:{ts:{$gte:start,$lt:end}}}, {$sort:{ts:1}}, {$group:{_id:"$user",ips:{$push:{ip:"$ipaddr", ts:"$ts"}}, diffIps:{$addToSet:"$ipaddr"}}}, {$match:{"diffIps.1":{$exists:true}}}, {$addFields:{diffs: {$filter:{ input:{$map:{ input: {$range:[0,{$subtract:[{$size:"$ips"},1]}]}, as:"i", in:{$let:{vars:{ip1:{$arrayElemAt:["$ips","$$i"]}, ip2:{$arrayElemAt:["$ips",{$add:["$$i",1]}]}}, in:{ diff:{$cond:{ if:{$ne:["$$ip1.ip","$$ip2.ip"]}, then:{$divide:[{$subtract:["$$ip2.ts","$$ip1.ts"]},60000]}, else: 9999 }}, ip1:"$$ip1.ip", t1:"$$ip1.ts", ip2:"$$ip2.ip", t2:"$$ip2.ts" }}}}}, cond:{$lt:["$$this.diff",10]} }}}}, {$match:{"diffs":{$ne:[]}}}, {$project:{_id:0, user:"$_id", suspectLogins:"$diffs"}} { "user" : "35237073", "suspectLogins" : [ {"diff": 4.8333333333, "ip1": "106.220.151.16", "t1":"2017-05-08T06:58", "ip2": "223.182.113.15" "t2":"2017-05-08T07:03" }, {"diff": 8.3, "ip1": "223.182.113.15", "t1":"2017-05-08T07:03", "ip2": "49.206.217.26", "t2":"2017-05-08T07:11" } ] } $match $addFields $match $project
  • 23. #MDBW17 Analytics with MongoDB Aggregation Framework @asya999 by Asya Kamsky, Lead MongoDB Maven PIPELINE POWER https://github.com/asya999/mdbw17
  • 27. #MDBLocal THE FUTURE OF AGGREGATION Better performance & optimizations More stages & expressions More options for output Compass helper for aggregate Unify different languages
  • 28. #MDBLocal THE FUTURE OF AGGREGATION Better performance & optimizations More stages & expressions More options for output Compass helper for aggregate Unify different languages
  • 29. #MDBLocal THE FUTURE OF AGGREGATION Better performance & optimizations More stages & expressions More options for output Compass helper for aggregate Unify different languages
  • 30. #MDBLocal THE FUTURE OF AGGREGATION More options for output Unify different languages
  • 31. #MDBLocal THE PRESENT OF AGGREGATION More options for output Unify different languages
  • 33. #MDBLocal Unify Different Languages {children: [ {name:"Max", dob:"1994-12-01", dep:true}, {name:"Sam", dob:"1997-09-28", dep:true}, {name:"Kim", dob:"2000-02-29", dep:true} ]} AGGREGATION
  • 34. #MDBLocal Unify Different Languages {children: [ {name:"Max", dob:"1994-12-01", dep:true}, {name:"Sam", dob:"1997-09-28", dep:true}, {name:"Kim", dob:"2000-02-29", dep:true} ]} AGGREGATION db.c.aggregate([ {$addFields:{ numChildren:{$size:"$children"}, numDependents:{$size:{ $filter:{ input:"$children.dep", cond: "$$this" } }} }}, ... ])
  • 35. #MDBLocal Unify Different Languages {children: [ {name:"Max", dob:"1994-12-01", dep:true}, {name:"Sam", dob:"1997-09-28", dep:true}, {name:"Kim", dob:"2000-02-29", dep:true} ]} AGGREGATION FIND db.c.aggregate([ {$addFields:{ numChildren:{$size:"$children"}, numDependents:{$size:{ $filter:{ input:"$children.dep", cond: "$$this" } }} }}, ... ])
  • 36. #MDBLocal Unify Different Languages {children: [ {name:"Max", dob:"1994-12-01", dep:true}, {name:"Sam", dob:"1997-09-28", dep:true}, {name:"Kim", dob:"2000-02-29", dep:true} ]} AGGREGATION FIND db.c.find ( {$expr:{ $lt:[ {$size:{$filter:{ input: "$children.dep", cond: "$$this" }}}, 2 ] }} )
  • 37. #MDBLocal Unify Different Languages {children: [ {name:"Max", dob:"1994-12-01", dep:true}, {name:"Sam", dob:"1997-09-28", dep:true}, {name:"Kim", dob:"2000-02-29", dep:true} ]} AGGREGATION FIND UPDATE db.c.find ( {$expr:{ $lt:[ {$size:{$filter:{ input: "$children.dep", cond: "$$this" }}}, 2 ] }} )
  • 38. #MDBLocal Unify Different Languages {children: [ {name:"Max", dob:"1994-12-01", dep:true}, {name:"Sam", dob:"1997-09-28", dep:true}, {name:"Kim", dob:"2000-02-29", dep:true} ]} AGGREGATION FIND UPDATE db.c.update( {$expr:{ $anyElementTrue:{$map:{ input:"$children", in: {$and:[ {$lt:["$$this.dob","1997-01-22"]}, "$$this.dep" ]} }} }}, {$set:{ audit:true }} )
  • 43. #MDBLocal Update in 4.2 { } OR [ ] <update>
  • 44. #MDBLocal Update in 4.2 { <same> } [ ] <update>
  • 45. #MDBLocal Update in 4.2 { <same> } [ <aggregation-pipeline> ] <update>
  • 47. #MDBLocal { $addFields: { } } { $project: { } } { $replaceRoot: { } } { $set: { } } { $unset: [ ] } { $replaceWith: { } }
  • 48. #MDBLocal db.coll.update({_id:1}, {$inc:{a:1}}, {upsert:true}) { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } { _id: 1, a: 1 } { _id: 1, a: 11 } { _id: 1, a: 101 } { _id: 1, a: 1 } "errmsg" : "Cannot apply to a value of non-numeric type."
  • 49. #MDBLocal { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } { _id: 1, a: 1 } { _id: 1, a: 11 } { _id: 1, a: 101 } { _id: 1, a: 1 } { _id: 1, a: 1 } db.coll.update({_id:1}, [ {$set:{a:{$sum:["$a",1]}}} ], {upsert:true})
  • 50. #MDBLocal { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } { _id: 1, a: 1 } { _id: 1, a: 11 } { _id: 1, a: 101 } { _id: 1, a: 1 } "errmsg" : "$add only supports numeric or date types, not string" db.coll.update({_id:1}, [ {$set:{a:{$add:["$a",1]}}} ], {upsert:true})
  • 51. #MDBLocal { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } { _id:1, a: 21 } { _id: 1, a: 11 } { _id: 1, a: 101 } { _id:1, a: 21 } db.coll.update({_id:1}, [ {$set:{a:{$ }} ], {upsert:true})
  • 52. #MDBLocal { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } { _id:1, a: 21 } { _id: 1, a: 11 } { _id: 1, a: 101 } { _id:1, a: 21 } db.coll.update({_id:1}, [ {$set:{a:{$cond:{ if: , then: , else: }}}}], {upsert:true})
  • 53. #MDBLocal { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } { _id:1, a: 21 } { _id: 1, a: 11 } { _id: 1, a: 101 } { _id:1, a: 21 } db.coll.update({_id:1}, [ {$set:{a:{$cond:{ if: {$eq:[{$type:"$a"},"missing"]}, then: , else: }}}}], {upsert:true})
  • 54. #MDBLocal { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } { _id:1, a: 21 } { _id: 1, a: 11 } { _id: 1, a: 101 } { _id:1, a: 21 } db.coll.update({_id:1}, [ {$set:{a:{$cond:{ if: {$eq:[{$type:"$a"},"missing"]}, then: 21, else: {$sum:["$a",1]} }}}}], {upsert:true})
  • 55. #MDBLocal { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } { _id:1, a: 21 } { _id: 1, a: 11 } { _id: 1, a: 100 } { _id:1, a: 21 } db.coll.update({_id:1}, [ {$set:{a:{$cond:{ if: {$eq:[{$type:"$a"},"missing"]}, then: 21, else: {$sum:["$a",1]} }}}}], {upsert:true})
  • 56. #MDBLocal { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } { _id:1, a: 21 } { _id: 1, a: 11 } { _id: 1, a: 100 } { _id:1, a: 21 } db.coll.update({_id:1}, [ {$set:{a:{$min:[100, {$cond:{ if: {$eq:[{$type:"$a"},"missing"]}, then: 21, else: {$sum:["$a",1]} }}]} }}], {upsert:true})
  • 57. #MDBLocal { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } { _id:1, a: 21 } { _id: 1, a: 11 } { _id: 1, a: 100 } { _id:1, a: 21 } { _id:1, a: 1 } db.coll.update({_id:1}, [ {$set:{a:{$min:[100, {$cond:{ if: {$eq:[{$type:"$a"},"missing"]}, then: 21, else: {$sum:["$a",1]} }}]} }}], {upsert:true})
  • 58. #MDBLocal { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } { _id:1, a: 21 } { _id: 1, a: 11 } { _id: 1, a: 100 } { _id:1, a: 21 } { _id:1, a: 1 } db.coll.update({_id:1}, [ {$set:{a:{$min:[100, {$cond:{ if: {$eq:[{$type:"$a"},"missing"]}, then: 21, else: {$sum:["$a",1]} }}]}, prev_a:"$a" }}], {upsert:true})
  • 59. #MDBLocal { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } { _id:1, a: 21 } { _id: 1, a: 11, prev_a: 10 } { _id: 1, a: 100, prev_a: 100 } { _id:1, a: 21 } { _id:1, a: 1, prev_a: "10" } db.coll.update({_id:1}, [ {$set:{a:{$min:[100, {$cond:{ if: {$eq:[{$type:"$a"},"missing"]}, then: 21, else: {$sum:["$a",1]} }}]}, prev_a:"$a" }}], {upsert:true})
  • 61. #MDBLocal Set Defaults {_id: 1, a: 5, b: 12} {_id: 2, a: 15, c: "abc"} {_id: 3, b: 99, c: "xyz"} If a or b are missing, set to 0, if c is missing -> "unset"
  • 62. #MDBLocal Set Defaults {_id: 1, a: 5, b: 12} {_id: 2, a: 15, c: "abc"} {_id: 3, b: 99, c: "xyz"} If a or b are missing, set to 0, if c is missing -> "unset" db.coll.update({}, [ {$replaceWith:{ }} ], {multi:true})
  • 63. #MDBLocal Set Defaults {_id: 1, a: 5, b: 12} {_id: 2, a: 15, c: "abc"} {_id: 3, b: 99, c: "xyz"} If a or b are missing, set to 0, if c is missing -> "unset" db.coll.update({}, [ {$replaceWith:{$mergeObjects:[ ]}} ], {multi:true})
  • 64. #MDBLocal Set Defaults {_id: 1, a: 5, b: 12} {_id: 2, a: 15, c: "abc"} {_id: 3, b: 99, c: "xyz"} If a or b are missing, set to 0, if c is missing -> "unset" db.coll.update({}, [ {$replaceWith:{$mergeObjects:[ { a:0, b:0, c:"unset" }, "$$ROOT" ]}} ], {multi:true})
  • 65. #MDBLocal Set Defaults {_id: 1, a: 5, b: 12} {_id: 2, a: 15, c: "abc"} {_id: 3, b: 99, c: "xyz"} If a or b are missing, set to 0, if c is missing -> "unset" db.coll.update({}, [ {$replaceWith:{$mergeObjects:[ { a:0, b:0, c:"unset" }, "$$ROOT" ]}} ], {multi:true}) {_id: 1, a: 5, b: 12, c: "unset"} {_id: 2, a: 15, b: 0, c: "abc"} {_id: 3, a: 0, b: 99, c: "xyz"}
  • 66. #MDBLocal { id: 1, d: ISODate("2019-06-04T00:00:00"), h: [ { hour:"11", value: 296 }, { hour:"12", value: 300 } ]} id: X, d:Y, hour:Z, value: VAL db.coll.update({id:X, d:Y}, [ {$set:{h:{$cond:{ if: then: else: }}}}], {upsert:true})
  • 67. #MDBLocal { id: 1, d: ISODate("2019-06-04T00:00:00"), h: [ { hour:"11", value: 296 }, { hour:"12", value: 300 } ]} id: X, d:Y, hour:Z, value: VAL db.coll.update({id:X, d:Y}, [ {$set:{h:{$cond:{ if: {$in:[Z,"$h.hour"]}, then:{$map:{ input:"$h", in: {$cond:{ if:{$ne:["$$this.hour",Z]}, then:"$$this", else: {hour: Z, value: {$sum:[ "$$this.value", VAL]}} }}}}, else:{$concatArrays:["$h",[{hour:Z,value:VAL}]]} }}}}], {upsert:true}) if: then: else:
  • 68. #MDBLocal if: then: else: { id: 1, d: ISODate("2019-06-04T00:00:00"), h: [ { hour:"11", value: 296 }, { hour:"12", value: 300 } ]} id: X, d:Y, hour:Z, value: VAL db.coll.update({id:X, d:Y}, [ {$set: {h:{$ifNull:["$h", [] ]}} }, {$set:{h:{$cond:{ if: {$in:[Z,"$h.hour"]}, then:{$map:{ input:"$h", in: {$cond:{ if:{$ne:["$$this.hour",Z]}, then:"$$this", else: {hour: Z, value: {$sum:[ "$$this.value", VAL]}} }}}}, else:{$concatArrays:["$h",[{hour:Z,value:VAL}]]} }}}}], {upsert:true})
  • 69. #MDBLocal Recap: Updates can be specified with aggregation pipeline All fields from existing document can be accessed Slightly slower, but a lot more powerful
  • 70. #MDBLocal THE FUTURE OF AGGREGATION Better performance & optimizations More stages & expressions More options for output Compass helper for aggregate Unify different languages
  • 71. #MDBLocal THE FUTURE OF AGGREGATION Better performance & optimizations More stages & expressions More options for output Compass helper for aggregate Unify different languages
  • 72. #MDBLocal THE FUTURE OF AGGREGATION More options for output
  • 74. #MDBLocal Prior to MongoDB 4.2 $out coll new_coll $out
  • 75. #MDBLocal Prior to MongoDB 4.2 $out coll new_coll $out db.coll.aggregate( [ {pipeline}, ... {$out: "new_coll"} ]);
  • 76. #MDBLocal Prior to MongoDB 4.2 $out coll new_coll $out db.coll.aggregate( [ {pipeline}, ... {$out: "new_coll"} ]); new_coll ○ must be unsharded ○ overwrites existing
  • 77. New $merge stage in aggregation pipeline
  • 79. #MDBLocal MongoDB 4.2 $merge db.coll.aggregate( [ {pipeline}, ..., {$merge: { ... } ]); coll coll2 $merge
  • 80. #MDBLocal MongoDB 4.2 $merge db.coll.aggregate( [ {pipeline}, ..., {$merge: { ... } ]); coll2 can exist same or different 'db' can be sharded coll coll2 $merge
  • 81. #MDBLocal coll coll2 $merge { } { } { } { } { } { } { } { } MongoDB 4.2
  • 84. #MDBLocal {$merge: {into: {db: "db2", coll: "collection2"}} $merge syntax { $merge: { into: <target> } }
  • 86. #MDBLocal { $merge: { into: <target>, on: <fields> } } on: "_id" on: [ "_id", "shardkey(s)" ] must be unique $merge syntax
  • 87. #MDBLocal { $merge: { into: <target>, on: <fields> } } $merge syntax
  • 91. #MDBLocal Actions nothing matched: usually insert document matched: source target
  • 92. #MDBLocal Actions nothing matched: usually insert document matched: overwrite? update? ??? source target
  • 93. #MDBLocal Actions nothing matched: usually insert document matched: update source target
  • 94. #MDBLocal Actions nothing matched: usually insert document matched: update (merge) source target
  • 95. #MDBLocal $merge syntax { $merge: { into: <target>, whenNotMatched: whenMatched: } }
  • 96. #MDBLocal $merge syntax { $merge: { into: <target>, whenNotMatched:"insert", whenMatched: } }
  • 97. #MDBLocal $merge syntax { $merge: { into: <target>, whenNotMatched:"insert", whenMatched:"merge" } }
  • 98. #MDBLocal $merge syntax { $merge: { into: <target>, whenNotMatched:"insert"|"discard"|"fail", whenMatched:"merge" } }
  • 99. #MDBLocal $merge syntax { $merge: { into: <target>, whenNotMatched:"insert"|"discard"|"fail", whenMatched:"merge"|"replace"|"keepExisting"|"fail"|[...] } }
  • 100. #MDBLocal $merge syntax { $merge: { into: <target>, whenMatched:[...] } }
  • 101. #MDBLocal $merge syntax { $merge: { into: <target>, whenMatched:[<custom pipeline>] } }
  • 102. #MDBLocal $merge example { $merge: { into: <target>, whenMatched:[ {$addFields:{ }} ] } }
  • 103. #MDBLocal $merge example { $merge: { into: <target>, whenMatched:[ {$addFields:{ total:{$sum:["$total","$$new.total"]} }} ] } }
  • 104. #MDBLocal $merge example { $merge: { into: <target>, whenMatched:[ {$set:{ total:{$sum:["$total","$$new.total"]} }} ] } }
  • 105. #MDBLocal $merge example { $merge: { into: <target>, whenMatched:[ {$set:{ total:{$sum:["$total","$$new.total"]} }} ] } }
  • 106. #MDBLocal $merge example { $merge: { into: <target>, whenMatched:[ {$set:{ total:{$sum:["$total","$$new.total"]} }} ] } } Incoming Target { _id: "37", total: 64, f1: "x" } { _id: "37", total: 245, f1: "yyy" } Result: { }
  • 107. #MDBLocal $merge example { $merge: { into: <target>, whenMatched:[ {$set:{ total:{$sum:["$total","$$new.total"]} }} ] } } Incoming Target { _id: "37", total: 64, f1: "x" } { _id: "37", total: 245, f1: "yyy" } Result: { _id: "37", total: 309, f1: "yyy" }
  • 108. #MDBLocal $merge example 2 { $merge: { into: <target>, whenMatched:[ {$replaceWith:{$mergeObjects:[ "$$new", {total:{$sum:["$$new.total", "$total"]}} ]}} ] } }
  • 109. #MDBLocal $merge example 2 { $merge: { into: <target>, whenMatched:[ {$replaceWith:{$mergeObjects:[ "$$new", {total:{$sum:["$$new.total", "$total"]}} ]}} ] } } Incoming Target { _id: "37", total: 64, f1: "x" } { _id: "37", total: 245, f1: "yyy" } Result: { }
  • 110. #MDBLocal $merge example 2 { $merge: { into: <target>, whenMatched:[ {$replaceWith:{$mergeObjects:[ "$$new", {total:{$sum:["$$new.total", "$total"]}} ]}} ] } } Incoming Target { _id: "37", total: 64, f1: "x" } { _id: "37", total: 245, f1: "yyy" } Result: { _id: "37", total: 309, f1: "x" }
  • 111. #MDBLocal $merge syntax { $merge: { into: <target>, whenMatched:[...] } }
  • 112. #MDBLocal $merge syntax { $merge: { into: <target>, let: { ... }, whenMatched:[ ...] } }
  • 113. #MDBLocal $merge syntax { $merge: { into: <target>, let: {new: "$$ROOT"}, whenMatched:[ ...] } }
  • 115. #MDBLocal temp real data real Using $merge to append loaded and cleansed records loaded into db
  • 116. #MDBLocal aggregate 'temp' and append valid records to 'data' db.temp.aggregate( [ { ... } /* pipeline to massage and cleanse data in temp */, {$merge:{ into: "data", whenMatched: "fail" }} ]);
  • 117. #MDBLocal aggregate 'temp' and append valid records to 'data' db.temp.aggregate( [ { ... } /* pipeline to massage and cleanse data in temp */, {$merge:{ into: "data", whenMatched: "fail" }} ]); Similar to SQL's INSERT INTO T1 SELECT * from T2
  • 119. #MDBLocal mflix users users mfriendbook users sv Using $merge to populate/update user fields from other services
  • 120. #MDBLocal mflix users users mfriendbook users sv Using $merge to populate/update user fields from other services sv.users { _id: "user253", dob: ISODate(...), f1: "yyy" }
  • 121. #MDBLocal $merge updates fields from mflix.users collection into sv.users collection. Our "_id" field is unique username mflix_pipeline = [ { "$project" : { "_id" : "$username", "mflix" : "$$ROOT" }}, { "$merge" : { "into" : { "db": "sv", "collection" : "users" }, "whenNotMatched" : "discard" }} ] (in mflix) sv.users { _id: "user253", dob: ISODate(...), f1: "yyy" }
  • 122. #MDBLocal $merge updates fields from mflix.users collection into sv.users collection. Our "_id" field is unique username mflix_pipeline = [ { "$project" : { "_id" : "$username", "mflix" : "$$ROOT" }}, { "$merge" : { "into" : { "db": "sv", "collection" : "users" }, "whenNotMatched" : "discard" }} ] (in mflix) db.users.aggregate(mflix_pipeline) sv.users { _id: "user253", dob: ISODate(...), f1: "yyy", mflix: { ... } }
  • 123. #MDBLocal $merge updates fields from mfriendbook.users collection into sv.users collection. Our "_id" field is unique username mfriendbook_pipeline = [ { "$project" : { "_id" : "$username", "mfriendbook" : "$$ROOT" }}, { "$merge" : { "into" : { "db": "sv", "collection" : "users" }, "whenNotMatched" : "discard" }} ] (in mfriendbook) sv.users { _id: "user253", dob: ISODate(...), f1: "yyy", mflix: { ... } }
  • 124. #MDBLocal $merge updates fields from mfriendbook.users collection into sv.users collection. Our "_id" field is unique username mfriendbook_pipeline = [ { "$project" : { "_id" : "$username", "mfriendbook" : "$$ROOT" }}, { "$merge" : { "into" : { "db": "sv", "collection" : "users" }, "whenNotMatched" : "discard" }} ] (in mfriendbook) db.users.aggregate(mfriendbook_pipeline) sv.users { _id: "user253", dob: ISODate(...), f1: "yyy", mflix: { ... }, mfriendbook: { ... } }
  • 126. registrations real regsummary real Using $merge to incrementally update periodic rollups in summary
  • 127. #MDBLocal $merge to create/update periodic rollups in summary collection (for all days) db.regsummary.createIndex({event:1, date:1}, {unique: true});
  • 128. #MDBLocal $merge to create/update periodic rollups in summary collection (for all days) db.regsummary.createIndex({event:1, date:1}, {unique: true}); db.registrations.aggregate([ {$match: {event_id: "MDBW19"}}, {$group:{ _id:{$dateToString:{date:"$date",format:"%Y-%m-%d"}}, count: {$sum:1} }}, {$project: {_id:0,event:"MDBW19",date:"$_id",total:"$count"}}, {$merge: { into: "regsummary", on: ["event", "date"] }} ])
  • 129. #MDBLocal $merge to create/update periodic rollups in summary collection (for all days) db.regsummary.createIndex({event:1, date:1}, {unique: true}); db.registrations.aggregate([ {$match: {event_id: "MDBW19"}}, {$group:{ _id:{$dateToString:{date:"$date",format:"%Y-%m-%d"}}, count: {$sum:1} }}, {$project: {_id:0,event:"MDBW19",date:"$_id",total:"$count"}}, {$merge: { into: "regsummary", on: ["event", "date"] }} ]) { "event" : "MDBW19", "date" : "2019-05-19", "total" : 33 } { "event" : "MDBW19", "date" : "2019-05-20", "total" : 15 } { "event" : "MDBW19", "date" : "2019-05-21", "total" : 24 }
  • 130. #MDBLocal $merge to incrementally update periodic rollups in summary collection (for single day)
  • 131. #MDBLocal $merge to incrementally update periodic rollups in summary collection (for single day) db.registrations.aggregate([ {$match: { event_id: "MDBW19", date:{$gte:ISODate("2019-05-22"),$lt:ISODate("2019-05-23")} }}, {$count: "total"}, {$addFields: {event:"MDBW19", "date":"2019-05-22"}}, {$merge: { into: "regsummary", on: ["event", "date"] }} ])
  • 132. #MDBLocal $merge to incrementally update periodic rollups in summary collection (for single day) db.registrations.aggregate([ {$match: { event_id: "MDBW19", date:{$gte:ISODate("2019-05-22"),$lt:ISODate("2019-05-23")} }}, {$count: "total"}, {$addFields: {event:"MDBW19", "date":"2019-05-22"}}, {$merge: { into: "regsummary", on: ["event", "date"] }} ]) { "event" : "MDBW19", "date" : "2019-05-19", "total" : 33 } { "event" : "MDBW19", "date" : "2019-05-20", "total" : 15 } { "event" : "MDBW19", "date" : "2019-05-21", "total" : 24 } { "event" : "MDBW19", "date" : "2019-05-22", "total" : 34 }