Solutions Architect, MongoDB
Marc Schwering
#MongoDBBasics @MongoDB @m4rcsch
Application Development with MongoDB
Reporting & Aggregation
Agenda
• Recap from last session
• Reporting / Analytics options
• Map Reduce
• Aggregation Framework introduction
– Aggregation explain
• mycms application reports
• Geospatial with Aggregation Framework
• Text Search with Aggregation Framework
Q & A
• Virtual Genius Bar
– Use the chat to post questions
– The EMEA Solution Architecture / Support team is on hand
– Make use of them during the sessions!
Recap from last time….
Indexing
• Indexes
• Multikey, compound, ‘dot.notation’
• Covered, sorting
• Text, GeoSpatial
• Btrees
Syntax: db.col.ensureIndex( { key : type }, options )
> db.articles.ensureIndex( { author : 1, tags : 1 } )
> db.user.find( { user : "danr" }, { _id : 0, password : 1 } )
> db.articles.ensureIndex( { location : "2dsphere" } )
> db.articles.ensureIndex( { "$**" : "text" }, { name : "TextIndex" } )
Index performance / efficiency
• Examine index plans
• Identify slow queries
• n / nscanned ratio
• Which index is used
Operators: .explain(), db profiler
> db.articles.find( { author : 'Dan Roberts' } ).sort( { date : -1 } ).explain()
> db.setProfilingLevel(1, 100)
{ "was" : 0, "slowms" : 100, "ok" : 1 }
> db.system.profile.find().pretty()
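The n / nscanned ratio above can be computed directly from an explain document. The following is a minimal sketch, assuming the legacy explain format that reports `n` (documents returned) and `nscanned` (entries scanned); newer server versions report these figures under different names. The helper name `index_efficiency` and the sample values are made up for illustration.

```python
def index_efficiency(explain):
    """Return n / nscanned from a legacy explain() document.

    A ratio near 1.0 means almost everything scanned was returned;
    a ratio near 0 suggests a missing or poorly matched index.
    """
    n = explain.get("n", 0)
    nscanned = explain.get("nscanned", 0)
    if nscanned == 0:
        # Nothing was scanned (for example an empty collection).
        return 1.0
    return n / nscanned


# Hypothetical explain() fragments, not real server output.
indexed = {"cursor": "BtreeCursor author_1_tags_1", "n": 6, "nscanned": 6}
unindexed = {"cursor": "BasicCursor", "n": 6, "nscanned": 1200}

print(index_efficiency(indexed))
print(index_efficiency(unindexed))
```

A query that scans 1200 entries to return 6 documents is a likely candidate for a new or better index.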
Reporting / Analytics options
Access data for reporting, options
• Query Language
– Leverage pre-aggregated documents
• Aggregation Framework
– Calculate new values from the data that we have
– For instance: average views, comment counts
• MapReduce
– Internal JavaScript-based implementation
– External Hadoop, using the MongoDB connector
• A combination of the above
Pre Aggregated Reports
• Immediate results
– Simple from a query perspective.
– Interactions collection
{
  '_id' : ObjectId(..),
  'article_id' : ObjectId(..),
  'section' : 'schema',
  'date' : ISODate(..),
  'daily' : { 'view' : 45, 'comments' : 150 },
  'hourly' : {
    '0' : { 'view' : 10 },
    '1' : { 'view' : 2 },
    …
    '23' : { 'view' : 14, 'comments' : 10 }
  }
}
> db.interactions.find(
    { "article_id" : ObjectId("…..") },
    { _id : 0, hourly : 1 }
)
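Pre-aggregated documents only stay current if every interaction increments the right counters. Here is a minimal sketch of how the update for one interaction might be built, assuming one document per article per day and the `daily` / `hourly` / `view` field names used in this deck's query results. The function name `interaction_update` and the article id are hypothetical.

```python
from datetime import datetime, timezone


def interaction_update(article_id, when, kind="view"):
    """Build (filter, update) for one interaction on a pre-aggregated day document."""
    # Truncate the timestamp to midnight so all events of a day hit one document.
    day = datetime(when.year, when.month, when.day, tzinfo=when.tzinfo)
    query = {"article_id": article_id, "date": day}
    update = {
        "$inc": {
            "daily.%s" % kind: 1,                        # whole-day counter
            "hourly.%d.%s" % (when.hour, kind): 1,       # per-hour counter
        }
    }
    return query, update


ts = datetime(2014, 2, 20, 14, 35, tzinfo=timezone.utc)
query, update = interaction_update("article-123", ts)
print(query)
print(update)
```

With PyMongo, such a pair could be passed to `update_one(query, update, upsert=True)` so that the day document is created on first use; that wiring is an assumption and not part of the original slides.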
Pre Aggregated Reports
• Use query result to display directly in application
– Create new REST API
– D3.js library or similar in UI
{
  "hourly" : {
    "0" : { "view" : 1 },
    "1" : { "view" : 1 },
    ……
    "22" : { "view" : 5 },
    "23" : { "view" : 3 }
  }
}
Map Reduce
• Map Reduce
– MongoDB – JavaScript
• Incremental Map Reduce
// Map Reduce example
> db.articles.mapReduce(
    function() { emit(this.author, this.comment_count); },
    function(key, values) { return Array.sum(values); },
    {
      query : {},
      out : { merge : "comment_count" }
    }
)
Output
{ "_id" : "Dan Roberts", "value" : 6 }
{ "_id" : "Jim Duffy", "value" : 1 }
{ "_id" : "Kunal Taneja", "value" : 2 }
{ "_id" : "Paul Done", "value" : 2 }
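To make the map and reduce phases concrete, here is a minimal in-memory sketch of what the example computes. It is plain Python and does not talk to MongoDB; the function names and the sample documents are made up for illustration.

```python
from collections import defaultdict


def map_reduce(docs, map_fn, reduce_fn):
    """Group emitted (key, value) pairs by key, then reduce each group."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def emit_comment_count(doc):
    # Mirrors: emit(this.author, this.comment_count)
    yield doc["author"], doc["comment_count"]


def sum_values(key, values):
    # Mirrors: return Array.sum(values)
    return sum(values)


# Made-up sample documents for illustration only.
articles = [
    {"author": "Dan Roberts", "comment_count": 4},
    {"author": "Dan Roberts", "comment_count": 2},
    {"author": "Jim Duffy", "comment_count": 1},
]

totals = map_reduce(articles, emit_comment_count, sum_values)
print(totals)
```

Unlike this sketch, the server may call the reduce function more than once for the same key, so a reduce function should return the same kind of value it receives, as a sum does.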
MongoDB – Hadoop Connector
Hadoop Integration
[Diagram; labels: four replica sets (each Primary, Secondary, Secondary), HDFS and MapReduce alongside each, MongoS routers, Applications, Dash Boards / Reporting]
1) Data flow: input / output via the application tier
Aggregation Framework
• Multi-stage pipeline
– Like a Unix pipe: “ps -ef | grep mongod”
– Aggregate data, transform documents
– Implemented in the core server
// Find out which are the most popular tags…
db.articles.aggregate([
  { $unwind : "$tags" },
  { $group : { _id : "$tags", number : { $sum : 1 } } },
  { $sort : { number : -1 } }
])
Output
{ "_id" : "mongodb", "number" : 6 }
{ "_id" : "nosql", "number" : 3 }
{ "_id" : "database", "number" : 1 }
{ "_id" : "aggregation", "number" : 1 }
{ "_id" : "node", "number" : 1 }
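The pipe analogy can be illustrated by chaining three small functions, each consuming the output of the previous one. This is a rough in-memory sketch of what $unwind, $group with $sum, and $sort do to the documents; it is not how the server implements them, and the helper names and sample articles are hypothetical.

```python
from collections import Counter


def unwind(docs, field):
    """Yield one copy of each document per element of the array in `field`."""
    for doc in docs:
        for item in doc.get(field, []):
            yield {**doc, field: item}


def group_count(docs, field):
    """Count documents per distinct value of `field` (like $group with $sum: 1)."""
    counts = Counter(doc[field] for doc in docs)
    return [{"_id": key, "number": n} for key, n in counts.items()]


def sort_desc(docs, field):
    """Order documents by `field`, largest first (like $sort with -1)."""
    return sorted(docs, key=lambda d: d[field], reverse=True)


# Made-up sample documents for illustration only.
articles = [
    {"title": "a", "tags": ["mongodb", "nosql"]},
    {"title": "b", "tags": ["mongodb"]},
    {"title": "c", "tags": ["mongodb", "aggregation"]},
]

result = sort_desc(group_count(unwind(articles, "tags"), "tags"), "number")
print(result[0])
```

Each stage only sees the documents produced by the stage before it, which is why stage order matters in a real pipeline as well.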
In our mycms application..
# Our new Python example
# `app` (Flask) and `db` (PyMongo database) are created elsewhere in mycms.
import json
from bson import json_util
from flask import abort, jsonify

@app.route('/cms/api/v1.0/tag_counts', methods=['GET'])
def tag_counts():
    pipeline = [{"$unwind": "$tags"},
                {"$group": {"_id": "$tags", "number": {"$sum": 1}}},
                {"$sort": {"number": -1}}]
    # PyMongo 3+ always returns a cursor; PyMongo 2.x needed cursor={} here
    cur = db['articles'].aggregate(pipeline)
    # Check everything ok
    if not cur:
        abort(400)
    # Iterate the cursor and collect the docs in a list
    tags = [tag for tag in cur]
    return jsonify({'tags': json.dumps(tags, default=json_util.default)})
Aggregation operators
• Pipeline and Expression operators
Pipeline: $match, $sort, $limit, $skip, $project, $unwind, $group, $geoNear
Query operators used inside $match: $text with $search
Expression (accumulators): $addToSet, $first, $last, $max, $min, $avg, $push, $sum
Arithmetic: $add, $divide, $mod, $multiply, $subtract
Conditional: $cond, $ifNull
Variables: $let, $map
Tip: Other operators exist for date, time, boolean and string manipulation
Application Reports
• What reports and analytics do we need in our application?
– Popular Tags
– Popular Articles
– Popular Locations – integration with Geo Spatial
– Average views per hour or day
Popular Tags
• Unwind each ‘tags’ array
• Group and count each one, then sort
• Output to a new collection
– Query the new collection so we don't need to compute the counts for every request.
db.articles.aggregate([
  { $unwind : "$tags" },
  { $group : { _id : "$tags", number : { $sum : 1 } } },
  { $sort : { number : -1 } },
  { $out : "tags" }
])
Popular Articles
• Top 5 articles by average daily views
– Use the $avg operator
– Use $match to constrain the date range
• Utilise with $gt and $lt operators
db.interactions.aggregate([
  { $match : { date : { $gt : ISODate("2014-02-20T00:00:00.000Z") } } },
  { $group : { _id : "$article_id", a : { $avg : "$daily.view" } } },
  { $sort : { a : -1 } },
  { $limit : 5 }
]);
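The shell example bounds the date on one side only. A sketch of building the same pipeline in Python with both a lower and an upper bound, using the $gt and $lt operators mentioned in the bullets, might look like this. The function name `top_articles_pipeline` and the end date are assumptions for illustration.

```python
from datetime import datetime


def top_articles_pipeline(start, end, limit=5):
    """Build a pipeline for the top articles by average daily views between two dates."""
    return [
        # Keep only interaction documents strictly inside the date window.
        {"$match": {"date": {"$gt": start, "$lt": end}}},
        # Average the daily view counter per article.
        {"$group": {"_id": "$article_id", "a": {"$avg": "$daily.view"}}},
        # Highest average first, then keep the top `limit` articles.
        {"$sort": {"a": -1}},
        {"$limit": limit},
    ]


pipeline = top_articles_pipeline(datetime(2014, 2, 20), datetime(2014, 3, 20))
for stage in pipeline:
    print(stage)
```

Placing $match first lets the server narrow the input before grouping, and it is the stage that can make use of an index on `date`.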
Aggregation Framework Explain
• Use the explain plan to ensure efficient use of the index when querying.
db.interactions.aggregate([
  { $group : { _id : "$article_id", a : { $avg : "$daily.view" } } },
  { $sort : { a : -1 } },
  { $limit : 5 }
],
{ explain : true }
);
Explain output…
{
  "stages" : [
    {
      "$cursor" : {
        "query" : …,
        "fields" : { … },
        "plan" : {
          "cursor" : "BasicCursor",
          "isMultiKey" : false,
          "scanAndOrder" : false,
          "allPlans" : [
            {
              "cursor" : "BasicCursor",
              "isMultiKey" : false,
              "scanAndOrder" : false
            }
          ]
        }
      }
    },
    …
  ],
  "ok" : 1
}
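A "BasicCursor" in this output means the stage read the collection without an index. The check can be automated with a small walker over the explain document. This is a sketch that assumes the legacy explain layout shown here, where plans carry a "cursor" string; the helper names and the trimmed sample document are made up.

```python
def find_cursors(node):
    """Recursively collect every "cursor" string in an explain document."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "cursor" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(find_cursors(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_cursors(item))
    return found


def uses_collection_scan(explain):
    """True if any plan in the explain output is a BasicCursor (no index)."""
    return any(c == "BasicCursor" for c in find_cursors(explain))


# Trimmed-down shape of the legacy explain output, for illustration only.
explain = {
    "stages": [
        {"$cursor": {"query": {},
                     "plan": {"cursor": "BasicCursor",
                              "allPlans": [{"cursor": "BasicCursor"}]}}}
    ],
    "ok": 1,
}
print(uses_collection_scan(explain))
```

For the pipeline above a collection scan is expected, since it has no $match stage that an index could serve.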
Geo Spatial & Text Search Aggregation
Text Search
• $text operator with aggregation framework
– All articles with MongoDB
– Group by author, sort by comments count
db.articles.aggregate([
  { $match : { $text : { $search : "mongodb" } } },
  { $group : { _id : "$author", comments : { $sum : "$comment_count" } } },
  { $sort : { comments : -1 } }
])
Utilise with Geo spatial
• $geoNear operator with aggregation framework
– $geoNear is a pipeline stage of its own and must come first in the pipeline; it cannot be nested inside $match.
– Group by author, and article count.
db.articles.aggregate([
  { $geoNear : {
      near : { type : "Point", coordinates : [ -0.128, 51.507 ] },
      distanceField : "distance",
      maxDistance : 5000,
      spherical : true
  } },
  { $group : { _id : "$author", articleCount : { $sum : 1 } } }
])
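With a GeoJSON point, the 5000 in the query is a distance in metres measured over the surface of a sphere. The sketch below shows that kind of distance test in plain Python using the haversine formula, which is an approximation chosen for illustration and not the server's own implementation. Coordinates are [longitude, latitude] as in GeoJSON, and the two test locations are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000  # mean Earth radius in metres (approximation)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (longitude, latitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within(center, point, max_distance_m):
    """True if `point` lies within `max_distance_m` of `center` ([lon, lat] pairs)."""
    return haversine_m(center[0], center[1], point[0], point[1]) <= max_distance_m


london = [-0.128, 51.507]   # centre used in the slide's query
nearby = [-0.100, 51.510]   # hypothetical article location, roughly 2 km away
paris = [2.352, 48.857]     # hypothetical location far outside the radius

print(within(london, nearby, 5000))
print(within(london, paris, 5000))
```

Only articles whose location passes this kind of test reach the $group stage, so the article counts are per author within the radius.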
Summary
• Aggregating Data…
– Map Reduce
– Hadoop
– Pre-Aggregated Reports
– Aggregation Framework
• Tune with Explain plan
• Compute on the fly or compute and store
• Geospatial
• Text Search
Next Session – 9th July
– MongoDB World Recap!
– Preview of the Operations series.
Webinar: Application Development with MongoDB, Part 5: Reporting & Aggregation

Editor's Notes

  • #10: db.interactions.find({"article_id" : ObjectId("532198379fb5ba99a6bd4063")}) db.interactions.find({"article_id" : ObjectId("532198379fb5ba99a6bd4063")},{_id:0,hourly:1})