Analytics with NoSQL: why? for what? and when?
Edouard Servan-Schreiber, Ph.D.
Director for Solution Architecture
10gen
What is Analytics?
• Alerting
– Let me know when a cell tower has failed
• Getting insights - Strategic Analytics
– Churn rates, Customer segment distribution
• Transforming, Enriching, Aggregating
– Identifying faces in videos and images
– Identifying voices in recordings
• Operating smarter
– Having a pre-approved offer ready for a customer who calls after
expressing interest on the web
• Analytics-driven actions in real-time
– Smart modeling integrating real time context
– This customer has lower status but suffered multiple delays in the past
month, and should have priority over this higher status customer right
now on this flight
Why is this hard?
• Lots of data
– but few eyes and slow brains
• Lots of data
– just as many formats
• Lots of data
– many owners with unaligned interests and concerns
• Can you get your analysis in a useful timeframe?
• Can you make improvements in a useful timeframe?
• When you get new data, how fast can you do something with it?
• The more DATA you have, the easier it is to get lost in it...
• Data is useful only if it allows you to CHANGE the way you run your activity
– this is a surprisingly useful litmus test
• Any change requires measurement to make sure it helps
– this is a remarkably effective test to identify analytical organizations
Seven vital success areas
CRISP-DM methodology
Data
Many data sources and schemas
Hard to integrate
Keeps evolving
Acting on “real time” data is particularly hard
Collaborative Filtering
“Those who saw this also liked this….”
• Real time continuous updates of the user-product matrix
to make up-to-date predictions
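As a rough illustration of what "continuous updates" can mean here, the sketch below keeps item co-occurrence counts in memory and updates them on every view event, so that recommendations reflect the latest activity. It is a simplified stand-in for a full user-product matrix, and every name in it (CoViewModel, recordView, alsoLiked, the user and item ids) is hypothetical rather than taken from the talk.

```javascript
// Hypothetical in-memory co-occurrence model; all names are illustrative.
class CoViewModel {
  constructor() {
    this.itemsByUser = new Map(); // userId -> Set of itemIds the user has seen
    this.coCounts = new Map();    // itemId -> Map(otherItemId -> co-view count)
  }

  // Increment the count of "b was seen by someone who also saw a".
  bump(a, b) {
    if (!this.coCounts.has(a)) this.coCounts.set(a, new Map());
    const row = this.coCounts.get(a);
    row.set(b, (row.get(b) || 0) + 1);
  }

  // Record one view event and update the counts immediately.
  recordView(userId, itemId) {
    if (!this.itemsByUser.has(userId)) this.itemsByUser.set(userId, new Set());
    const seen = this.itemsByUser.get(userId);
    if (seen.has(itemId)) return; // repeat views do not count twice
    for (const other of seen) {
      this.bump(itemId, other);
      this.bump(other, itemId);
    }
    seen.add(itemId);
  }

  // Items most often seen together with itemId, best first (ties by id).
  alsoLiked(itemId, limit = 3) {
    const row = this.coCounts.get(itemId) || new Map();
    return [...row.entries()]
      .sort((x, y) => y[1] - x[1] || String(x[0]).localeCompare(String(y[0])))
      .slice(0, limit)
      .map(([other]) => other);
  }
}

const model = new CoViewModel();
model.recordView("u1", "A");
model.recordView("u1", "B");
model.recordView("u2", "A");
model.recordView("u2", "B");
model.recordView("u2", "C");
console.log(model.alsoLiked("A")); // [ 'B', 'C' ]
```

Because each event only touches the rows of the items that user has already seen, the prediction for any item is up to date as soon as the event is recorded, with no batch recomputation.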
Credit Card Fraud
Complex Event Processing
• Each transaction must be approved in a
matter of seconds. At each step, the relevant
authority must decide in real time whether
the transaction is suspicious enough to
warrant an alert or a refusal of the transaction
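A minimal sketch of such a per-transaction decision follows, assuming two simple rules: a velocity check over a sliding time window and a single-amount limit. The thresholds, field names (cardId, amount, timestamp) and the decide function are all hypothetical; a real system would combine many more signals.

```javascript
// Hypothetical rule-based scorer; thresholds and field names are illustrative.
const WINDOW_MS = 60000;  // look-back window for the velocity rule
const MAX_TXNS = 3;       // more than this many in the window is suspicious
const MAX_AMOUNT = 5000;  // single-transaction amount limit

const recentByCard = new Map(); // cardId -> timestamps (ms) of recent transactions

function decide(txn) {
  // Keep only the events that are still inside the window, then add this one.
  const recent = (recentByCard.get(txn.cardId) || [])
    .filter((t) => txn.timestamp - t < WINDOW_MS);
  recent.push(txn.timestamp);
  recentByCard.set(txn.cardId, recent);

  const reasons = [];
  if (recent.length > MAX_TXNS) reasons.push("velocity");
  if (txn.amount > MAX_AMOUNT) reasons.push("amount");
  return { approved: reasons.length === 0, reasons };
}

// Four purchases within 30 seconds: the fourth one trips the velocity rule.
for (const ts of [0, 10000, 20000, 30000]) {
  console.log(decide({ cardId: "card-1", amount: 40, timestamp: ts }));
}
```

The state kept per card is small and bounded by the window, which is what makes a decision within the approval deadline plausible.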
Approaches to Model Scoring
Once you have built insights, the hard part is turning those insights into
money making actions through a multitude of field systems
Actions are taken in field systems….
DWH
Sensor Store
Order Store
Inventory Mgmt
Warranty Mgmt
Customer Portal
Analytical Store
Data is built here and action is taken here
Long running batch
analysis
Development of
Stats Models
Integration of
Enterprise Data
• Once you have built insights, the hard part is turning those insights
into money making actions through a multitude of field systems
Actions are taken in field systems….
DWH
Sensor Store
Order Store
Inventory Mgmt
Warranty Mgmt
Customer Portal
Analytical Store
Data is built here and action is taken here
BIG
ETL
Mess
Once you have built insights, the hard part is turning those insights into
money making actions through a multitude of field systems
Actions are taken in field systems….
DWH
Sensor Store
Order Store
Inventory Mgmt
Warranty Mgmt
Customer Portal
Analytical Store
Operational Pre-aggregation
BIG Moveable
Normal
ETL
Mess
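One way to read "operational pre-aggregation" is that summaries are maintained in the operational store as data arrives, so less has to be moved through ETL later. The sketch below shows the idea in plain JavaScript with hourly buckets per sensor; the bucket layout and the names ingest and hourlyAverage are assumptions made for illustration. In MongoDB, a similar effect is commonly achieved with an upsert that applies `$inc` to a per-period summary document.

```javascript
// Hypothetical pre-aggregation of sensor readings into hourly summary buckets.
const buckets = new Map(); // "sensorId|hour" -> { sensorId, hour, count, sum, max }

function ingest(reading) {
  // Truncate the timestamp to the start of its UTC hour.
  const hour = new Date(reading.ts);
  hour.setUTCMinutes(0, 0, 0);
  const hourIso = hour.toISOString();
  const key = `${reading.sensorId}|${hourIso}`;

  // Create the bucket on first use, then update it in place.
  const b = buckets.get(key) ||
    { sensorId: reading.sensorId, hour: hourIso, count: 0, sum: 0, max: -Infinity };
  b.count += 1;
  b.sum += reading.value;
  b.max = Math.max(b.max, reading.value);
  buckets.set(key, b);
  return b;
}

// Read a summary without scanning the raw readings.
function hourlyAverage(sensorId, hourIso) {
  const b = buckets.get(`${sensorId}|${hourIso}`);
  return b ? b.sum / b.count : null;
}

ingest({ sensorId: "s1", ts: "2013-05-01T10:15:00Z", value: 10 });
ingest({ sensorId: "s1", ts: "2013-05-01T10:45:00Z", value: 20 });
console.log(hourlyAverage("s1", "2013-05-01T10:00:00.000Z")); // 15
```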
MongoDB Strategic Advantages
Horizontally Scalable
– Sharding
Agile
Flexible
High Performance
Strong Consistency
Application
Highly Available
– Replica Sets
{ author: "roger",
  date: new Date(),
  text: "Spirited Away",
  tags: ["Tezuka", "Manga"] }
+ Aggregation Framework
+ MapReduce Framework
Document-oriented data model (JSON-Style)
{
  _id: ObjectId("4c4ba5c0672c685e5e8aabf3"),
  model: "101 jet engine",
  date: ISODate("2010-07-24"),
  purchaser: "Emirates",
  aircraft: {
    type: "Boeing 747-400",
    first_flight: ISODate("2010-11-01"),
    registration: 3467892
  },
  manufacturing_plant: 8374,
  parts: [
    { partid: 132467589648762348765,
      description: "blade",
      source: "some vendor",
      .....
    },
    { partid: 9584352845569846,
      description: "injector",
      source: "some vendor",
      .....
    }
  ],
  sensor_list: [sensorid1, sensorid2, sensorid3, ....]
}
www.bsonspec.org
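To suggest why embedding the parts list inside the engine document is convenient for analytics, here is a small sketch that counts parts per source across documents shaped like the one above. It runs in plain JavaScript over invented sample data (the second engine and the vendor names are hypothetical); in MongoDB's aggregation framework the same computation would roughly correspond to a `$unwind` on `parts` followed by a `$group` with a `$sum`.

```javascript
// Hypothetical engine documents shaped like the example above (values invented).
const engines = [
  { model: "101 jet engine", purchaser: "Emirates",
    parts: [{ description: "blade", source: "vendor A" },
            { description: "injector", source: "vendor B" }] },
  { model: "102 jet engine", purchaser: "Emirates",
    parts: [{ description: "blade", source: "vendor A" }] },
];

// Count parts per source across all documents.
function partsPerSource(docs) {
  const counts = {};
  for (const doc of docs) {
    for (const part of doc.parts || []) {                   // like $unwind: "$parts"
      counts[part.source] = (counts[part.source] || 0) + 1; // like $group with $sum: 1
    }
  }
  return counts;
}

console.log(partsPerSource(engines)); // { 'vendor A': 2, 'vendor B': 1 }
```

No join is needed because each engine carries its own parts, which is the point of the document-oriented model.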
Use Cases
• Retail:
– Price Optimization
• Utilities and Manufacturing:
– Using smart meter data, optimizing the flow of
electrical power to maximize yield and usage
– Sensor data from vehicles to build truck fleet analytics
in real time
• Telco:
– Geo-based advertising, delivering relevant ads based
on interest and locality
– Smart call routing taking into account saturated cell
towers and customer value
Use Cases
• Gov: City of Chicago (WindyGrid)
– Based on reports of maintenance needs (e.g.
broken streetlights), dispatching police in
targeted ways to reduce crime
• Financial Services: MetLife (The Wall)
– Moving from a policy centric view to a
customer centric view, enabling informed
upsell and cross sell offers based on historical
analysis and recent activity
How does MongoDB help with these?
• Agility to compute and aggregate in place
– All
• Agility to add new data to existing schema
– Price Optimization
• Highly scalable performance to ingest
operational data
– Sensor data
• Highly scalable performance to serve
operational analytics
– MetLife, Telco
NoSQL and Analytics
Tech                    Dev Time   Exec Latency   Exec Power   Data Transfer   Functional Depth
Hadoop                  *          *              *****        **              *****
MongoDB                 *****      *****          ***          *****           **
Cassandra with Hadoop   *          *              *****        *****           *****
DWH                     ***        *****          *****        **              ****
SAS                     *****      *****          **           *               *****
Conclusions
• Analytics are no longer just batch
• Analytics requires integrating the real time
context
• Big Data is putting pressure to process
data where it lands
• New sources and forms of data are making
it difficult to stick to RDBMS rigidity
• MongoDB can help you