MongoDB: Intro & Application for Big Data
http://blog.nahurst.com/visual-guide-to-nosql-systems
{
    "_id" : ObjectId("4dcd3ebc9278000000005158"),
    "timestamp" : ISODate("2011-05-13T14:22:46.777Z"),
    "binary" : BinData(0,""),
    "string" : "abc",
    "number" : 3,
    "subobj" : {"subA": 1, "subB": { "subsubC": 2 }},
    "array" : [1, 2, 3],
    "dbref" : [_id1, _id2, _id3]
}
{
    "_id" : ObjectId("4dcd3ebc9278000000005158"),
    "nickname" : "doryokujin"
},{
    "_id" : ObjectId("4dcd3ebc9278000000005159"),
    "firstname" : "Takahiro",
    "lastname" : "Inoue",
    "mail" : "mr.stoicman@gmail.com",
    "twitter" : "@doryokujin"
},...
{
    "_id" : ObjectId("4dcd3ebc9278000000005158"),
    "timestamp" : ISODate("2011-05-13T14:22:46.777Z"),
    "binary" : BinData(0,""),
    "string" : "abc",
    "number" : 3,
    "subobj" : {"subA": 1, "subB": 2 },
    "array" : [1, 2, 3],
    padding
}
{
    "_id" : ObjectId("4dcd3ebc9278000000005158"),
    "timestamp" : ISODate("2011-05-13T14:22:46.777Z"),
    "binary" : BinData(0,""),
    "string" : "def",
    "number" : 4,
    "subobj" : {"subA": 1, "subB": 2 },
    "array" : [1, 2, 3, 4, 5, 6],
    "newkey" : "In-place"
}
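The two documents above show a record stored with padding and the same record after it has grown. A minimal sketch of that idea, assuming a record slot is sized as the document size times a padding factor: an update stays in place as long as the grown document still fits in its slot. `PADDING_FACTOR`, `Record`, and `encoded_size` are hypothetical names chosen for illustration, and compact JSON length stands in for the real BSON size.

```python
import json

PADDING_FACTOR = 1.5  # hypothetical value, chosen only for illustration


def encoded_size(doc):
    """Approximate a document's stored size by its compact JSON length (not real BSON)."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


class Record:
    """A stored document plus the slot reserved for it, padding included."""

    def __init__(self, doc, padding_factor=PADDING_FACTOR):
        self.doc = dict(doc)
        self.padding_factor = padding_factor
        self.allocated = int(encoded_size(self.doc) * padding_factor)
        self.moves = 0  # how many times the record had to be relocated

    def update(self, changes):
        """Apply changes; return True if the document still fits in its slot."""
        new_doc = {**self.doc, **changes}
        in_place = encoded_size(new_doc) <= self.allocated
        if not in_place:
            # The document outgrew its slot: reserve a new, larger slot,
            # as if the record had been moved elsewhere on disk.
            self.allocated = int(encoded_size(new_doc) * self.padding_factor)
            self.moves += 1
        self.doc = new_doc
        return in_place


before = {"string": "abc", "number": 3,
          "subobj": {"subA": 1, "subB": 2}, "array": [1, 2, 3]}
record = Record(before)
fits = record.update({"string": "def", "number": 4,
                      "array": [1, 2, 3, 4, 5, 6], "newkey": "In-place"})
```

With the assumed factor of 1.5, the update shown on the slide fits inside the padded slot, so `fits` is `True` and `record.moves` stays at 0. A much larger update would exceed the slot and count as a move.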
{
    "_id" : ObjectId("4dcd3ebc9278000000005158"),
    "timestamp" : ISODate("2011-05-13T14:22:46.777Z"),
    "binary" : BinData(0,""),
    "string" : "abc",
    "number" : 3,
    "subobj" : {"subA": 1, "subB": 2 },
    "array" : [1, 2, 3],
}
Cluster

  config Servers (Shard Configuration)

  Shard Servers (Data)
    shard1: Chunks [ a, f ), [ f, k )
    shard2: Chunks [ k, n ), [ n, o )
    shard3: Chunks [ o, t ), [ t, } )

  mongos Servers (Routers)
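The chunk ranges on this slide can be read as a lookup table from shard key to shard. A minimal sketch of that lookup, assuming string shard keys and the half-open ranges shown above; `CHUNKS` and `route` are hypothetical names, and this toy table is not how mongos or the config servers store chunk metadata internally.

```python
from bisect import bisect_right

# Chunk ranges copied from the slide: half-open intervals [low, high) on the shard key.
CHUNKS = [
    ("a", "f", "shard1"),
    ("f", "k", "shard1"),
    ("k", "n", "shard2"),
    ("n", "o", "shard2"),
    ("o", "t", "shard3"),
    ("t", "}", "shard3"),
]

# Sorted lower bounds, used to binary-search for the candidate chunk.
LOWER_BOUNDS = [low for low, _high, _shard in CHUNKS]


def route(shard_key):
    """Return the shard that owns the chunk whose range contains shard_key."""
    index = bisect_right(LOWER_BOUNDS, shard_key) - 1
    if index < 0:
        raise KeyError(f"no chunk covers {shard_key!r}")
    low, high, shard = CHUNKS[index]
    if not (low <= shard_key < high):
        raise KeyError(f"no chunk covers {shard_key!r}")
    return shard


owner = route("doryokujin")
```

Here `owner` is `"shard1"`, because `"doryokujin"` falls in `[ a, f )`. The upper bound `}` sorts after every lowercase letter in ASCII, so the last chunk covers all keys from `t` onward.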
[Diagram: Shards, each with a primary; config; mongos]
Every Server:
Large HDD (500GB)
Large Memory (16GB)

Slave Delay in Preparation for User Error
Master Data on Amazon S3
Non Sharding, Replica Set
From Text Logs: (Large)
From MySQL: (Small)
From Other NoSQL: (Medium)

Temporary raw data storage is HDFS, not MongoDB.

We only need result data to discover new features.
Reduction
First Aggregation
Second Aggregation
ScientificPython
Fluent: The Event Collector Service
Structured logging
Pluggable architecture
Reliable forwarding

Sadayuki Furuhashi
Treasure Data, Inc.
@frsyuki
"2011-04-01 host1 myapp: cmessage size=12MB user=me"

2011-04-01 myapp.message {
    "on_host": "host1",
    "combined": true,
    "size": 12000000,
    "user": "me"
}
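The conversion from the text line to the structured record above can be sketched in a few lines. This is an illustration only, not Fluent's own parser: `parse_line`, `parse_size`, and `EVENT_ALIASES` are hypothetical names. The slide does not explain how `cmessage` becomes `myapp.message` with `"combined": true`, so the alias table below is an assumption made to reproduce the slide's output.

```python
import re

LINE_PATTERN = re.compile(
    r"^(?P<date>\S+) (?P<host>\S+) (?P<app>\w+): (?P<event>\w+) (?P<fields>.*)$"
)

# Decimal units, matching the slide's 12MB -> 12000000.
SIZE_UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9}

# ASSUMPTION: "cmessage" is treated as a "message" event flagged as combined.
# The slide shows this result but not the rule behind it.
EVENT_ALIASES = {"cmessage": ("message", {"combined": True})}


def parse_size(text):
    """Convert a value such as '12MB' to a byte count; return other values unchanged."""
    match = re.fullmatch(r"(\d+)(KB|MB|GB)", text)
    if match is None:
        return text
    return int(match.group(1)) * SIZE_UNITS[match.group(2)]


def parse_line(line):
    """Turn one text log line into a (time, tag, record) triple."""
    match = LINE_PATTERN.match(line)
    if match is None:
        raise ValueError(f"unrecognized log line: {line!r}")
    event, extra = EVENT_ALIASES.get(match.group("event"), (match.group("event"), {}))
    record = {"on_host": match.group("host"), **extra}
    # Remaining tokens are key=value pairs.
    for pair in match.group("fields").split():
        key, _, value = pair.partition("=")
        record[key] = parse_size(value)
    tag = f"{match.group('app')}.{event}"
    return match.group("date"), tag, record


time, tag, record = parse_line("2011-04-01 host1 myapp: cmessage size=12MB user=me")
```

The payoff of the structured form is that fields such as `size` become typed values that a downstream store can index and aggregate, instead of substrings that every consumer has to re-parse.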
log   log   log   log
[Diagram: four logs, each with its own aggregate step; a shuffle by key (key1, key2, key3); three per-key aggregate steps; one final aggregate]
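One reading of this diagram is a two-stage reduction: each log source is aggregated locally, the partial results are shuffled so that all values for a key meet in one place, and a second aggregation merges them. A minimal sketch under that reading, assuming the aggregate is a per-key count and that the key is the first token of each log line; the function names and sample data are hypothetical.

```python
from collections import Counter, defaultdict


def first_aggregation(log_lines):
    """Count events per key within a single log source (local reduction)."""
    return Counter(line.split()[0] for line in log_lines if line.strip())


def shuffle(partials):
    """Group the partial counts from all sources by key."""
    grouped = defaultdict(list)
    for partial in partials:
        for key, count in partial.items():
            grouped[key].append(count)
    return grouped


def second_aggregation(grouped):
    """Sum the partial counts collected for each key."""
    return {key: sum(counts) for key, counts in grouped.items()}


# Four log sources, as in the diagram; contents are made up for illustration.
logs = [
    ["key1 a", "key2 b", "key1 c"],
    ["key2 d", "key3 e"],
    ["key1 f"],
    ["key3 g", "key3 h"],
]
partials = [first_aggregation(source) for source in logs]
totals = second_aggregation(shuffle(partials))
grand_total = sum(totals.values())  # the single final aggregate
```

Aggregating before the shuffle means only one small partial result per key leaves each source, rather than every raw log line.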
