NoSQL Data Modeling
IdoFriedman.yml
Name: Ido Friedman
Past: "SQL Server consultant, Instructor, Team Leader"
Present: "Data engineer and Architect, Elasticsearch, Couchbase, MongoDB, Python", …
WorkPlace: Perion
WhenNotWorking: "@Sea"
Let’s talk
• What is the role of data modeling?
• What does data modeling affect?
Data models
Document, Columnar, Graph, Relational, NewSQL*, and more..
Data Domains
Online, Batch, Real-time analytics, Micro-batch, Streaming
Schema and structure
Schema-free, Structured, Unstructured, Semi-structured
• Who needs schemas?
• Schema description
Normalization/De-Normalization
• Born to conserve storage and keep data integrity (referential integrity, RI)
• Resolve data joining issues
• Performance aspects
Is it still relevant???
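To make the trade-off concrete, here is a minimal Python sketch, assuming events that each carry a copy of the device attributes. The field names follow the slides; the helper names `normalize` and `denormalize` and the short `device_id` value are hypothetical, chosen for illustration only.

```python
# Denormalized: every event carries its own copy of the device attributes.
denormalized_events = [
    {"event_name": "Saved Tap", "device_id": "d1", "manufacturer": "OPPO", "model": "R1001"},
    {"event_name": "Saved Tap", "device_id": "d1", "manufacturer": "OPPO", "model": "R1001"},
]


def normalize(events, device_fields=("manufacturer", "model")):
    """Split wide events into slim events plus one device record per device_id."""
    devices = {}
    slim_events = []
    for event in events:
        # Store the device attributes once, keyed by device_id.
        devices[event["device_id"]] = {field: event[field] for field in device_fields}
        # Keep only the event's own fields plus the device_id reference.
        slim_events.append({k: v for k, v in event.items() if k not in device_fields})
    return slim_events, devices


def denormalize(slim_events, devices):
    """Rebuild the wide events with an application-side join on device_id."""
    return [{**event, **devices[event["device_id"]]} for event in slim_events]


slim_events, devices = normalize(denormalized_events)
```

The normalized form stores "OPPO" once and cannot drift between copies, but every read that needs the manufacturer pays for a lookup. The denormalized form reads in one step and repeats the value in every document.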
Normalization example
{
  "_index": "user_profiles",
  "_type": "properties",
  "_id": "25467834901804247006200168495554902214",
  "_version": 4,
  "_score": 1,
  "_source": {
    "app_package_id": 4495665825523018,
    "device_id": "b94b29c3-f03f-4e43-a646-53708e025779",
    "group_id": 876,
    "customer_user_id": "",
    "customer_device_id": "b2f5fbfb-5d05-01e9-32e9-a8b332e9a8b3",
    "advertise_id": "",
    "device_os": "android",
    "device_os_comparable_version": "00000004.00000002.00000002.00000017",
    "device_os_full_name": "android 4.2.2.17",
    "android_id": "c93334e4e246b83c",
    "manufacturer": "OPPO",
    "model": "R1001",
    "screen_width": 480,
    "screen_height": 800,
    "device_language": "vi",
    "cpu": "",
    "is_rooted": 1,
    "is_jailbroken": 0,
    "vendor": "perion",
    "app_installation_time": 1435173851000,
    "push_allowed": 1,
    "operator": "Beeline VN",
    "mcc": "452",
    "mnc": "07",
    "mac_address": "",
    "created_at": 1435174685000,
    "registered_in_desktop": "",
    "short_country_code": "VN",
    "updated_at": 1435259712000,
    "app_package_name": "com.gingersoftware.android.keyboard",
    "app_version_type": ""
  }
}
Highlighted field: "manufacturer": "OPPO"
{
  "_index": "events-2015-06-04",
  "_type": "events",
  "_id": "AU4-zg5nOG4dkiMGKrCx",
  "_version": 1,
  "_score": 1,
  "_source": {
    "numeric_value_unit": "key",
    "event_type_name": "custom",
    "text_value": "",
    "event_date": 1433440149000,
    "numeric_value": 13,
    "quantity": 0,
    "event_name": "Saved Tap",
    "numeric_value_name": "taps saved",
    "app_package_id": 1433440149000,
    "api_key": "asasa1w121",
    "device_id": "c5eb1fe0-8a77-41ef-9f79-ed7ed69d32e6"
  }
}
Highlighted field, recorded with inconsistent values:
"event_name": "Saved Tap"
"event_name": "SVD T"
"event_name": "Saved TP"
"event_name": "Saved"
Constraints
• No BIG Brother
• Data can't be verified once it leaves the application
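Since no database-side constraint will catch a bad document, the check has to happen in the application before the write. The following is a minimal Python sketch of such a check, assuming a fixed set of required fields and a list of allowed event names. `REQUIRED_FIELDS`, `ALLOWED_EVENT_NAMES`, and `validate_event` are hypothetical names, not part of any library or of the original slides.

```python
# Hypothetical rules: which fields an event must carry, and with which type.
REQUIRED_FIELDS = {"device_id": str, "event_name": str, "event_date": int}

# Hypothetical list of canonical event names the application agrees on.
ALLOWED_EVENT_NAMES = {"Saved Tap"}


def validate_event(doc):
    """Return a list of problems found in doc; an empty list means it may be written."""
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in doc:
            errors.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected_type):
            errors.append(f"wrong type for {field}: expected {expected_type.__name__}")
    name = doc.get("event_name")
    if isinstance(name, str) and name not in ALLOWED_EVENT_NAMES:
        errors.append(f"unknown event_name: {name!r}")
    return errors


good = {"device_id": "c5eb1fe0", "event_name": "Saved Tap", "event_date": 1433440149000}
bad = {"device_id": "c5eb1fe0", "event_name": "SVD T"}
```

Here `validate_event(good)` returns an empty list, while `validate_event(bad)` reports the missing `event_date` and the unrecognized name. Rejecting or correcting the document at this point is the only chance to do so.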
Transactions
• Atomicity
• Locking
• Rollback
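One common substitute for locking in document stores is optimistic concurrency: a write succeeds only if the document still has the version the writer last read, similar in spirit to the `_version` field in the Elasticsearch hits shown earlier. The sketch below illustrates the idea with an in-memory store. It is an assumption-laden illustration, not the API of any real database, and `VersionedStore` and `VersionConflict` are hypothetical names.

```python
class VersionConflict(Exception):
    """Raised when a document changed since the caller last read it."""


class VersionedStore:
    """In-memory stand-in for a document store that tracks a version per document."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, body)

    def get(self, doc_id):
        version, body = self._docs[doc_id]
        return version, dict(body)  # return a copy so callers cannot mutate the store

    def put(self, doc_id, body, expected_version=None):
        current = self._docs.get(doc_id)
        current_version = current[0] if current else 0
        # Refuse the write if someone else updated the document in the meantime.
        if expected_version is not None and expected_version != current_version:
            raise VersionConflict(
                f"{doc_id}: expected version {expected_version}, found {current_version}"
            )
        new_version = current_version + 1
        self._docs[doc_id] = (new_version, dict(body))
        return new_version
```

A writer that hits `VersionConflict` re-reads the document and retries, which plays the role of a rollback for a single document. Nothing here covers changes that span several documents.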
Relation types
• One to One
• One to Many
• Many to Many
Relations example
1 to 1
Employee : Resume
{
  "_id": 101,
  "Name": "Jason Voorhees",
  "Age": 99,
  "Resume_ID": 1004
}
{
  "_id": 1004,
  "Jobs": ["Cook"],
  "Education": ["Knifery"],
  "Hobbies": ["Murder"],
  "Employee_id": 101
}
1 to Many
City : Person
{
  "name": "Dam Square, Amsterdam",
  "location": {
    "type": "polygon",
    "coordinates": [[
      [ 4.89218, 52.37356 ],
      [ 4.89205, 52.37276 ], ….…
    ]]
  }
}
{
  "_id": 101,
  "Name": "Jason Voorhees",
  "Age": 99,
  "Resume_ID": 1004
}
Many to Many
Student : Teacher
{
  "_id": 101,
  "Name": "Jason Voorhees",
  "Age": 99,
  "Resume_ID": 1004,
  "Courses": ["Chainsaw 101", "Axing"],
  "Teachers": [101]
}
{
  "_id": 101,
  "Name": "Freddy Krueger",
  "Age": 60,
  "Resume_ID": 1004,
  "Skills": [{"Skill": }]
}
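When documents refer to each other by id, as the student's `Teachers` array does above, the join moves into the application. A minimal Python sketch of that application-side join follows, assuming the documents are already loaded as dictionaries. The helper names and the ids 201, 202, and 102 are hypothetical; only the two names come from the slides.

```python
# Students reference teachers by id; the ids below are made up for illustration.
students = [
    {"_id": 101, "Name": "Jason Voorhees", "Teachers": [201, 202]},
    {"_id": 102, "Name": "Student B", "Teachers": [201]},
]
teachers = [
    {"_id": 201, "Name": "Freddy Krueger"},
    {"_id": 202, "Name": "Teacher B"},
]


def index_by_id(docs):
    """Build a lookup table so each reference resolves in constant time."""
    return {doc["_id"]: doc for doc in docs}


def teachers_of(student, teachers_by_id):
    """Resolve a student's teacher references, skipping dangling ids."""
    return [teachers_by_id[tid] for tid in student.get("Teachers", []) if tid in teachers_by_id]


def students_of(teacher_id, all_students):
    """Answer the reverse question by scanning the students' reference arrays."""
    return [s for s in all_students if teacher_id in s.get("Teachers", [])]


teachers_by_id = index_by_id(teachers)
```

Storing the references on one side only keeps writes simple, but the reverse lookup in `students_of` becomes a scan unless the store indexes the array field.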
Embedding
• Embed
  • Known doc size
  • Data is highly related
  • No joins
• Don’t Embed
  • Very large data sets
  • Data is updated rapidly
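As an illustration of the "embed" case, the sketch below folds the 1-to-1 resume from the relations example into its employee document, so one read returns everything and no join is needed. It is a minimal Python sketch under the assumption that a resume is small and changes together with its employee. `embed_resume` and the `Resume` key are hypothetical names.

```python
employee = {"_id": 101, "Name": "Jason Voorhees", "Age": 99, "Resume_ID": 1004}
resumes_by_id = {
    1004: {
        "_id": 1004,
        "Jobs": ["Cook"],
        "Education": ["Knifery"],
        "Hobbies": ["Murder"],
        "Employee_id": 101,
    }
}


def embed_resume(employee_doc, resumes):
    """Return a new employee document with the referenced resume nested inside it."""
    # Drop the reference; the nested copy replaces it.
    doc = {k: v for k, v in employee_doc.items() if k != "Resume_ID"}
    resume = resumes.get(employee_doc.get("Resume_ID"))
    if resume is not None:
        # The resume's own id and back-reference are redundant once it is nested.
        doc["Resume"] = {k: v for k, v in resume.items() if k not in ("_id", "Employee_id")}
    return doc


embedded = embed_resume(employee, resumes_by_id)
```

The same transformation is a poor fit when the nested part grows without bound or is rewritten far more often than its parent, which is what the "Don't Embed" bullets warn about.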
Doesn’t fit ….
• Use the data model that most fits your needs
• Don’t be afraid of Polyglot Persistence
Polyglot Persistence
• Data usage patterns
  • Readers vs. Writers
  • Online vs. Batch
  • Concurrency
• Issues
  • Data freshness
  • Data consistency
  • System Coupling
Questions?

NoSQL Tel Aviv Meetup#1: NoSQL Data Modeling
