DotNetLombardia
Milano Fiori, Italy
 www.slideshare.net/marco.parenzan
 www.github.com/marcoparenzan
 marco [dot] parenzan [at] 1nn0va [dot] it
 www.1nnova.it
 @marco_parenzan
Training, outreach and consulting with 1nn0va
Microsoft MVP 2015 for Microsoft Azure
Cloud Architect, .NET developer
Loves Functional Programming, HTML5 Game Programming and Internet of Things
AZURE COMMUNITY BOOTCAMP 2015
IoT Day - 08/05/2015
@1nn0va
#microservicesconf2015
9 May 2015
Classic MVC
[Diagram: View, Controller, Business Logic, Contract BL/P]
CQRS for IoT (Service Bus Powered)
[Diagram: Device, UI, Command Handler, Event Handler, Event, Queue, Topics/Subscription, Event Hub, Write Model, Read/Search Model]
The traditional world
http://azure.microsoft.com/en-us/documentation/infographics/cloud-design-patterns/
IoT day 2015
Business, no longer data, is the foundation of software design
DDD != OOP
Don’t start from Data
Data are not unique
No more ACID: ACID transactions are not useful with a model distributed over different storage systems
Azure Document Db
Key/Value
Table
Blob
Queue
Graph
Document
Modeling data: when to embed
When working with relational databases, we've been taught for years to normalize, normalize, normalize.
In a document database, try to treat your entities as self-contained documents represented in JSON.
Embed when:
There are "contains" relationships between entities.
There are one-to-few relationships between entities.
The embedded data changes infrequently.
The embedded data won't grow without bound.
The embedded data is integral to the data in a document.
Embedding typically gives better read performance.
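As a rough illustration of the embedding guidelines above, the following Python sketch models one entity as a single self-contained JSON document. The device, its fields and the in-memory `store` dictionary are invented for the example and stand in for a real collection; they do not come from the talk.

```python
import json

# Hypothetical IoT device modeled as one self-contained document.
# The embedded parts are "contained" by the device, few in number,
# and rarely change, so they match the embedding guidelines.
device = {
    "id": "device-001",
    "type": "thermostat",
    "location": {"building": "HQ", "room": "2.14"},  # contains relationship
    "sensors": [                                      # one-to-few relationship
        {"name": "temperature", "unit": "C"},
        {"name": "humidity", "unit": "%"},
    ],
}


def read_device(store, device_id):
    """Return the whole entity with a single lookup (no joins needed)."""
    return json.loads(store[device_id])


# An in-memory dict stands in for a collection of JSON documents.
store = {device["id"]: json.dumps(device)}
loaded = read_device(store, "device-001")
sensor_names = [s["name"] for s in loaded["sensors"]]
print(sensor_names)
```

Everything the reader needs arrives in one read, which is where the read-performance benefit of embedding comes from.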
Modeling data: when to reference
Reference (normalize) when:
Representing one-to-many relationships.
Representing many-to-many relationships.
Related data changes frequently.
Referenced data could be unbounded.
Referencing provides more flexibility than embedding.
It costs more round trips to read data.
Normalizing typically provides better write performance.
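The trade-off of referencing can be sketched with a toy in-memory store that counts reads. The `CountingStore` class, the document ids and the field names are all hypothetical, chosen only to show why referenced data costs extra round trips on read while keeping writes small.

```python
# Hypothetical normalized model: a device document references its
# telemetry readings by id instead of embedding them, because the
# readings change frequently and could grow without bound.
documents = {
    "device-001": {"id": "device-001", "reading_ids": ["r-1", "r-2"]},
    "r-1": {"id": "r-1", "device_id": "device-001", "temperature": 21.5},
    "r-2": {"id": "r-2", "device_id": "device-001", "temperature": 22.0},
}


class CountingStore:
    """In-memory stand-in for a collection that counts read round trips."""

    def __init__(self, docs):
        self._docs = dict(docs)
        self.reads = 0

    def read(self, doc_id):
        self.reads += 1
        return self._docs[doc_id]

    def write(self, doc):
        self._docs[doc["id"]] = doc


def load_device_with_readings(store, device_id):
    """Follow the references: one read for the device, one per reading."""
    device = store.read(device_id)
    readings = [store.read(rid) for rid in device["reading_ids"]]
    return device, readings


store = CountingStore(documents)
device, readings = load_device_with_readings(store, "device-001")
round_trips = store.reads  # 1 device read + 2 reading reads

# Updating one reading rewrites only that small document, not the device.
store.write({"id": "r-2", "device_id": "device-001", "temperature": 22.4})
```

With embedding the same load would be a single read, but every new reading would rewrite the whole device document.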
Promotes code-first development (mapping objects to JSON)
Resilient to iterative schema changes
Richer query and indexing (compared to key/value stores)
Low impedance as an object / JSON store; no ORM required
It just works
It’s fast
A collection is a container of JSON documents and the associated JavaScript application logic
JSON documents inside a collection can vary dramatically
A unit of scale for transaction and query throughput (capacity units allocated uniformly across all collections)
A unit of scale for capacity
A unit of replication
Collections in DocumentDB are not just logical containers, but also physical containers
They are the transaction boundary for stored procedures and triggers
They are the entry point to queries and CRUD operations
Each collection is assigned a reserved amount of throughput which is not shared with other collections in the same account
Collections do not enforce schema
Partitioning
In hash partitioning, partitions are assigned based on the value of a hash function, allowing you to evenly distribute requests and data across a number of partitions.
This is commonly used to partition data produced or consumed from a large number of distinct clients, and is useful for storing user profiles, catalog items, and IoT ("Internet of Things") telemetry data.
Evenly distribute across n partitions (algorithmic).
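A minimal sketch of hash partitioning, assuming a device id is used as the partition key and SHA-256 as the hash function. Both choices, and the `hash_partition` helper itself, are assumptions for illustration and not DocumentDB's own algorithm.

```python
import hashlib
from collections import Counter


def hash_partition(partition_key: str, partition_count: int) -> int:
    """Map a key to a partition index using a stable hash of the key."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer, then fold it into the partition range.
    return int.from_bytes(digest[:8], "big") % partition_count


# Hypothetical telemetry sources: 10,000 devices spread over 4 partitions.
device_ids = [f"device-{i}" for i in range(10_000)]
distribution = Counter(hash_partition(d, 4) for d in device_ids)
print(sorted(distribution.items()))
```

Because the mapping depends only on the key and the partition count, any client can compute the target partition without a lookup. The flip side is that changing the partition count remaps most keys.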
In range partitioning, partitions are assigned based on whether the partition key is within a certain range.
This is commonly used for partitioning by timestamp properties.
Keep current data hot and historical data warm, scale down older data, then purge / archive.
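Range partitioning by timestamp can be sketched as a binary search over sorted range boundaries. The monthly boundaries and partition names below are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical boundaries: BOUNDARIES[i] is the first timestamp that
# belongs to PARTITIONS[i + 1]. Older data falls into PARTITIONS[0].
BOUNDARIES = [datetime(2015, 4, 1), datetime(2015, 5, 1)]
PARTITIONS = [
    "telemetry-archive",   # older data: scaled down, candidate for purge
    "telemetry-2015-04",   # warm historical data
    "telemetry-current",   # hot current data
]


def range_partition(timestamp: datetime) -> str:
    """Pick the partition whose range contains the timestamp."""
    return PARTITIONS[bisect_right(BOUNDARIES, timestamp)]


print(range_partition(datetime(2015, 5, 8)))
```

Rolling the window forward means appending a boundary and a partition, which leaves existing data where it is.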
In lookup partitioning, partitions are assigned based on a lookup map (a.k.a. a partition or shard map) that assigns discrete partition values to specific partitions.
This is commonly used for partitioning by region.
Home a tenant / user to a specific partition, using a "master" lookup.
Cache this shard map to avoid making the lookup the bottleneck.
Tenant        Partition Id
Customer      1
Big Customer  2
Another       3
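The shard map and its cache can be sketched as follows, using the tenant table from the slide. The `MASTER_SHARD_MAP` dictionary stands in for the "master" lookup, and the counter exists only to show that the cache absorbs repeated lookups; both are assumptions for the example.

```python
from functools import lru_cache

# Stand-in for the "master" lookup; in a real system this would be a
# query against a small master collection or database.
MASTER_SHARD_MAP = {"Customer": 1, "Big Customer": 2, "Another": 3}
lookup_stats = {"master_lookups": 0}


@lru_cache(maxsize=1024)
def partition_for(tenant: str) -> int:
    """Resolve a tenant to its partition id, caching the answer."""
    lookup_stats["master_lookups"] += 1  # counts trips to the master lookup
    return MASTER_SHARD_MAP[tenant]


first = partition_for("Big Customer")
second = partition_for("Big Customer")  # served from the cache
print(first, second, lookup_stats["master_lookups"])
```

A cached map must be refreshed when a tenant is moved to another partition, so a real implementation would also need an invalidation or expiry policy.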
Consistency
Query / transaction throughput (and reliability, i.e. surviving hardware failure) depends on replication!
All writes to the primary are replicated across two secondary replicas
All reads are distributed across three copies
“Scalability of throughput”: allowing different clients to read from different replicas helps prevent bottlenecks
BUT replication takes time!
Potential scenario: some clients are reading while another is writing
Now the data those clients read is out of date, inconsistent!
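The stale-read scenario above can be reproduced with a toy model of one primary and two secondaries where replication is a separate, later step. The `ReplicatedValue` class is purely illustrative and says nothing about how DocumentDB replicates internally.

```python
class ReplicatedValue:
    """Toy model: one primary, two secondaries, replication applied later."""

    def __init__(self, value):
        self.primary = value
        self.secondaries = [value, value]
        self._pending = []  # writes not yet copied to the secondaries

    def write(self, value):
        """Writes go to the primary first and are queued for replication."""
        self.primary = value
        self._pending.append(value)

    def replicate(self):
        """Apply the pending writes to both secondaries, in order."""
        for value in self._pending:
            self.secondaries = [value, value]
        self._pending.clear()

    def read(self, replica):
        """Replica 0 is the primary; 1 and 2 are the secondaries."""
        return self.primary if replica == 0 else self.secondaries[replica - 1]


temperature = ReplicatedValue(20)
temperature.write(25)
stale = temperature.read(1)   # the secondary has not caught up yet
temperature.replicate()
fresh = temperature.read(1)   # now consistent with the primary
print(stale, fresh)
```

The window between `write` and `replicate` is exactly where a reader on a secondary sees out-of-date data.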
Trade-off: speed (performance & availability) or consistency (data correctness)?
“Does every read need the MOST current data?”
“Or do I need every request to be handled and handled quickly?”
No “one size fits all” answer … so it’s up to you!
4 options …
For the entire Db…
…In a future release, we intend to support overriding the default consistency level on a per collection basis.
Strong
Client always sees completely consistent data
Slowest reads / writes
Mission critical: e.g. stock market, banking, airline reservation
Session
Default: an even trade-off between performance & availability vs. data correctness
Client reads its own writes, but other clients reading the same data might see older values
Bounded Staleness
Client might see old data, but it can specify a limit for how old that data can be (e.g. 2 seconds)
Updates happen in the order received
Similar to Session consistency, but speeds up reads while still preserving the order of updates
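The idea of a staleness limit can be sketched as a read that only accepts replicas whose lag is within the bound. The `Replica` type, the lag figures and the `read_bounded` helper are hypothetical, meant to show the rule rather than the service's actual mechanism.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    value: int
    lag_seconds: float  # how far this copy is behind the primary


def read_bounded(replicas, max_staleness_seconds):
    """Return (name, value) from the first replica within the staleness bound."""
    for replica in replicas:
        if replica.lag_seconds <= max_staleness_seconds:
            return replica.name, replica.value
    raise RuntimeError("no replica satisfies the staleness bound")


replicas = [
    Replica("secondary-1", value=20, lag_seconds=5.0),  # too old, skipped
    Replica("secondary-2", value=24, lag_seconds=1.5),  # old but acceptable
    Replica("primary", value=25, lag_seconds=0.0),
]
chosen = read_bounded(replicas, max_staleness_seconds=2.0)
print(chosen)
```

With a 2-second limit the read may return a value that is slightly behind the primary, but never one older than the limit allows.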
Eventual
Client might see old data for as long as it takes a write to propagate to all replicas
High performance & availability, but a client might sometimes read out-of-date information or see updates out of order
The consistency level can be set:
At the database level (see preview portal)
On a per-read or per-query basis (optional parameter on the CreateDocumentQuery method)
Use Weaker Consistency Levels for better Read latencies
IoT
Data Analysis
http://azure.microsoft.com/blog/2015/01/27/performance-tips-for-azure-documentdb-part-2/
https://github.com/marcoparenzan/CSharpDay2015