Azure Document Db

DotNetLombardia
Milano Fiori, Italy

 www.slideshare.net/marco.parenzan
 www.github.com/marcoparenzan
 marco [dot] parenzan [at] 1nn0va [dot] it
 www.1nnova.it
 @marco_parenzan
Training, Outreach and Consulting with 1nn0va
Microsoft MVP 2015 for Microsoft Azure
Cloud Architect, .NET developer
Loves Functional Programming, HTML5 Game Programming and Internet of Things
Azure Community Bootcamp 2015
IoT Day - 08/05/2015
@1nn0va
#microservicesconf2015
9 May 2015

Classic MVC
[Diagram: View, Controller, Business Logic, Contract BL/P]

CQRS for IoT (Service Bus Powered)
[Diagram: Device, UI, Command Handler, Event Handler, Event; Queue, Topics/Subscription, Event Hub; Write Model, Read/Search Model]

http://azure.microsoft.com/en-us/documentation/infographics/cloud-design-patterns/

IoT day 2015
Business, no longer data, is the foundation of software design
DDD != OOP
Don’t start from Data
Data are not unique
No more ACID… ACID transactions are not useful with a distributed model over different storages

Key/Value
Table
Blob
Queue
Graph
Document

When to embed
Try to treat your entities as self-contained documents represented in JSON.
When working with relational databases, we've been taught for years to normalize, normalize, normalize.
In general, use embedded data models when:
There are "contains" relationships between entities.
There are one-to-few relationships between entities.
There is embedded data that changes infrequently.
There is embedded data that won't grow without bound.
There is embedded data that is integral to data in a document.
Denormalizing typically provides better read performance.
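
As a rough illustration of embedding, the sketch below models an order whose line items live inside the order document itself. The `order` fields and the `order_total` helper are invented for this example and do not come from the talk; the point is only that one read returns everything needed.

```python
import json

# Hypothetical order document: customer and line items are embedded,
# so the document is self-contained and one read returns all of it.
order = {
    "id": "order-1",
    "customer": {"id": "c-42", "name": "Contoso"},
    "items": [
        {"sku": "A1", "qty": 2, "price": 9.5},
        {"sku": "B7", "qty": 1, "price": 20.0},
    ],
}


def order_total(doc):
    """Sum the embedded line items without any further lookups."""
    return sum(item["qty"] * item["price"] for item in doc["items"])


# Round-trip through JSON to show the document is plain JSON data.
serialized = json.dumps(order)
restored = json.loads(serialized)
total = order_total(restored)
```

The line items here are few and belong to the order, which matches the "contains" and "one-to-few" cases listed on the slide.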

When to reference
In general, use normalized data models when:
Representing one-to-many relationships.
Representing many-to-many relationships.
Related data changes frequently.
Referenced data could be unbounded.
Referencing provides more flexibility than embedding.
It costs more round trips to read data.
Normalizing typically provides better write performance.
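
To make the "more round trips" cost concrete, here is a minimal sketch that stores a device and its readings as separate documents linked by id. `CountingStore`, the document shapes, and the ids are all hypothetical; an in-memory dict stands in for a collection and each `read` stands in for one request.

```python
# Hypothetical documents: the device only references its readings by id.
documents = {
    "device-1": {"id": "device-1", "name": "thermostat",
                 "reading_ids": ["r-1", "r-2"]},
    "r-1": {"id": "r-1", "device_id": "device-1", "value": 21.5},
    "r-2": {"id": "r-2", "device_id": "device-1", "value": 22.0},
}


class CountingStore:
    """Wraps a dict and counts reads, standing in for round trips."""

    def __init__(self, docs):
        self._docs = docs
        self.reads = 0

    def read(self, doc_id):
        self.reads += 1
        return self._docs[doc_id]


def load_device_with_readings(db, device_id):
    """Read the device, then follow each reference with another read."""
    device = db.read(device_id)
    readings = [db.read(rid) for rid in device["reading_ids"]]
    return device, readings


db = CountingStore(documents)
device, readings = load_device_with_readings(db, "device-1")
round_trips = db.reads
```

Reading one device with two readings takes three reads here, whereas updating a single reading touches only one small document, which is the write-side benefit the slide mentions.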

Promotes code-first development (mapping objects to JSON)
Resilient to iterative schema changes
Richer query and indexing (compared to KV stores)
Low impedance as object / JSON store; no ORM required
It just works
It’s fast

A collection is:
A container of JSON documents and the associated JavaScript application logic
JSON docs inside of a collection can vary dramatically
A unit of scale for transaction and query throughput (capacity units allocated uniformly across all collections)
A unit of scale for capacity
A unit of replication

Collections in DocumentDB are not just logical containers, but also physical containers
They are the transaction boundary for stored procedures and triggers
They are the entry point to queries and CRUD operations
Each collection is assigned a reserved amount of throughput which is not shared with other collections in the same account
Collections do not enforce schema

In hash partitioning, partitions are assigned based on the value
of a hash function, allowing you to evenly distribute requests
and data across a number of partitions. This is commonly used
to partition data produced or consumed from a large number
of distinct clients, and is useful for storing user profiles, catalog
items, and IoT ("Internet of Things") telemetry data.
Evenly distribute across n number of partitions (algorithmic) ….
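
A minimal sketch of hash partitioning, assuming string partition keys and a fixed number of partitions. The `hash_partition` function and the device ids are illustrative, not part of any DocumentDB SDK. MD5 is used only as a stable hash, because Python's built-in `hash()` for strings changes between processes.

```python
import hashlib


def hash_partition(partition_key: str, partition_count: int) -> int:
    """Map a key to a partition index in [0, partition_count)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then take the modulus.
    return int.from_bytes(digest[:8], "big") % partition_count


# Hypothetical telemetry sources spread over 4 partitions.
device_ids = [f"device-{i}" for i in range(1000)]
counts = [0] * 4
for device_id in device_ids:
    counts[hash_partition(device_id, 4)] += 1
```

The same key always lands on the same partition, and a large set of distinct keys spreads roughly evenly. Changing the partition count remaps most keys, which is a known cost of the plain modulus approach.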

In range partitioning, partitions are assigned based on whether
the partition key is within a certain range
This is commonly used for partitioning with time
stamp properties
Keep current data hot, Warm historical data, Scale-down older
data, Purge / Archive
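
Range partitioning on a timestamp can be sketched with a sorted list of boundaries and a binary search. The boundaries and partition names below are assumptions made up for the example.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical boundaries: each partition holds timestamps from its lower
# bound up to (but not including) the next bound.
boundaries = [datetime(2015, 1, 1), datetime(2015, 4, 1), datetime(2015, 7, 1)]
partitions = ["archive", "2015-q1", "2015-q2", "current"]


def range_partition(timestamp: datetime) -> str:
    """Return the partition whose range contains the timestamp."""
    return partitions[bisect_right(boundaries, timestamp)]


iot_day = range_partition(datetime(2015, 5, 8))
old_reading = range_partition(datetime(2014, 12, 31))
```

Because old data ends up in its own partitions, those can be scaled down, archived, or purged without touching the partition that serves current traffic.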

In lookup partitioning, partitions are assigned based on a lookup map that assigns discrete partition values to specific partitions (a.k.a. a partition or shard map)
This is commonly used for partitioning by region
Home tenant / user to a specific partition. Use "master" lookup.
Cache this shard map to avoid making the lookup the bottleneck
Tenant          Partition Id
Customer        1
Big Customer    2
Another         3
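
A minimal sketch of lookup partitioning with a cached shard map, using the tenant names from this slide. `ShardMapCache` and `fetch_master_map` are hypothetical names; in a real system the fetch would be a request to the "master" lookup store.

```python
# Shard map with the tenants and partition ids from the slide.
MASTER_SHARD_MAP = {"Customer": 1, "Big Customer": 2, "Another": 3}


def fetch_master_map():
    """Stand-in for a (slow) request to the master lookup store."""
    return MASTER_SHARD_MAP


class ShardMapCache:
    """Fetches the shard map once and serves later lookups from memory."""

    def __init__(self, fetch_map):
        self._fetch_map = fetch_map
        self._cache = None
        self.fetches = 0

    def partition_for(self, tenant):
        if self._cache is None:
            self._cache = dict(self._fetch_map())
            self.fetches += 1
        return self._cache[tenant]


cache = ShardMapCache(fetch_master_map)
tenants = ("Customer", "Big Customer", "Another", "Customer")
partition_ids = [cache.partition_for(t) for t in tenants]
```

Four lookups cost a single fetch of the map. A production cache would also need a way to refresh when tenants are moved, which this sketch leaves out.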

Query / transaction throughput (and reliability, i.e., hardware failure) depend on replication!
All writes to the primary are replicated across two secondary replicas
All reads are distributed across three copies
“Scalability of throughput”: allowing different clients to read from different replicas helps prevent bottlenecks
BUT replication takes time!
Potential scenario: some clients are reading while another is writing
Now the data is out-of-date, inconsistent!
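
The stale-read scenario can be reproduced with a toy model: one primary, two secondaries, and replication that only happens when `replicate()` is called. `ReplicatedStore` is a teaching sketch under those assumptions and does not reflect how DocumentDB implements replication.

```python
class ReplicatedStore:
    """Toy model: copy 0 is the primary, copies 1 and 2 are secondaries."""

    def __init__(self):
        self.copies = [{}, {}, {}]
        self._pending = []  # writes not yet applied to the secondaries

    def write(self, key, value):
        """Apply the write to the primary and queue it for replication."""
        self.copies[0][key] = value
        self._pending.append((key, value))

    def replicate(self):
        """Apply queued writes to both secondaries, in the order received."""
        for key, value in self._pending:
            for secondary in self.copies[1:]:
                secondary[key] = value
        self._pending.clear()

    def read(self, key, copy_index):
        """Read from one specific copy, as a load-balanced client might."""
        return self.copies[copy_index].get(key)


store = ReplicatedStore()
store.write("temp", 20)
store.replicate()
store.write("temp", 25)                  # replication has not run yet
from_primary = store.read("temp", 0)
from_secondary = store.read("temp", 1)   # still sees the older value
store.replicate()
after_replication = store.read("temp", 1)
```

Between the second write and the second `replicate()` call, two clients reading different copies disagree about the value, which is exactly the window that the consistency levels control.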

Trade-off: speed (performance & availability) or consistency (data correctness)?
“Does every read need the MOST current data?”
“Or do I need every request to be handled and handled quickly?”
No “one size fits all” answer … so it’s up to you!
4 options …
For the entire Db…
…In a future release, we intend to support overriding the default consistency level on a per collection basis.

Strong
Client always sees completely consistent data
Slowest reads / writes
Mission critical: e.g. stock market, banking, airline reservation

Session
Default: an even trade-off between performance & availability vs. data correctness
Client reads its own writes, but other clients reading this same data might see older values

Bounded Staleness
Client might see old data, but it can specify a limit for how old that data can be (e.g. 2 seconds)
Updates happen in the order received
Similar to Session consistency, but speeds up reads while still preserving the order of updates

Eventual
Client might see old data for as long as it takes a write to propagate to all replicas
High performance & availability, but a client might sometimes read out-of-date information or see updates out of order

At the database level (see preview portal)
On a per-read or per-query basis (optional parameter on the CreateDocumentQuery method)

Use Weaker Consistency Levels for better Read latencies
IoT
Data Analysis
http://azure.microsoft.com/blog/2015/01/27/performance-tips-for-azure-documentdb-part-2/

https://github.com/marcoparenzan/CSharpDay2015