Azure DocumentDB 
Neil Mackenzie 
Satory Global, LLC
Who Am I 
• Neil Mackenzie 
• Azure Lead – Satory Global
• @mknz 
• http://convective.wordpress.com
• Author: Microsoft Windows Azure Development Cookbook 
• Microsoft MVP for Azure
Agenda 
• DocumentDB Overview 
• .NET Development
DOCUMENTDB OVERVIEW
Core Features 
• Schema-less, NoSQL document database 
• Fully managed, with provisioned capacity 
• Stored entities are JSON documents 
• Tunable consistency 
• Designed to scale into petabytes
Microsoft Databases in Azure 
• Relational 
• SQL Database (PaaS) 
• SQL Server (IaaS) 
• NoSQL 
• Azure Tables – key-value store 
• Azure DocumentDB – document database
Resource Model 
• Database Account 
• Database 
• Collection 
• Document 
• Attachment 
• Stored Procedure 
• Trigger 
• User-defined functions 
• User 
• Permission 
• Media
Resource Addressing 
• Core interface to DocumentDB is RESTful 
• Each resource has a permanent unique ID 
• API URL: 
• https://{database account}.documents.azure.com 
• Document Path: 
• /dbs/{database id}/colls/{collection id}/docs/{document id} 
• Example full URL for a document: 
• https://lochinver.documents.azure.com/dbs/ju1TAA==/colls/ju1TAPhIFAA=/docs/ju1TAPhIFAAJAAAAAAAAAA==
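The addressing scheme above can be sketched as a small helper that composes a document URL from its parts. The class and method names are hypothetical and only illustrate the path pattern; they are not part of the DocumentDB SDK.

```csharp
using System;

// Hypothetical helper, not part of the DocumentDB SDK.
public static class DocumentDbAddress
{
    // Composes the full URL of a document from the database account name
    // and the three resource IDs, following the path pattern on the slide:
    // /dbs/{database id}/colls/{collection id}/docs/{document id}
    public static string DocumentUrl(
        string account, string databaseId,
        string collectionId, string documentId)
    {
        return string.Format(
            "https://{0}.documents.azure.com/dbs/{1}/colls/{2}/docs/{3}",
            account, databaseId, collectionId, documentId);
    }
}
```

Calling DocumentUrl with the account lochinver and the three IDs from the example reproduces the example URL.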
Operations 
• For each resource: 
• Create 
• Replace 
• Delete 
• Read 
• Query 
• Read is a GET Operation on a specified resource ID, returning a single resource. 
• Query is a POST operation on a collection with a request body containing 
DocumentDB SQL text, returning a possibly empty collection of resources. 
• Query can filter only on indexed properties
DocumentDB SQL 
• SELECT <select-list> FROM <from-specification> WHERE <filter-condition> 
• Similar to normal SQL 
• Only self-join supported 
• Ability to reach into JSON tree to: 
• Access values for filter condition 
• Shape select list 
• User-defined functions 
• LINQ support for .NET (LINQ queries are translated to DocumentDB SQL)
Consistency Levels 
• Default configured for database account, overridable (down) at request level. 
• Strong – write only visible after quorum commit. Quorum reads. 
• Bounded Staleness – write order guaranteed. Quorum reads may be behind by a 
specified number of operations (or time in seconds). 
• Session – write-order guaranteed within a client session. Reads are up-to-date 
within the session. “Usually sufficient.” (Default for a new database account) 
• Eventual – reads may be out of sequence.
Indexing Policy 
• Specified at the collection level 
• Automatic indexing 
• By default all properties indexed automatically. This is tunable for individual documents 
and paths within a document – either inclusion or exclusion of a path 
• Index precision can be specified for strings and numbers 
• Indexing mode 
• Consistent – By default indexes synchronously updated on insert, replace or delete 
• Lazy – asynchronous index update (targeted at bulk ingestion)
Performance 
• Capacity Unit 
• Specified amount of storage capacity and operational throughput 
• Collection quota per capacity unit 
• Provisioning unit for scaleout for both performance and storage 
• Configured at the database account level 
• Sharable among all databases and collections in the database account 
• Preview limit is 10GB, 3 collections per capacity unit 
• Storage is SSD backed 
• Microsoft has used databases with terabytes of storage (designed for petabytes)
Performance – Scalability Targets 
• Assumptions: 
• 1KB document with 10 properties 
• Session consistency level 
• Automatic indexing 
Database operation                     Operations / second (request units)
Read 1 document                        2000
Insert, replace, update 1 document     500
Simple query (returning 1 document)    1000
Stored procedure with 50 inserts       20
• Requests throttled if consumption exceeds overall capacity unit target
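As a rough sizing sketch, the targets in the table can be turned into an estimate of how many capacity units a mixed workload needs. It assumes that capacity is consumed linearly across operation types, which the slide does not state; the class, method, and key names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sizing helper; not part of the DocumentDB SDK.
public static class CapacityEstimate
{
    // Per-capacity-unit targets copied from the table above
    // (1 KB documents, session consistency, automatic indexing).
    static readonly Dictionary<string, double> OpsPerSecondPerUnit =
        new Dictionary<string, double>
        {
            { "read", 2000 },
            { "write", 500 },
            { "query", 1000 },
            { "sproc50", 20 },
        };

    // Assumption: each operation type uses (requested rate / target rate)
    // of one capacity unit, and the fractions simply add up.
    public static int UnitsNeeded(IDictionary<string, double> workload)
    {
        double fraction = 0;
        foreach (KeyValuePair<string, double> item in workload)
        {
            fraction += item.Value / OpsPerSecondPerUnit[item.Key];
        }
        return (int)Math.Ceiling(fraction);
    }
}
```

Under that assumption, 1000 reads and 250 writes per second each use half a unit, so the workload fits in one capacity unit.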
Stored Procedures, Triggers and UDFs 
• DocumentDB supports server-side JavaScript 
• Stored Procedures: 
• Registered at collection level 
• Operate on any document in the collection 
• Invoked inside transaction context on primary replica 
• Triggers: 
• Pre- or Post: create, replace or delete operations 
• Invoked inside transaction context on primary replica 
• User-Defined Functions 
• Scalar functions invoked only inside queries
Libraries 
• .NET API 
• Node.js 
• JavaScript client 
• JavaScript server 
• Python
Preview 
• Azure DocumentDB available in: 
• West US 
• North Europe 
• West Europe 
• Price: $0.73/day, $22.50/month – includes 50% preview discount
Management 
• DocumentDB is supported only in the new portal 
• Manage database account, collections, users, etc. 
• View consumption statistics 
• https://portal.azure.com 
• API support to manage DocumentDB resources 
• Be aware of limits: 
• e.g., 3 collections per database account
.NET DEVELOPMENT
RESTful API 
• Core interface to DocumentDB 
• Used by all client libraries 
• Standard operations against all DocumentDB resources: 
• CREATE, DELETE, PUT, GET, POST 
• Returns permanent resource URL on creation 
• HMAC authentication using management or resource key 
• DocumentDB request headers
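HMAC authentication with the management (master) key can be sketched as below. The payload layout and token format are recalled from the DocumentDB REST API documentation and should be checked against it before use; the class, method, and parameter names are hypothetical. The date is assumed to be the RFC 1123 string that is also sent in the x-ms-date request header.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper; the client libraries do this internally.
public static class DocumentDbAuth
{
    // Builds the value of the Authorization header for a master-key request.
    // Assumed payload: verb, resource type, resource ID, date and an empty
    // line, each terminated by "\n", signed with HMAC-SHA256.
    public static string MasterKeyToken(
        string verb, string resourceType, string resourceId,
        string utcDate, string base64Key)
    {
        string payload = string.Format("{0}\n{1}\n{2}\n{3}\n{4}\n",
            verb.ToLowerInvariant(),
            resourceType.ToLowerInvariant(),
            resourceId,
            utcDate.ToLowerInvariant(),
            "");

        // The account key is base64 text; the HMAC uses the decoded bytes.
        using (var hmac = new HMACSHA256(Convert.FromBase64String(base64Key)))
        {
            string signature = Convert.ToBase64String(
                hmac.ComputeHash(Encoding.UTF8.GetBytes(payload)));
            // The token is URL-encoded before it goes into the header.
            return Uri.EscapeDataString(
                "type=master&ver=1.0&sig=" + signature);
        }
    }
}
```

For a GET on a document, verb would be "GET", resourceType "docs", and resourceId the document's resource ID.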
Download 
• .NET API hosted on NuGet 
• Install-Package Microsoft.Azure.Documents.Client -Pre 
• Installs DocumentDB and JSON.NET packages
Class: DocumentClient 
• Constructed with endpoint URL and management key for Database account 
• Provides async/await methods for CRUD operations on DocumentDB resources 
• Manages the connection to DocumentDB 
// Create DocumentClient
String documentDbAddress =
    "https://{account}.documents.azure.com";
String authorizationKey = "key==";
Uri documentDbUri = new Uri(documentDbAddress);
DocumentClient documentClient =
    new DocumentClient(documentDbUri, authorizationKey);
Class: Resource 
• Base class for all DocumentDB resource classes 
• Exposes: 
• ETag - used for optimistic concurrency 
• SelfLink – URL path for resource 
• ResourceID – internal ID (base64 encoded) for resource 
• ID – ID of the resource, either provided or generated
Class: Database 
• Derived from Resource 
• Adds properties exposing collections and users 
// Create database
Database database = new Database { Id = databaseId };
ResourceResponse<Database> response = await
    documentClient.CreateDatabaseAsync(database);
// ResourceResponse<T> converts implicitly to the resource it wraps
database = response;
String selfLink = database.SelfLink;
String collections = database.CollectionsLink;
String users = database.UsersLink;
Class: DocumentCollection 
• Derived from Resource 
• Adds properties exposing DocumentsLink, StoredProceduresLink, TriggersLink, 
UserDefinedFunctionsLink 
// Create document collection
DocumentCollection documentCollection =
    new DocumentCollection { Id = "SomeId" };
ResourceResponse<DocumentCollection> response = await
    documentClient.CreateDocumentCollectionAsync(
        database.SelfLink, documentCollection);
documentCollection = response;
Data Model 
• Uses JSON.NET library for serialization 
• Simple class 
• No special base class 
• All public properties are serialized into JSON 
• Obvious mapping from .NET to JSON 
• IList, etc. -> Array 
• Int32, etc. -> Integer 
• Float, etc. -> Float 
• DateTime -> String 
• Byte[] -> String
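The data model can be illustrated with the Album class used on the later slides. This is a sketch: the Id property and its JsonProperty rename are assumptions added for illustration, and the AlbumJson helper is hypothetical. JsonConvert and JsonProperty come from JSON.NET (Newtonsoft.Json).

```csharp
using Newtonsoft.Json;

// Plain class with no special base class; every public property is
// serialized into the JSON document.
public class Album
{
    // Assumption: maps the .NET property to the lowercase "id" property
    // of the stored document.
    [JsonProperty(PropertyName = "id")]
    public string Id { get; set; }

    public string AlbumName { get; set; }
    public string BandName { get; set; }
    public string ReleaseYear { get; set; }
}

// Hypothetical helper showing the round trip through JSON.NET.
public static class AlbumJson
{
    public static string ToJson(Album album)
    {
        return JsonConvert.SerializeObject(album);
    }

    public static Album FromJson(string json)
    {
        return JsonConvert.DeserializeObject<Album>(json);
    }
}
```

Properties without a JsonProperty attribute keep their .NET names in the JSON, so DocumentDB SQL filters must use the same spelling and casing.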
Class: Document 
• Derived from Resource 
• Adds property exposing AttachmentsLink 
// Insert document
ResourceResponse<Document> response = await
    documentClient.CreateDocumentAsync(
        documentCollection.SelfLink, someDocumentEntity);
Document document = response;
Class: ResourceResponse<T> 
• Encapsulates the response from a DocumentDB resource operation 
• Provides resource-dependent quota and usage information 
• Contains the response headers including HTTP StatusCode 
• Implicitly exposes the typed resource from the response
Read 
• A Read operation returns a single document. 
ResourceResponse<Document> response =
    await documentClient.ReadDocumentAsync(documentLink);
Album album =
    JsonConvert.DeserializeObject<Album>(response.Resource.ToString());
Delete 
Album album = new Album() {
    AlbumName = "Let It Bleed",
    BandName = "Rolling Stones",
    ReleaseYear = "1969"
};
// Create a document, then delete it using its SelfLink
Document document = await
    documentClient.CreateDocumentAsync(
        documentCollection.SelfLink, album);
ResourceResponse<Document> secondResponse = await
    documentClient.DeleteDocumentAsync(
        document.SelfLink);
Replace 
dynamic readResponse = await
    documentClient.ReadDocumentAsync(documentLink);
// Replace only if the document's ETag is unchanged since the read
RequestOptions requestOptions = new RequestOptions() {
    AccessCondition = new AccessCondition() {
        Type = AccessConditionType.IfMatch,
        Condition = readResponse.Resource.ETag
    }
};
Album album = (Album)readResponse.Resource;
album.ReleaseYear = "1990";
ResourceResponse<Document> replaceResponse = await
    documentClient.ReplaceDocumentAsync(
        documentLink, album, requestOptions);
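The If-Match condition used in the Replace example implements optimistic concurrency. The sketch below mimics that behavior with a hypothetical in-memory store. It is not the DocumentDB SDK, and using a version counter as the ETag is an assumption made purely for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory store that imitates If-Match semantics.
public class EtagStore
{
    private class Entry
    {
        public string Value;
        public int Version;
    }

    private readonly Dictionary<string, Entry> entries =
        new Dictionary<string, Entry>();

    // Stores a new value and returns its initial ETag.
    public string Create(string id, string value)
    {
        entries[id] = new Entry { Value = value, Version = 1 };
        return "1";
    }

    public string ReadETag(string id)
    {
        return entries[id].Version.ToString();
    }

    public string ReadValue(string id)
    {
        return entries[id].Value;
    }

    // Replaces the value only when the caller's ETag matches the stored
    // one. A mismatch returns false, which stands in for the HTTP 412
    // (Precondition Failed) response of a real If-Match request.
    public bool TryReplace(string id, string value, string ifMatchETag)
    {
        Entry entry = entries[id];
        if (entry.Version.ToString() != ifMatchETag)
        {
            return false;
        }
        entry.Value = value;
        entry.Version++;
        return true;
    }
}
```

A writer holding a stale ETag is rejected and has to read the resource again before retrying, so a concurrent update is never silently overwritten.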
Read From a Feed 
• The .NET API can return all the resources in a collection as a paged “feed.” 
String continuation = String.Empty;
do {
    FeedOptions feedOptions = new FeedOptions {
        MaxItemCount = 10,
        RequestContinuation = continuation
    };
    // Each call returns one page of up to MaxItemCount documents
    FeedResponse<dynamic> response = await
        documentClient.ReadDocumentFeedAsync(
            documentCollectionLink, feedOptions);
    continuation = response.ResponseContinuation;
} while (!String.IsNullOrEmpty(continuation));
DocumentDB Queries 
• DocumentDB supports queries at all resource levels, including: 
• Database, DocumentCollection, and Document 
• .NET API supports the following types of queries 
• SQL 
• LINQ SQL 
• LINQ Lambda 
• The DocumentQueryable class exposes helper extension methods to create 
various types of query
SQL Query 
foreach (var album in documentClient.CreateDocumentQuery<Album>(
    documentCollection.SelfLink,
    "SELECT * FROM albums a WHERE a.bandName = 'Radiohead'")) {
    Console.WriteLine("Album name: {0}", album.AlbumName);
}
• Note that albums is the name of the DocumentDB collection.
LINQ Query 
IQueryable<Album> albums =
    from a in documentClient.CreateDocumentQuery<Album>(
        documentCollection.SelfLink)
    where a.BandName == "Radiohead"
    select a;
foreach (var album in albums) {
    Console.WriteLine("Album name: {0}", album.AlbumName);
}
LINQ Lambda With Paging
FeedOptions feedOptions = new FeedOptions() {
    MaxItemCount = 10
};
var query = documentClient.CreateDocumentQuery<Album>(
        documentCollection.SelfLink, feedOptions)
    .Where(a => a.BandName == "Radiohead")
    .AsDocumentQuery();
// Fetch and print one page at a time until no results remain
do {
    foreach (Album album in await query.ExecuteNextAsync()) {
        Console.WriteLine("Album name: {0}", album.AlbumName);
    }
} while (query.HasMoreResults);
Summary 
• Azure DocumentDB Preview 
• Fully managed document database storing JSON entities 
• High scale and performance 
• Wide variety of client libraries 
• .NET, Node.js, JavaScript, Python 
• Supported only in the new Azure portal
Resources 
• Documentation: 
• http://documentdb.com 
• Azure Portal 
• http://portal.azure.com 
• Channel 9 Show on DocumentDB 
• http://channel9.msdn.com/Shows/Data-Exposed/Introduction-to-Azure-DocumentDB

Editor's Notes

  • #11 SQL reference: http://msdn.microsoft.com/en-us/library/azure/dn782250.aspx
  • #14 Preview Limits for DocumentDB: http://azure.microsoft.com/en-us/documentation/articles/documentdb-limits/
  • #15 http://azure.microsoft.com/en-us/documentation/articles/documentdb-manage/ http://azure.microsoft.com/en-us/documentation/articles/documentdb-limits/ Note that the x-ms-request-charge response header indicates the actual request units consumed by a given request.
  • #18 Pricing page: http://azure.microsoft.com/en-us/pricing/details/documentdb/
  • #21 REST API documentation: http://msdn.microsoft.com/en-us/library/dn781481.aspx
  • #22 MSDN http://msdn.microsoft.com/en-us/library/azure/microsoft.azure.documents.client.documentclient.aspx
  • #23 MSDN http://msdn.microsoft.com/en-us/library/azure/microsoft.azure.documents.client.documentclient.aspx
  • #24 MSDN http://msdn.microsoft.com/en-us/library/azure/microsoft.azure.documents.client.documentclient.aspx
  • #27 JSON.NET documentation: http://james.newtonking.com/ Rename properties with JsonProperty: [JsonProperty(PropertyName = "id")]