CouchDB: JSON,
HTTP & MapReduce
   Bradley Holt (http://bradley-holt.com/)
@BradleyHolt (http://twitter.com/BradleyHolt)
About Me
Co-Founder and
Technical Director
from Vermont
Organizer   BTV
Contributor
Author




         http://oreilly.com/catalog/9781449303129/   http://oreilly.com/catalog/9781449303433/
CouchDB Basics
Cluster Of Unreliable
Commodity Hardware
Document-oriented, schema-less

Shared nothing, horizontally scalable

Runs on the Erlang OTP platform

Peer-based, bi-directional replication

RESTful HTTP API

Queries are done against MapReduce “views”, or indexes
When You Might
Consider CouchDB
You’ve found yourself denormalizing your SQL database for better
performance.

Your domain model is a “fit” for documents (e.g. a CMS).

Your application is read-heavy.

You need a high level of concurrency and can give up consistency
in exchange.

You need horizontal scalability.

You want your database to be able to run anywhere, even on
mobile devices, and even when disconnected from the cluster.
Trade-Offs
No ad hoc queries. You need to know what you’re going to want to
query ahead of time. For example, SQL would be a better fit for
business intelligence reporting.

No concept of “joins”. You can relate data, but watch out for
consistency issues.

Transactions are limited to document boundaries.

CouchDB trades storage space for performance.
Other Alternatives to SQL
MongoDB
http://www.mongodb.org/

Redis
http://redis.io/

Cassandra
http://cassandra.apache.org/

Riak
http://www.basho.com/

HBase (a database for Hadoop)
http://hbase.apache.org/
Don’t be so quick to get rid of SQL!
There are many problems for which an
SQL database is a good fit. SQL is a very
powerful and flexible query language.
JSON Documents
{"title":"CouchDB: The Definitive Guide"}
JSON (JavaScript Object Notation) is a
human-readable and lightweight data
interchange format.

Data structures from many
programming languages can be
easily converted to and from JSON.
JSON Values
A JSON object is a collection of key/value pairs.

JSON values can be strings, numbers, booleans (false or true),
arrays (e.g. ["a", "b", "c"]), null, or another JSON object.
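As a rough illustration of the value types listed above, the following sketch round-trips one of each through JavaScript's built-in `JSON.parse` and `JSON.stringify`. The member names (`string`, `number`, and so on) are made up for the example and are not from the slides.

```javascript
// One member for each kind of JSON value described above.
const text = '{"string":"a","number":1.5,"boolean":true,' +
  '"array":["a","b","c"],"nothing":null,"object":{"nested":false}}';

// Parse the text into native JavaScript values.
const parsed = JSON.parse(text);

// Serialize it back; key order and values survive the round trip.
const roundTripped = JSON.stringify(parsed);

console.log(typeof parsed.string, typeof parsed.number, typeof parsed.boolean);
console.log(Array.isArray(parsed.array), parsed.nothing === null);
console.log(roundTripped === text);
```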
A “Book” JSON Object
{
    "_id":"978-0-596-15589-6",
    "title":"CouchDB: The Definitive Guide",
    "subtitle":"Time to Relax",
    "authors":[
       "J. Chris Anderson",
       "Jan Lehnardt",
       "Noah Slater"
    ],
    "publisher":"O'Reilly Media",
    "released":"2010-01-19",
    "pages":272
}
RESTful HTTP API
curl -iX PUT http://localhost:5984/mydb
Representational State Transfer (REST)
is a software architecture style that
describes distributed hypermedia
systems such as the World Wide Web.
HTTP is distributed,
scalable, and cacheable.
Everyone speaks HTTP.
Create a Database
$ curl -iX PUT http://localhost:5984/mydb

HTTP/1.1 201 Created
Location: http://localhost:5984/mydb

{"ok":true}
Create a Document
$ curl -iX POST http://localhost:5984/mydb \
  -H "Content-Type: application/json" \
  -d '{"_id":"42621b2516001626"}'

HTTP/1.1 201 Created
Location: http://localhost:5984/mydb/42621b2516001626

{
    "ok":true,
    "id":"42621b2516001626",
    "rev":"1-967a00dff5e02add41819138abb3284d"
}
Read a Document
$ curl -iX GET http://localhost:5984/mydb/42621b2516001626

HTTP/1.1 200 OK
Etag: "1-967a00dff5e02add41819138abb3284d"

{
    "_id":"42621b2516001626",
    "_rev":"1-967a00dff5e02add41819138abb3284d"
}
When updating a document, CouchDB
requires the correct document revision
number as part of its Multi-Version
Concurrency Control (MVCC). This form
of optimistic concurrency ensures that
another client hasn't modified the
document since it was last retrieved.
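The revision check can be pictured with a small in-memory sketch. This is not CouchDB's implementation: `createStore`, its `put` method, and the `"-sketch"` revision suffix are all hypothetical stand-ins (real revisions are a generation number followed by a hash), and the conflict object only approximates what a real server would return.

```javascript
// Hypothetical in-memory store used only to illustrate optimistic concurrency.
function createStore() {
  const docs = new Map();
  return {
    put: function (doc) {
      const current = docs.get(doc._id);
      // An update must name the revision it was based on.
      if (current && current._rev !== doc._rev) {
        return { error: "conflict", reason: "Document update conflict." };
      }
      // Bump the generation number; the suffix here is a placeholder, not a hash.
      const generation = current ? parseInt(current._rev, 10) + 1 : 1;
      const rev = generation + "-sketch";
      docs.set(doc._id, Object.assign({}, doc, { _rev: rev }));
      return { ok: true, id: doc._id, rev: rev };
    }
  };
}

const store = createStore();
const created = store.put({ _id: "42621b2516001626" });
const updated = store.put({ _id: "42621b2516001626", _rev: created.rev, title: "First update" });
// A second client still holding the old revision is rejected.
const stale = store.put({ _id: "42621b2516001626", _rev: created.rev, title: "Stale update" });
console.log(created.rev, updated.rev, stale.error);
```

The stale write fails because its `_rev` no longer matches the stored revision; the client would need to re-read the document and retry.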
Update a Document
$ curl -iX PUT http://localhost:5984/mydb/42621b2516001626 \
  -H "Content-Type: application/json" \
  -d '{
   "_id":"42621b2516001626",
   "_rev":"1-967a00dff5e02add41819138abb3284d",
   "title":"CouchDB: The Definitive Guide"
}'

HTTP/1.1 201 Created

{
    "ok":true,
    "id":"42621b2516001626",
    "rev":"2-bbd27429fd1a0daa2b946cbacb22dc3e"
}
Conditional GET
$ curl -iX GET http://localhost:5984/mydb/42621b2516001626 \
  -H 'If-None-Match: "2-bbd27429fd1a0daa2b946cbacb22dc3e"'

HTTP/1.1 304 Not Modified
Etag: "2-bbd27429fd1a0daa2b946cbacb22dc3e"
Content-Length: 0
Delete a Document
$ curl -iX DELETE http://localhost:5984/mydb/42621b2516001626 \
  -H 'If-Match: "2-bbd27429fd1a0daa2b946cbacb22dc3e"'

HTTP/1.1 200 OK

{
    "ok":true,
    "id":"42621b2516001626",
    "rev":"3-29d2ef6e0d3558a3547a92dac51f3231"
}
Read a Deleted Document
$ curl -iX GET http://localhost:5984/mydb/42621b2516001626

HTTP/1.1 404 Object Not Found
{
    "error":"not_found",
    "reason":"deleted"
}
Read a Deleted Document
$ curl -iX GET 'http://localhost:5984/mydb/42621b2516001626?rev=3-29d2ef6e0d3558a3547a92dac51f3231'

HTTP/1.1 200 OK
Etag: "3-29d2ef6e0d3558a3547a92dac51f3231"
{
    "_id":"42621b2516001626",
    "_rev":"3-29d2ef6e0d3558a3547a92dac51f3231",
    "_deleted":true
}
Fetch Revisions
$ curl -iX GET 'http://localhost:5984/mydb/42621b2516001626?rev=3-29d2ef6e0d3558a3547a92dac51f3231&revs=true'

HTTP/1.1 200 OK
{
    // …
    "_revisions":{
      "start":3,
      "ids":[
        "29d2ef6e0d3558a3547a92dac51f3231",
        "bbd27429fd1a0daa2b946cbacb22dc3e",
        "967a00dff5e02add41819138abb3284d"
      ]
    }
}
Do not rely on older revisions.
Revisions are used for concurrency
control and for replication. Old
revisions are removed during database
compaction.
MapReduce Views
function(doc) { if (doc.title) { emit(doc.title); } }
MapReduce consists of Map and
Reduce steps which can be distributed
in a way that takes advantage of the
multiple processor cores found in
modern hardware.
Futon
Create a Database
Map and Reduce are written as
JavaScript functions that are defined
within views. Temporary views can be
used during development, but views
should be saved permanently to design
documents for production. Temporary
views can be very slow.
Map
Add the First Document
{
    "_id":"978-0-596-15589-6",
    "title":"CouchDB: The Definitive Guide",
    "subtitle":"Time to Relax",
    "authors":[
       "J. Chris Anderson",
       "Jan Lehnardt",
       "Noah Slater"
    ],
    "publisher":"O'Reilly Media",
    "released":"2010-01-19",
    "pages":272
}
One-To-One Mapping
Map Book Titles
function(doc) { // JSON object representing a doc to be mapped
  if (doc.title) { // make sure this doc has a title
     emit(doc.title); // emit the doc’s title as the key
  }
}
The emit function accepts two
arguments. The first is a key and the
second a value. Both are optional and
default to null.
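To make the defaults concrete, here is a minimal stand-in for `emit` that can be run outside CouchDB. The `rows` array, the `currentId` variable, and the untitled second document are assumptions made for the sketch; inside CouchDB the view server provides `emit` and records the document ID itself.

```javascript
const rows = [];
let currentId = null;

// Stand-in for CouchDB's emit(): omitted arguments become null.
function emit(key, value) {
  rows.push({
    key: key === undefined ? null : key,
    id: currentId,
    value: value === undefined ? null : value
  });
}

// The "Map Book Titles" function from the slides, unchanged in behavior.
const mapTitles = function (doc) {
  if (doc.title) {
    emit(doc.title);
  }
};

const docs = [
  { _id: "978-0-596-15589-6", title: "CouchDB: The Definitive Guide" },
  { _id: "untitled-example" } // hypothetical document without a title
];

docs.forEach(function (doc) {
  currentId = doc._id;
  mapTitles(doc);
});

console.log(JSON.stringify(rows));
```

Only the document with a title produces a row, and because no value was passed to `emit`, the row's value is `null`.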
“Titles” Temporary View

key                               id                    value
"CouchDB: The Definitive Guide"   "978-0-596-15589-6"   null
Add a Second Document
{
    "_id":"978-0-596-52926-0",
    "title":"RESTful Web Services",
    "subtitle":"Web services for the real world",
    "authors":[
       "Leonard Richardson",
       "Sam Ruby"
    ],
    "publisher":"O'Reilly Media",
    "released":"2007-05-08",
    "pages":448
}
“Titles” Temporary View

key                               id                    value
"CouchDB: The Definitive Guide"   "978-0-596-15589-6"   null
"RESTful Web Services"            "978-0-596-52926-0"   null
One-To-Many Mapping
Add a “formats” Field to
Both Documents
{
    // …
    "formats":[
      "Print",
      "Ebook",
      "Safari Books Online"
    ]
}
Add a Third Document
{
    "_id":"978-1-565-92580-9",
    "title":"DocBook: The Definitive Guide",
    "authors":[
       "Norman Walsh",
       "Leonard Muellner"
    ],
    "publisher":"O'Reilly Media",
    "formats":[
       "Print"
    ],
    "released":"1999-10-28",
    "pages":648
}
Map Book Formats
function(doc) { // JSON object representing a doc to be mapped
  if (doc.formats) { // make sure this doc has a formats field
     for (var i in doc.formats) {
       emit(doc.formats[i]); // emit each format as the key
     }
  }
}
“Formats” Temporary View
        key                   id            value
      "Ebook"        "978-0-596-15589-6"    null
      "Ebook"        "978-0-596-52926-0"    null
      "Print"        "978-0-596-15589-6"    null
      "Print"        "978-0-596-52926-0"    null
      "Print"        "978-1-565-92580-9"    null
"Safari Books Online" "978-0-596-15589-6"   null
"Safari Books Online" "978-0-596-52926-0"   null
When querying a view, the “key” and
“id” fields can be used to select a row or
range of rows. Optionally, rows can be
grouped by “key” fields or by levels of
“key” fields.
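Selecting by key or by key range can be sketched over the rows of the “Formats” view. The helpers `byKey` and `byRange` are hypothetical names for this example; they loosely mirror CouchDB's `key`, `startkey`, and `endkey` query parameters, and they use plain JavaScript string comparison, which is an assumption that happens to agree with the view's order for these particular keys.

```javascript
// Rows of the "Formats" view, already sorted by key.
const formatRows = [
  { key: "Ebook", id: "978-0-596-15589-6", value: null },
  { key: "Ebook", id: "978-0-596-52926-0", value: null },
  { key: "Print", id: "978-0-596-15589-6", value: null },
  { key: "Print", id: "978-0-596-52926-0", value: null },
  { key: "Print", id: "978-1-565-92580-9", value: null },
  { key: "Safari Books Online", id: "978-0-596-15589-6", value: null },
  { key: "Safari Books Online", id: "978-0-596-52926-0", value: null }
];

// Rows whose key matches exactly.
function byKey(rows, key) {
  return rows.filter(function (row) { return row.key === key; });
}

// Rows whose key falls in the inclusive range [startkey, endkey].
function byRange(rows, startkey, endkey) {
  return rows.filter(function (row) {
    return row.key >= startkey && row.key <= endkey;
  });
}

const printRows = byKey(formatRows, "Print");
const ebookThroughPrint = byRange(formatRows, "Ebook", "Print");
console.log(printRows.length, ebookThroughPrint.length);
```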
Map Book Authors
function(doc) { // JSON object representing a doc to be mapped
  if (doc.authors) { // make sure this doc has an authors field
     for (var i in doc.authors) {
       emit(doc.authors[i]); // emit each author as the key
     }
  }
}
“Authors” Temporary View
        key                    id            value
 "J. Chris Anderson"   "978-0-596-15589-6"   null
   "Jan Lehnardt"      "978-0-596-15589-6"   null
 "Leonard Muellner"    "978-1-565-92580-9"   null
"Leonard Richardson" "978-0-596-52926-0"     null
   "Noah Slater"       "978-0-596-15589-6"   null
  "Norman Walsh"       "978-1-565-92580-9"   null
    "Sam Ruby"         "978-0-596-52926-0"   null
Reduce
Built-in Reduce Functions
Function   Output

_count     Returns the number of mapped values in the set

_sum       Returns the sum of the set of mapped values

_stats     Returns numerical statistics of the mapped values in
           the set including the sum, count, min, and max
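A rough sketch of what `_count` produces with and without grouping, using the seven keys of the “Formats” view. The `countRows` helper is hypothetical and only imitates the shape of the result; it is not how CouchDB computes it internally.

```javascript
// Keys emitted by the "Map Book Formats" function for the first three books.
const formatKeys = [
  "Ebook", "Ebook",
  "Print", "Print", "Print",
  "Safari Books Online", "Safari Books Online"
];

// Count rows either as one total (not grouped) or per distinct key (grouped).
function countRows(keys, group) {
  if (!group) {
    return [{ key: null, value: keys.length }];
  }
  const counts = new Map();
  keys.forEach(function (key) {
    counts.set(key, (counts.get(key) || 0) + 1);
  });
  return Array.from(counts, function (entry) {
    return { key: entry[0], value: entry[1] };
  });
}

const notGrouped = countRows(formatKeys, false);
const grouped = countRows(formatKeys, true);
console.log(JSON.stringify(notGrouped));
console.log(JSON.stringify(grouped));
```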
Count
Format Count, Not Grouped


    key           value
    null           7
Format Count, Grouped

         key             value
       "Ebook"            2
        "Print"           3
 "Safari Books Online"    2
Sum
Sum of Pages by Format

         key             value
       "Ebook"           720
        "Print"          1368
 "Safari Books Online"   720
Stats

sum, count, minimum, maximum,
sum of squares
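The five statistics can be sketched as a plain function. The name `stats` is a stand-in for the example, not the built-in itself; the page counts 272, 448, and 648 come from the three book documents added earlier.

```javascript
// Compute the same five figures that _stats reports for a set of numbers.
function stats(values) {
  return {
    sum: values.reduce(function (total, v) { return total + v; }, 0),
    count: values.length,
    min: Math.min.apply(null, values),
    max: Math.max.apply(null, values),
    // "sumsqr" is the sum of each value squared.
    sumsqr: values.reduce(function (total, v) { return total + v * v; }, 0)
  };
}

const ebookStats = stats([272, 448]);      // the two books available as Ebook
const printStats = stats([272, 448, 648]); // the three books available in Print
console.log(JSON.stringify(ebookStats));
console.log(JSON.stringify(printStats));
```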
Stats of Pages by Format

key                     value
"Ebook"                 {"sum":720,"count":2,"min":272,"max":448,"sumsqr":274688}
"Print"                 {"sum":1368,"count":3,"min":272,"max":648,"sumsqr":694592}
"Safari Books Online"   {"sum":720,"count":2,"min":272,"max":448,"sumsqr":274688}
Custom Reduce Functions
The built-in Reduce functions should
serve your needs most, if not all, of the
time. If you find yourself writing a
custom Reduce function, then take a
step back and make sure that one of
the built-in Reduce functions won't
serve your needs better.
Reduce Function Skeleton
function(keys, values, rereduce) {
}
Parameters
Keys: An array of mapped key and document IDs in the form of
[key,id] where id is the document ID.

Values: An array of mapped values.

Rereduce: Whether or not the Reduce function is being called
recursively on its own output.
Count Equivalent
function(keys, values, rereduce) {
  if (rereduce) {
     return sum(values);
  } else {
     return values.length;
  }
}
Sum Equivalent
function(keys, values, rereduce) {
  return sum(values);
}
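Why the count function needs the `rereduce` branch can be seen by driving it by hand. In this sketch the `sum` helper is a stand-in for the one CouchDB's JavaScript view server provides, and the split of the seven “Formats” rows into two chunks is arbitrary; how CouchDB actually partitions rows is not shown in the slides.

```javascript
// Stand-in for the sum() helper available to reduce functions in CouchDB.
function sum(values) {
  return values.reduce(function (total, v) { return total + v; }, 0);
}

// The "Count Equivalent" reduce function from the slides.
const count = function (keys, values, rereduce) {
  if (rereduce) {
    return sum(values);
  } else {
    return values.length;
  }
};

// First pass: reduce two chunks of mapped rows ([key, id] pairs with null values).
const chunkA = count(
  [["Ebook", "978-0-596-15589-6"], ["Ebook", "978-0-596-52926-0"], ["Print", "978-0-596-15589-6"]],
  [null, null, null],
  false
);
const chunkB = count(
  [["Print", "978-0-596-52926-0"], ["Print", "978-1-565-92580-9"],
   ["Safari Books Online", "978-0-596-15589-6"], ["Safari Books Online", "978-0-596-52926-0"]],
  [null, null, null, null],
  false
);

// Second pass: the function runs again on its own outputs.
const total = count(null, [chunkA, chunkB], true);
console.log(chunkA, chunkB, total);
```

Without the `rereduce` branch the second pass would return 2 (the number of partial results) instead of 7. The sum function needs no such branch because summing partial sums gives the same answer as summing the original values.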
MapReduce Limitations
Full-text indexing and ad hoc searching
  • couchdb-lucene
     https://github.com/rnewson/couchdb-lucene
  • ElasticSearch and CouchDB
   https://github.com/elasticsearch/elasticsearch/wiki/Couchdb-integration

Geospatial indexing and search (two dimensional)
  • GeoCouch
    https://github.com/vmx/couchdb
  • Geohash (e.g. dr5rusx1qkvvr)
    http://geohash.org/
Querying Views
You can query for all rows, a single
contiguous range of rows, or even rows
matching a specified key.
Add a Fourth Document
{
    "_id":"978-0-596-80579-1",
    "title":"Building iPhone Apps with HTML, CSS, and JavaScript",
    "subtitle":"Making App Store Apps Without Objective-C or Cocoa",
    "authors":[
       "Jonathan Stark"
    ],
    "publisher":"O'Reilly Media",
    "formats":[
       "Print",
       "Ebook",
       "Safari Books Online"
    ],
    "released":"2010-01-08",
    "pages":192
}
Map Book Releases
function(doc) {
  if (doc.released) {
     emit(doc.released.split("-"), doc.pages);
  }
}
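Because this map function emits an array key of year, month, and day, rows can be grouped by a prefix of that key. The sketch below imitates that grouping in plain JavaScript; `sumByLevel` is a hypothetical helper, and for brevity it sums pages rather than computing full statistics.

```javascript
// Rows the "Map Book Releases" function emits for the four book documents.
const releaseRows = [
  { key: ["1999", "10", "28"], value: 648 },
  { key: ["2007", "05", "08"], value: 448 },
  { key: ["2010", "01", "08"], value: 192 },
  { key: ["2010", "01", "19"], value: 272 }
];

// Group rows by the first `level` elements of the key and sum their values.
function sumByLevel(rows, level) {
  const groups = new Map();
  rows.forEach(function (row) {
    const prefix = row.key.slice(0, level);
    const id = JSON.stringify(prefix); // arrays need a string form to act as Map keys
    const group = groups.get(id) || { key: prefix, value: 0 };
    group.value += row.value;
    groups.set(id, group);
  });
  return Array.from(groups.values());
}

const byYear = sumByLevel(releaseRows, 1);
const byMonth = sumByLevel(releaseRows, 2);
const byDay = sumByLevel(releaseRows, 3);
console.log(JSON.stringify(byYear));
```

At level 1 the two 2010 books collapse into a single row with 464 pages; at level 3 every row keeps its own exact key.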
Save the “Releases” View
“Releases”, Exact Grouping

key                  value
["1999","10","28"]   {"sum":648,"count":1,"min":648,"max":648,"sumsqr":419904}
["2007","05","08"]   {"sum":448,"count":1,"min":448,"max":448,"sumsqr":200704}
["2010","01","08"]   {"sum":192,"count":1,"min":192,"max":192,"sumsqr":36864}
["2010","01","19"]   {"sum":272,"count":1,"min":272,"max":272,"sumsqr":73984}
“Releases”, Level 1 Grouping

key        value
["1999"]   {"sum":648,"count":1,"min":648,"max":648,"sumsqr":419904}
["2007"]   {"sum":448,"count":1,"min":448,"max":448,"sumsqr":200704}
["2010"]   {"sum":464,"count":2,"min":192,"max":272,"sumsqr":110848}
“Releases”, Level 2 Grouping

key             value
["1999","10"]   {"sum":648,"count":1,"min":648,"max":648,"sumsqr":419904}
["2007","05"]   {"sum":448,"count":1,"min":448,"max":448,"sumsqr":200704}
["2010","01"]   {"sum":464,"count":2,"min":192,"max":272,"sumsqr":110848}
PHP Libraries for CouchDB
PHPCouch
http://www.phpcouch.org/

PHPillow
http://arbitracker.org/phpillow.html

PHP CouchDB Extension (PECL)
http://www.topdog.za.net/php_couchdb_extension

Doctrine2 ODM
Do-It-Yourself
Zend_Http_Client + Zend_Cache

PEAR’s HTTP_Client

PHP cURL

JSON maps nicely to and from PHP arrays using Zend_Json or
PHP’s json_ functions
Scaling
Load Balancing
Send POST, PUT, and DELETE requests to a write-only master node

Set up continuous replication from the master node to multiple
read-only nodes

Load balance GET, HEAD, and OPTIONS requests amongst
read-only nodes
  • Apache HTTP Server (mod_proxy)
  • nginx
  • HAProxy
  • Varnish
  • Squid
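The routing rule above can be sketched as a small function. In practice one of the listed proxies would do this work; `createRouter`, the node names, and the round-robin choice among read-only nodes are all assumptions made for illustration.

```javascript
// Route writes to the master and spread reads across read-only nodes.
function createRouter(master, replicas) {
  let next = 0;
  return function route(method) {
    const verb = method.toUpperCase();
    if (verb === "GET" || verb === "HEAD" || verb === "OPTIONS") {
      // Round-robin is one simple balancing policy among many.
      const node = replicas[next % replicas.length];
      next += 1;
      return node;
    }
    // POST, PUT, DELETE (and anything else) go to the master.
    return master;
  };
}

// Hypothetical node names.
const route = createRouter("master", ["replica-1", "replica-2"]);
const picks = [route("GET"), route("HEAD"), route("GET"), route("PUT"), route("delete")];
console.log(picks.join(", "));
```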
Clustering
(Partitioning/Sharding)
Lounge
http://tilgovi.github.com/couchdb-lounge/
  • Proxy, partitioning, and sharding
BigCouch
https://github.com/cloudant/bigcouch
  • Clusters modeled after Amazon’s Dynamo approach
Pillow
https://github.com/khellan/Pillow
   • “…a combined router and rereducer for CouchDB.”
What else?
Authentication
Basic Authentication

Cookie Authentication

OAuth
Authorization
Server Admin

Database Reader

Document Update Validation
http://wiki.apache.org/couchdb/Document_Update_Validation
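A document update validation function is a JavaScript function stored in a design document; it receives the new document, the old document, and the user context, and rejects a write by throwing. The sketch below is a minimal example under an invented rule (every document must have a title); the rule and the variable name `validateDocUpdate` are assumptions, not something the slides prescribe.

```javascript
// Would be stored as the "validate_doc_update" field of a design document.
const validateDocUpdate = function (newDoc, oldDoc, userCtx) {
  // Let deletions through without further checks.
  if (newDoc._deleted) {
    return;
  }
  // Hypothetical rule for this sketch: every document needs a title.
  if (!newDoc.title) {
    throw({ forbidden: "Documents must have a title." });
  }
};

// Exercise the function outside CouchDB.
let rejection = null;
try {
  validateDocUpdate({ _id: "no-title" }, null, { name: null, roles: [] });
} catch (e) {
  rejection = e;
}
console.log(JSON.stringify(rejection));
```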
Hypermedia Controls
Show Functions

List Functions

See:
http://wiki.apache.org/couchdb/Formatting_with_Show_and_List
Replication
Peer-based & bi-directional

Documents and modified fields are incrementally replicated.

All data will be eventually consistent.

Conflicts are flagged and can be handled by application logic.

Partial replicas can be created via JavaScript filter functions.
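A filter function takes a document and the request and returns true for documents that should be replicated. The sketch below keeps only books available in print; that criterion, and the name `printOnly`, are made up for the example, and the function would live in the `filters` section of a design document rather than in a standalone script.

```javascript
// Replicate only documents that list "Print" among their formats.
const printOnly = function (doc, req) {
  return !!(doc.formats && doc.formats.indexOf("Print") !== -1);
};

// Try it against two documents shaped like the ones in the slides.
const docBook = { _id: "978-1-565-92580-9", formats: ["Print"] };
const noFormats = { _id: "42621b2516001626" };
console.log(printOnly(docBook, {}), printOnly(noFormats, {}));
```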
Hosting
CouchOne (now CouchBase)
http://www.couchone.com/

Cloudant
https://cloudant.com/
Distributing
CouchApp
http://couchapp.org/
  • Applications built using JavaScript, HTML5, CSS and CouchDB
Mobile
 • Android
   http://www.couchone.com/android
CouchDB Resources
CouchDB: The Definitive Guide
by J. Chris Anderson, Jan Lehnardt, and Noah Slater (O’Reilly)
978-0-596-15589-6

CouchDB Wiki
http://wiki.apache.org/couchdb/

Beginning CouchDB
by Joe Lennon (Apress)
978-1-430-27237-3

Writing and Querying MapReduce
Views in CouchDB
by Bradley Holt (O’Reilly)
978-1-449-30312-9

Scaling CouchDB
by Bradley Holt (O’Reilly)
063-6-920-01840-7
Questions?
Thank You
                Bradley Holt (http://bradley-holt.com/)
             @BradleyHolt (http://twitter.com/BradleyHolt)




Copyright © 2011 Bradley Holt. All rights reserved.

CouchDB at New York PHP
