Hybrid MongoDB
  Applications
  with Relational Databases
Today’s Agenda
•Who I am
•Why MongoDB w/intro
•Why Hybrid
•Hybrid Case Studies
•How OpenSky implemented Hybrid
 MySQL / MongoDB
My name is
Steve Francia

     @spf13
•15+ years building the
  internet (13 years using SQL)

•Father, husband,
  skateboarder

•Chief Solutions Architect @
  10gen responsible for
  drivers, integrations, web &
  docs
• Company behind MongoDB
 • AGPL license, own copyrights, engineering team
 • support, consulting, commercial license revenue
• Management
 • Google/DoubleClick, Oracle, Apple, NetApp
 • Funding: Sequoia, Union Square, Flybridge
 • Offices in NYC, Palo Alto, London & Dublin
 • 90+ employees
Before 10gen I
worked
    for OpenSky

     http://opensky.com
OpenSky was the first
e-commerce site built
    on MongoDB
Why MongoDB
Why MongoDB
                My Top 10 Reasons

 10. Great developer experience
  9. Speaks your language
  8. Scale horizontally
  7. Fully consistent data w/atomic operations
  6. Memory caching integrated
  5. Open source
  4. Flexible, rich & structured data format, not just K:V
  3. Ludicrously fast (without going plaid)
  2. Simplify infrastructure & application
  1. It’s web scale
MongoDB is
• Document Oriented
    { author: "steve",
      date: new Date(),
      text: "About MongoDB...",
      tags: ["tech", "database"] }
• High Performance
• Fully Consistent
• Horizontally Scalable
(all four surround the Application at the center of the diagram)
Under the hood

• Written in C++
• Runs on nearly anything
• Data serialized to BSON
• Extensive use of memory-mapped files
  i.e. read-through write-through memory
  caching.
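Database Landscape
[Figure: chart with Scalability & Performance on the vertical axis and Depth of Functionality on the horizontal axis. MemCache is plotted highest on scalability & performance with the least functionality, RDBMS lowest with the most functionality, and MongoDB between the two.]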
This has led
    some to say

“MongoDB has the best
features of key/value
stores, document databases
and relational databases in
one.”
               John Nunemaker
Why Hybrid?
Reasons to build a
  hybrid application
•Friction in existing application caused
  by RDBMS
•Transitioning an existing application to
  MongoDB
•Using the right tool for the right job
•Need some features not present in
  MongoDB
Reasons Not to build
a hybrid application
•Aggregation (at least not very soon)
•Lack of clear understanding of needs
•Backups
•MongoDB as cache in front of SQL
•Loads more...
Hybrid
Applications...
  but I don’t
   want to
 complicate
    things
Most
  RDBMS
applications
are already
  hybrid
Typical RDBMS
  Application

        Memcache


 App


         RDBMS
Typical Hybrid
RDBMS Application

          MongoDB


   App


           RDBMS
Most of the same
     rules apply

•Application partitions data between
  two (or more) systems.
•Model layer tracks what content
  resides where.
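As a rough illustration of a model layer that tracks what resides where, the Python sketch below keeps a single routing table from model name to owning system. The table contents, the store names and the `store_for` helper are all hypothetical; they only echo the kind of split described later in the deck.

```python
# Hypothetical routing table: which system owns which model class.
STORE_FOR_MODEL = {
    "Order": "rdbms",
    "Inventory": "rdbms",
    "Product": "mongodb",
    "User": "mongodb",
    "Cart": "mongodb",
}


def store_for(model_name: str) -> str:
    """Return the name of the system that owns this model's data."""
    try:
        return STORE_FOR_MODEL[model_name]
    except KeyError:
        # Fail loudly: every model must be assigned to exactly one store.
        raise LookupError(f"no store registered for model {model_name!r}") from None
```

Keeping the mapping in one place means application code never has to guess where a record lives.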
Hybrid is easier than
RDBMS + MemCache
• Always know where to find a piece of data.
• Data never needs expiring.
• Data not duplicated (for the most part)
  across systems.
• Always handle a record the same way.
• Developer freedom to choose the right tool
  for the right reasons.
Typical RDBMS
retrieval operation
 1. App asks Memcache: does the object exist & is it up to date?
    If yes... then done.
 2. If no, App queries the RDBMS for it and retrieves the record(s).
 3. App replaces the object in Memcache.
 4. Repeat.
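The retrieval flow above is the familiar cache-aside read. A minimal Python sketch follows, with plain dictionaries standing in for Memcache and the RDBMS; the class and attribute names are made up for illustration, and the "up to date?" check is not modeled.

```python
class CacheAsideReader:
    """Check the cache, fall back to the database, then refill the cache."""

    def __init__(self, cache: dict, database: dict):
        self.cache = cache        # stand-in for Memcache
        self.database = database  # stand-in for the RDBMS
        self.db_queries = 0       # counts how often we had to hit the database

    def get(self, key):
        if key in self.cache:     # exists in cache? then done
            return self.cache[key]
        self.db_queries += 1      # if no, query the DB for it
        record = self.database[key]
        self.cache[key] = record  # replace in cache
        return record
```

Every read carries this branching, which is the complexity the hybrid retrieval path avoids.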
Typical Hybrid
Retrieval Operation
  App → MongoDB: find, return
    OR
  App → RDBMS: query, return
Typical RDBMS
   write operation
 1. App inserts or updates the row in the RDBMS; the RDBMS confirms it was written.
 2. App assembles the row into object(s).
 3. App writes each object to Memcache: write object, write object,
    write object, write object...
            This goes on for a while, doesn’t it?

 One row can be in many objects, so there’s
a lot of complication in updating everything.
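To make the fan-out concrete, here is a small Python sketch in which one row write forces a rewrite of every cached object that embeds that row. Dictionaries stand in for the RDBMS and Memcache, and the `objects_using_row` dependency map is a hypothetical bookkeeping structure the application would have to maintain itself.

```python
def write_row(database, cache, objects_using_row, row_id, row):
    """Write one row, then rebuild every cached object that embeds it."""
    database[row_id] = row                           # insert or update row
    rewritten = []
    for object_key in objects_using_row.get(row_id, []):
        cached = dict(cache.get(object_key, {}))     # copy the cached object
        cached[row_id] = row                         # assemble into object
        cache[object_key] = cached                   # write object
        rewritten.append(object_key)
    return rewritten                                 # every cache entry we had to touch
```

The returned list grows with the number of objects that share the row, which is the cost the slide is pointing at.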
Typical Hybrid
Write Operation
      save document
           return        MongoDB


App        OR
      insert or update
           return        RDBMS
Hybrid Use Cases
Archiving
Why Hybrid:
• Existing application built on MySQL
• Lots of friction with RDBMS based archive storage
• Needed more scalable archive storage backend
Solution:
• Keep MySQL for active data (100mil), MongoDB for archive (2+
  bil)
Results:
• No more alter table statements taking over 2 months to run
• Sharding fixed vertical scale problem
• Very happily looking at other places to use MongoDB
Reporting
Why Hybrid:
• Most of the functionality written in MongoDB
• Reporting team doesn’t want to learn MongoDB


Solution:
• Use MongoDB for active database, replicate to MySQL for
  reporting

Results:
• Developers happy
• Business Analysts happy
E-commerce
Why Hybrid:
• Multi-vertical product catalogue impossible to model in RDBMS
• Needed transaction support RDBMS provides

Solution:
• MySQL for orders, MongoDB for everything else

Results:
•   Massive simplification of code base
•   Rapidly build, halving time to market (and cost)
•   Eliminated need for external caching system
•   50x+ improvement over MySQL alone
How OpenSky
      implemented a
      hybrid MongoDB /
      MySQL solution
       http://opensky.com
Doctrine (ORM/ODM)
   makes it easy
Data to store in SQL

•Order
•Order/Shipment
•Order/Transaction
•Inventory
Data to store in
        MongoDB
• User
• Product
• Product/Sellable
• Address
• Cart
• CreditCard
• Event
• TaxRate
• ... and then I got tired of typing them in
• Just imagine this list has 40 more classes
• ...
The most boring SQL
    schema ever
CREATE TABLE `product_inventory` (
   `product_id` char(32) NOT NULL,
   `inventory` int(11) NOT NULL DEFAULT '0',
   PRIMARY KEY (`product_id`)
);

CREATE TABLE `sellable_inventory` (
   `sellable_id` char(32) NOT NULL,
   `inventory` int(11) NOT NULL DEFAULT '0',
   PRIMARY KEY (`sellable_id`)
);

CREATE TABLE `orders` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `userId` char(32) NOT NULL,
  `shippingName` varchar(255) DEFAULT NULL,
  `shippingAddress1` varchar(255) DEFAULT NULL,
  `shippingAddress2` varchar(255) DEFAULT NULL,
  `shippingCity` varchar(255) DEFAULT NULL,
  `shippingState` varchar(2) DEFAULT NULL,
  `shippingZip` varchar(255) DEFAULT NULL,
  `billingName` varchar(255) DEFAULT NULL,
  `billingAddress1` varchar(255) DEFAULT NULL,
  `billingAddress2` varchar(255) DEFAULT NULL,
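  `billingCity` varchar(255) DEFAULT NULL,
  -- ... the remaining columns are cut off on the slide
  PRIMARY KEY (`id`) -- assumed: MySQL requires a key on an AUTO_INCREMENT column
);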
Did you notice?
Inventory is in SQL,
but it’s also a property in your Mongo collections.
Inventory is
         transient
• Product::$inventory is effectively a
  transient property
•Note how I said “effectively”? ... we
  cheat and persist our transient
  property to MongoDB as well
•We can do this because we never really
  trust the value stored in Mongo
Accuracy is only important
 when there’s contention
•For display, sorting and alerts, we can
  use the value stashed in MongoDB
 •It’s faster
 •It’s accurate enough
•For financial transactions, we want the
  multi table transactions from our
  RDBMS.
Inventory kept in
 sync with listeners
•Every time a new product is created,
  its inventory is inserted in SQL
•Every time an order is placed,
  inventory is verified and decremented
•Whenever the SQL inventory changes,
  it is saved to MongoDB as well
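The listener idea can be sketched in a few lines of Python. This is not OpenSky's Doctrine code: `InventoryStore` is a hypothetical in-memory stand-in for the SQL inventory table, and `copy_to_mongo` stands in for the listener that saves the changed value to MongoDB.

```python
class InventoryStore:
    """Stand-in for the SQL inventory table; notifies listeners on every change."""

    def __init__(self):
        self._rows = {}
        self._listeners = []

    def add_listener(self, listener):
        self._listeners.append(listener)

    def get(self, product_id):
        return self._rows[product_id]

    def set(self, product_id, quantity):
        self._rows[product_id] = quantity
        for listener in self._listeners:   # whenever SQL inventory changes...
            listener(product_id, quantity)

    def decrement(self, product_id, amount=1):
        current = self._rows[product_id]
        if current < amount:               # inventory is verified...
            raise ValueError("insufficient inventory")
        self.set(product_id, current - amount)  # ...and decremented


mongo_products = {}  # stand-in for the MongoDB products collection


def copy_to_mongo(product_id, quantity):
    """Listener: stash the latest SQL value on the product document."""
    mongo_products.setdefault(product_id, {})["inventory"] = quantity
```

Registering `copy_to_mongo` with `add_listener` keeps the stashed copy fresh for display while SQL stays the source of truth.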
Be careful what you
         lock
1. Acquire inventory row lock and begin
   transaction
2. Check current product inventory
3. Decrement product inventory
4. Write the Order to SQL
5. Update affected MongoDB documents
6. Commit the transaction
7. Release product inventory lock
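The seven steps can be walked through with Python's built-in `sqlite3` module. This is a sketch under stated assumptions, not the OpenSky implementation: SQLite has no row locks, so `BEGIN IMMEDIATE` (a database-wide write lock) stands in for the inventory row lock that MySQL would typically take with `SELECT ... FOR UPDATE`; the two-column `orders` table, the `place_order` and `make_demo_db` helpers, and the dictionary standing in for MongoDB are all hypothetical.

```python
import sqlite3


def place_order(conn, mongo_products, product_id, user_id):
    """Follow the slide's sequence; returns the remaining inventory."""
    conn.execute("BEGIN IMMEDIATE")  # 1. take the lock and begin the transaction
    try:
        row = conn.execute(
            "SELECT inventory FROM product_inventory WHERE product_id = ?",
            (product_id,),
        ).fetchone()                 # 2. check current product inventory
        if row is None or row[0] < 1:
            raise ValueError("out of stock")
        remaining = row[0] - 1
        conn.execute(
            "UPDATE product_inventory SET inventory = ? WHERE product_id = ?",
            (remaining, product_id),
        )                            # 3. decrement product inventory
        conn.execute(
            "INSERT INTO orders (userId, productId) VALUES (?, ?)",
            (user_id, product_id),
        )                            # 4. write the order to SQL
        # 5. update the affected MongoDB document (a dict stands in here)
        mongo_products.setdefault(product_id, {})["inventory"] = remaining
        conn.execute("COMMIT")       # 6. commit; 7. the lock is released
        return remaining
    except Exception:
        conn.execute("ROLLBACK")     # undo the SQL work and release the lock
        raise


def make_demo_db():
    # isolation_level=None puts sqlite3 in autocommit mode so we control transactions
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE product_inventory ("
        "product_id TEXT PRIMARY KEY, inventory INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, userId TEXT NOT NULL, productId TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO product_inventory VALUES ('p1', 1)")
    return conn
```

Note that the MongoDB copy is written before the commit, so a failed commit can leave it stale; that is tolerable only because the value stored in Mongo is never trusted for transactions.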
Making MongoDB
and RDBMS relations
     play nice
Products are
documents stored
  in MongoDB
/** @mongodb:Document(collection="products") */
class Product
{
    /** @mongodb:Id */
    private $id;

    /** @mongodb:String */
    private $title;

    public function getId()
    {
        return $this->id;
    }

    public function getTitle()
    {
        return $this->title;
    }

    public function setTitle($title)
    {
        $this->title = $title;
    }
}
Orders are entities
stored in an RDBMS
/**
 * @orm:Entity
 * @orm:Table(name="orders")
 * @orm:HasLifecycleCallbacks
 */
class Order
{
    /**
     * @orm:Id @orm:Column(type="integer")
     * @orm:GeneratedValue(strategy="AUTO")
     */
    private $id;

    /**
     * Mapped column: the MongoDB product id stored in the orders table.
     *
     * @orm:Column(type="string")
     */
    private $productId;

    /**
     * Not mapped: holds the Product document instance at runtime.
     *
     * @var Documents\Product
     */
    private $product;

    // ...
}
So how does an
    RDBMS have a
reference to something
 outside the database?
Setting the Product
class Order {

    // ...

    public function setProduct(Product $product)
    {
        $this->productId = $product->getId();
        $this->product = $product;
    }
}
• $productId is mapped and persisted
• $product which stores the Product
  instance is not a persistent entity
  property
Retrieving our
product later
OrderPostLoadListener
use Doctrine\ODM\MongoDB\DocumentManager;
use Doctrine\ORM\Event\LifecycleEventArgs;

class OrderPostLoadListener
{
    /** @var DocumentManager */
    private $dm;

    // The slide omits how $this->dm is set; constructor injection is assumed here
    public function __construct(DocumentManager $dm)
    {
        $this->dm = $dm;
    }

    public function postLoad(LifecycleEventArgs $eventArgs)
    {
        // get the order entity
        $order = $eventArgs->getEntity();

        // get an ODM reference (uninitialized proxy) for order.product_id
        $productId = $order->getProductId();
        $product = $this->dm->getReference('MyBundle:Document\Product', $productId);

        // set the unmapped product property on the order via reflection
        $em = $eventArgs->getEntityManager();
        $productReflProp = $em->getClassMetadata('MyBundle:Entity\Order')
            ->reflClass->getProperty('product');
        $productReflProp->setAccessible(true);
        $productReflProp->setValue($order, $product);
    }
}
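The same pattern, stripped of Doctrine, can be sketched in Python: the order row persists only the product id, and a post-load hook attaches a lazy reference that queries the document store on first field access. `LazyDocument`, `post_load` and the `loader` callback are hypothetical names for illustration; they are not Doctrine APIs.

```python
class LazyDocument:
    """Uninitialized proxy: holds only an id until a field is first read."""

    def __init__(self, loader, doc_id):
        self._loader = loader
        self._id = doc_id
        self._data = None

    def get_id(self):
        return self._id                # known up front, no query needed

    def get(self, field):
        if self._data is None:         # first field access initializes the proxy
            self._data = self._loader(self._id)
        return self._data[field]


class Order:
    def __init__(self, product_id):
        self.product_id = product_id   # persisted with the SQL row
        self.product = None            # not persisted; filled in after load


def post_load(order, loader):
    """Mimic the postLoad listener: attach a lazy product reference."""
    order.product = LazyDocument(loader, order.product_id)
```

Loading an order therefore costs no extra query to the document store unless the code actually reads a product field.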
All Together Now
// Create a new product and order
$product = new Product();
$product->setTitle('Test Product');
$dm->persist($product);
$dm->flush();

$order = new Order();
$order->setProduct($product);
$em->persist($order);
$em->flush();

// Find the order later
$order = $em->find('Order', $order->getId());

// Instance of an uninitialized product proxy
$product = $order->getProduct();

// Initializes proxy and queries the MongoDB database
echo "Order Title: " . $product->getTitle();
print_r($order);
Read more about
      this technique
Jon Wage, one of OpenSky’s engineers, first
wrote about this technique on his personal
blog: http://jwage.com

You can read the full article here:
http://jwage.com/2010/08/25/blending-the-doctrine-orm-and-mongodb-odm/
http://spf13.com
                                http://github.com/spf13
                                @spf13




           Questions?
        download at mongodb.org
PS: We’re hiring!! Contact us at jobs@10gen.com
Hybrid MongoDB and RDBMS Applications

  • 1. Hybrid MongoDB Applications with Relational Databases
  • 2. Today’s Agenda •Who I am •Why MongoDB w/intro •Why Hybrid •Hybrid Case Studies •How OpenSky implemented Hybrid MySQL / MongoDB
  • 3. My name is Steve Francia @spf13
  • 4. •15+ years building the internet (13 years using SQL) •Father, husband, skateboarder •Chief Solutions Architect @ 10gen responsible for drivers, integrations, web & docs
  • 5. • Company behind MongoDB • AGPL license, own copyrights, engineering team • support, consulting, commercial license revenue • Management • Google/DoubleClick, Oracle, Apple, NetApp • Funding: Sequoia, Union Square, Flybridge • Offices in NYC, Palo Alto, London & Dublin • 90+ employees
  • 6. Before 10gen I worked for http://opensky.com
  • 7. OpenSky was the first e-commerce site built on MongoDB
  • 9. Why MongoDB My Top 10 Reasons 10. Great developer experience 9. Speaks your language 8. Scale horizontally 7. Fully consistent data w/atomic operations 1.It’ssource scale web 6. Memory Caching integrated 5. Open 4. Flexible, rich & structured data format not just K:V 3. Ludicrously fast (without going plaid) 2. Simplify infrastructure & application
  • 10. Why MongoDB My Top 10 Reasons 10. Great developer experience 9. Speaks your language 8. Scale horizontally 7. Fully consistent data w/atomic operations 1.It’ssource scale web 6. Memory Caching integrated 5. Open 4. Flexible, rich & structured data format not just K:V 3. Ludicrously fast (without going plaid) 2. Simplify infrastructure & application
  • 11. MongoDB is Application Document Oriented { author: “steve”, High date: new Date(), text: “About MongoDB...”, Performance tags: [“tech”, “database”]} Fully Consistent Horizontally Scalable
  • 12. Under the hood • Written in C++ • Runs on nearly anything • Data serialized to BSON • Extensive use of memory-mapped files i.e. read-through write-through memory caching.
  • 13. Database Landscape MemCache Scalability & Performance MongoDB RDBMS Depth of Functionality
  • 14. This has led some to say “ MongoDB has the best features of key/values stores, document databases and relational databases in one. John Nunemaker
  • 16. Reasons to build a hybrid application •Friction in existing application caused by RDBMS •Transitioning an existing application to MongoDB •Using the right tool for the right job •Need some features not present in MongoDB
  • 17. Reasons Not to build a hybrid application •Aggregation (at least not very soon) •Lack of clear understanding of needs •Backups •MongoDB as cache in front of SQL •Loads more...
  • 18. Hybrid Applications... but I don’t want to complicate things
  • 19. Most RDMBS applications are already hybrid
  • 20. Typical RDMBS Application Memcache App RDBMS
  • 21. Typical Hybrid RDMBS Application MongoDB App RDBMS
  • 22. Most of the same rules apply •Application partitions data between two (or more) systems. •Model layer tracks what content resides where.
  • 23. Hybrid is easier than RDMBS + MemCache • Always know where to find a piece of data. • Data never needs expiring. • Data not duplicated (for the most part) across systems. • Always handle a record same way. • Developer freedom to choose the right tool for the right reasons.
  • 24. Typical RDBMS retrieval operation exists & up to date? if yes... then done Memcache if no, query DB for it Retrieve record(s) RDBMS App Replace in cache Memcache Repeat
  • 25. Typical Hybrid Retrieval Operation find return MongoDB App OR query return RDBMS
  • 26. Typical RDMBS write operation insert or update row confirm written RDBMS assemble into object(s) App write object Memcache
  • 27. Typical RDMBS write operation insert or update row confirm written RDBMS assemble into object(s) App write object write object write object Memcache write object This goes on for a while doesn’t it?
  • 28. Typical RDMBS write operation insert or update row confirm written RDBMS assemble into object(s) App write object write object write object Memcache write object This goes on for a while doesn’t it? one row can be in many objects so there’s a lot of complication in updating everything
  • 29. Typical Hybrid Write Operation save document return MongoDB App OR insert or update return RDBMS
  • 30. Typical Hybrid Write Operation save document return MongoDB App OR insert or update return RDBMS
  • 32. Archiving Why Hybrid: • Existing application built on MySQL • Lots of friction with RDBMS based archive storage • Needed more scalable archive storage backend Solution: • Keep MySQL for active data (100mil), MongoDB for archive (2+ bil) Results: • No more alter table statements taking over 2 months to run • Sharding fixed vertical scale problem • Very happily looking at other places to use MongoDB
  • 33. Reporting Why Hybrid: • Most of the functionality written in MongoDB • Reporting team doesn’t want to learn MongoDB Solution: • Use MongoDB for active database, replicate to MySQL for reporting Results: • Developers happy • Business Analysts happy
  • 34. E-commerce Why Hybrid: • Multi-vertical product catalogue impossible to model in RDBMS • Needed transaction support RDBMS provides Solution: • MySQL for orders, MongoDB for everything else Results: • Massive simplification of code base • Rapidly build, halving time to market (and cost) • Eliminated need for external caching system • 50x+ improvement over MySQL alone
  • 35. How implemented a hybrid MongoDB / MySQL solution http://opensky.com
  • 36. Doctrine (ORM/ODM) makes it easy
  • 37. Data to store in SQL •Order •Order/Shipment •Order/Transaction •Inventory
  • 38. Data to store in MongoDB
  • 39. Data to store in MongoDB • User • Event • Product • TaxRate • Product/Sellable • ... and then I got tired of typing them in • Address • Just imagine this list • Cart has 40 more classes • CreditCard • ...
  • 40. The most boring SQL schema ever
  • 41. CREATE TABLE `product_inventory` ( `product_id` char(32) NOT NULL, `inventory` int(11) NOT NULL DEFAULT '0', PRIMARY KEY (`product_id`) ); CREATE TABLE `sellable_inventory` ( `sellable_id` char(32) NOT NULL, `inventory` int(11) NOT NULL DEFAULT '0', PRIMARY KEY (`sellable_id`) ); CREATE TABLE `orders` ( `id` int(11) NOT NULL AUTO_INCREMENT, `userId` char(32) NOT NULL, `shippingName` varchar(255) DEFAULT NULL, `shippingAddress1` varchar(255) DEFAULT NULL, `shippingAddress2` varchar(255) DEFAULT NULL, `shippingCity` varchar(255) DEFAULT NULL, `shippingState` varchar(2) DEFAULT NULL, `shippingZip` varchar(255) DEFAULT NULL, `billingName` varchar(255) DEFAULT NULL, `billingAddress1` varchar(255) DEFAULT NULL, `billingAddress2` varchar(255) DEFAULT NULL, `billingCity` varchar(255) DEFAULT NULL,
  • 42. Did you notice Inventory is in SQL But it’s also property in your Mongo collections?
  • 43. CREATE TABLE `product_inventory` ( `product_id` char(32) NOT NULL, `inventory` int(11) NOT NULL DEFAULT '0', PRIMARY KEY (`product_id`) ); CREATE TABLE `sellable_inventory` ( `sellable_id` char(32) NOT NULL, `inventory` int(11) NOT NULL DEFAULT '0', PRIMARY KEY (`sellable_id`) ); CREATE TABLE `orders` ( `id` int(11) NOT NULL AUTO_INCREMENT, `userId` char(32) NOT NULL, `shippingName` varchar(255) DEFAULT NULL, `shippingAddress1` varchar(255) DEFAULT NULL, `shippingAddress2` varchar(255) DEFAULT NULL, `shippingCity` varchar(255) DEFAULT NULL, `shippingState` varchar(2) DEFAULT NULL, `shippingZip` varchar(255) DEFAULT NULL, `billingName` varchar(255) DEFAULT NULL, `billingAddress1` varchar(255) DEFAULT NULL, `billingAddress2` varchar(255) DEFAULT NULL, `billingCity` varchar(255) DEFAULT NULL,
  • 45. Inventory is transient • Product::$inventory is effectively a transient property •Note how I said “effectively”? ... we cheat and persist our transient property to MongoDB as well •We can do this because we never really trust the value stored in Mongo
  • 46. Accuracy is only important when there’s contention
  • 47. Accuracy is only important when there’s contention •For display, sorting and alerts, we can use the value stashed in MongoDB •It’s faster •It’s accurate enough
  • 48. Accuracy is only important when there’s contention •For display, sorting and alerts, we can use the value stashed in MongoDB •It’s faster •It’s accurate enough •For financial transactions, we want the multi table transactions from our RDBMS.
  • 52. Inventory kept in sync with listeners
      • Every time a new product is created, its inventory is inserted in SQL
      • Every time an order is placed, inventory is verified and decremented
      • Whenever the SQL inventory changes, it is saved to MongoDB as well
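The three rules above can be sketched in plain PHP. This is only an illustration under stated assumptions: the `InventoryStore` class, its method names, and the array standing in for the MongoDB collection are all hypothetical, and the hand-rolled listener list replaces whatever event system the real application used (later slides show Doctrine lifecycle listeners).

```php
<?php
// Hypothetical stand-ins: a private array plays the role of the SQL inventory
// table (source of truth) and $mongoProducts plays the MongoDB collection.
final class InventoryStore
{
    /** @var array<string,int> authoritative counts (stands in for SQL) */
    private $sqlInventory = [];

    /** @var callable[] listeners called as fn(string $productId, int $inventory) */
    private $listeners = [];

    public function addListener(callable $listener): void
    {
        $this->listeners[] = $listener;
    }

    // New product: its inventory is inserted in "SQL".
    public function create(string $productId, int $inventory): void
    {
        $this->sqlInventory[$productId] = $inventory;
        $this->notify($productId);
    }

    // Order placed: inventory is verified, then decremented.
    public function decrement(string $productId, int $quantity): void
    {
        $current = $this->sqlInventory[$productId] ?? 0;
        if ($current < $quantity) {
            throw new RuntimeException("Insufficient inventory for $productId");
        }
        $this->sqlInventory[$productId] = $current - $quantity;
        $this->notify($productId);
    }

    // Every change to the "SQL" value is pushed to all listeners.
    private function notify(string $productId): void
    {
        foreach ($this->listeners as $listener) {
            $listener($productId, $this->sqlInventory[$productId]);
        }
    }
}

// The listener copies each new value onto the "Mongo" document.
$mongoProducts = ['shoes' => ['title' => 'Shoes']];
$store = new InventoryStore();
$store->addListener(function (string $id, int $inventory) use (&$mongoProducts): void {
    $mongoProducts[$id]['inventory'] = $inventory;
});

$store->create('shoes', 3);
$store->decrement('shoes', 1);
echo $mongoProducts['shoes']['inventory'], "\n"; // prints 2
```

The copy in `$mongoProducts` is written only from the listener, never read back into the store, which mirrors the "never really trust the value stored in Mongo" rule.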
  • 54. Be careful what you lock
      1. Acquire inventory row lock and begin transaction
      2. Check current product inventory
      3. Decrement product inventory
      4. Write the Order to SQL
      5. Update affected MongoDB documents
      6. Commit the transaction
      7. Release product inventory lock
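A minimal sketch of that sequence using PDO, assuming MySQL/InnoDB, where a `SELECT ... FOR UPDATE` inside a transaction takes the row lock and the commit releases it (so steps 6 and 7 collapse into one call). The function name `placeOrder`, the `$updateMongo` callback, and the `$lockClause` parameter are hypothetical. Table and column names come from the schema slide; the slide's `orders` definition is truncated, so the single-column insert here is an assumption.

```php
<?php
/**
 * Hypothetical order-placement routine following the seven steps above.
 * $updateMongo is called as fn(string $productId, int $remaining) and stands
 * in for whatever code rewrites the affected MongoDB documents.
 */
function placeOrder(
    PDO $pdo,
    callable $updateMongo,
    string $productId,
    string $userId,
    int $quantity,
    string $lockClause = ' FOR UPDATE'
): int {
    // 1. Begin the transaction; the locking read below acquires the row lock.
    $pdo->beginTransaction();
    try {
        // 2. Check current product inventory.
        $select = $pdo->prepare(
            'SELECT inventory FROM product_inventory WHERE product_id = ?' . $lockClause
        );
        $select->execute([$productId]);
        $current = $select->fetchColumn();
        if ($current === false || (int) $current < $quantity) {
            throw new RuntimeException('Insufficient inventory');
        }
        $remaining = (int) $current - $quantity;

        // 3. Decrement product inventory.
        $update = $pdo->prepare(
            'UPDATE product_inventory SET inventory = ? WHERE product_id = ?'
        );
        $update->execute([$remaining, $productId]);

        // 4. Write the order to SQL.
        $insert = $pdo->prepare('INSERT INTO orders (userId) VALUES (?)');
        $insert->execute([$userId]);
        $orderId = (int) $pdo->lastInsertId();

        // 5. Update affected MongoDB documents while the lock is still held.
        $updateMongo($productId, $remaining);

        // 6 and 7. Commit; the database releases the row lock at this point.
        $pdo->commit();
        return $orderId;
    } catch (Throwable $e) {
        // Any failure undoes the SQL changes and releases the lock.
        $pdo->rollBack();
        throw $e;
    }
}
```

The connection is expected to use `PDO::ERRMODE_EXCEPTION` so that SQL failures reach the `catch` block. Because only one inventory row is locked, unrelated products are not blocked while the order is written; the `$lockClause` parameter exists only so the sketch can be tried against a database without `FOR UPDATE` support.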
  • 55. Making MongoDB and RDBMS relations play nice
  • 57.
        /** @mongodb:Document(collection="products") */
        class Product
        {
            /** @mongodb:Id */
            private $id;

            /** @mongodb:String */
            private $title;

            public function getId()
            {
                return $this->id;
            }

            public function getTitle()
            {
                return $this->title;
            }

            public function setTitle($title)
            {
                $this->title = $title;
            }
        }
  • 59.
        /**
         * @orm:Entity
         * @orm:Table(name="orders")
         * @orm:HasLifecycleCallbacks
         */
        class Order
        {
            /**
             * @orm:Id @orm:Column(type="integer")
             * @orm:GeneratedValue(strategy="AUTO")
             */
            private $id;

            /**
             * @orm:Column(type="string")
             */
            private $productId;

            /**
             * @var Documents\Product
             */
            private $product;

            // ...
        }
  • 60. So how does an RDBMS have a reference to something outside the database?
  • 61. Setting the Product
        class Order
        {
            // ...

            public function setProduct(Product $product)
            {
                $this->productId = $product->getId();
                $this->product = $product;
            }
        }
  • 62.
      • $productId is mapped and persisted
      • $product, which stores the Product instance, is not a persistent entity property
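To make the mapped/unmapped distinction concrete, here is a small plain-PHP sketch. It does not use Doctrine: the `MAPPED_FIELDS` constant and the `toRow()` method are hypothetical stand-ins for the column annotations and for what the ORM would write to the `orders` table, and the id value is made up.

```php
<?php
// Hypothetical stand-in for the Product document.
final class Product
{
    private $id;

    public function __construct(string $id)
    {
        $this->id = $id;
    }

    public function getId(): string
    {
        return $this->id;
    }
}

final class Order
{
    // Hypothetical list of mapped fields; in Doctrine this role is played by
    // the @orm:Column annotations.
    const MAPPED_FIELDS = ['productId'];

    private $productId; // mapped: would be written to the orders table
    private $product;   // unmapped: lives only in memory

    public function setProduct(Product $product): void
    {
        // Keep the scalar id for persistence and the object for convenience.
        $this->productId = $product->getId();
        $this->product = $product;
    }

    // What a persistence layer would write: mapped fields only.
    public function toRow(): array
    {
        $row = [];
        foreach (self::MAPPED_FIELDS as $field) {
            $row[$field] = $this->$field;
        }
        return $row;
    }
}

$order = new Order();
$order->setProduct(new Product('4c6b8b'));
var_export($order->toRow()); // the row contains productId and nothing else
```

Since the object reference never reaches the database, something has to rebuild it when the order is loaded again.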
  • 64. OrderPostLoadListener
        use Doctrine\ODM\MongoDB\DocumentManager;
        use Doctrine\ORM\Event\LifecycleEventArgs;

        class OrderPostLoadListener
        {
            private $dm;

            // The ODM document manager is injected so postLoad() can build references
            public function __construct(DocumentManager $dm)
            {
                $this->dm = $dm;
            }

            public function postLoad(LifecycleEventArgs $eventArgs)
            {
                // get the order entity
                $order = $eventArgs->getEntity();

                // get odm reference to order.product_id
                $productId = $order->getProductId();
                $product = $this->dm->getReference('MyBundle:Document\Product', $productId);

                // set the product on the order
                $em = $eventArgs->getEntityManager();
                $productReflProp = $em->getClassMetadata('MyBundle:Entity\Order')
                    ->reflClass->getProperty('product');
                $productReflProp->setAccessible(true);
                $productReflProp->setValue($order, $product);
            }
        }
  • 65. All Together Now
        // Create a new product and order
        $product = new Product();
        $product->setTitle('Test Product');
        $dm->persist($product);
        $dm->flush();

        $order = new Order();
        $order->setProduct($product);
        $em->persist($order);
        $em->flush();

        // Find the order later
        $order = $em->find('Order', $order->getId());

        // Instance of an uninitialized product proxy
        $product = $order->getProduct();

        // Initializes proxy and queries the MongoDB database
        echo "Product Title: " . $product->getTitle();
        print_r($order);
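The lazy loading in that last step can be illustrated with a hand-written proxy. This is a simplification under stated assumptions: Doctrine generates its own proxy classes, so `LazyProduct`, the `$findInMongo` loader, and the query counter are all hypothetical names used only to show when the lookup happens.

```php
<?php
// Hand-written lazy proxy; every name here is hypothetical.
final class LazyProduct
{
    private $id;
    private $loader;
    private $data = null;

    public function __construct(string $id, callable $loader)
    {
        $this->id = $id;
        $this->loader = $loader;
    }

    // The identifier is already known, so no query is needed.
    public function getId(): string
    {
        return $this->id;
    }

    // Any other property forces the one-time load.
    public function getTitle(): string
    {
        if ($this->data === null) {
            $this->data = ($this->loader)($this->id);
        }
        return $this->data['title'];
    }
}

$queries = 0;
$findInMongo = function (string $id) use (&$queries): array {
    $queries++; // stands in for a MongoDB round trip
    return ['title' => 'Test Product'];
};

$product = new LazyProduct('abc123', $findInMongo);
$product->getId();             // no query yet
$title = $product->getTitle(); // first property access triggers the load
$product->getTitle();          // served from the loaded data, no second query
echo $title, ' after ', $queries, " query\n"; // prints "Test Product after 1 query"
```

Loading an order therefore costs nothing on the MongoDB side until the code actually asks for product data.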
  • 66. Read more about this technique
      • Jon Wage, one of OpenSky’s engineers, first wrote about this technique on his personal blog: http://jwage.com
      • You can read the full article here: http://jwage.com/2010/08/25/blending-the-doctrine-orm-and-mongodb-odm/
  • 67. http://spf13.com http://github.com/spf13 @spf13 Questions? download at mongodb.org PS: We’re hiring!! Contact us at [email protected]

Editor's Notes

  • #51: Given that split, we just happen to have the most boring SQL schema ever.
  • #52: This is pretty much it. It goes on for a few more lines, with a few other properties flattened onto the order table.
  • #54: Back to the schema for a second. Product ID here is a fake foreign key. Inventory is a real integer. That’s all there is to this table.
  • #60: And here’s why we like Doctrine so much.
  • #63: This will look a bit like when I bought those shoes.
  • #72: The interesting parts here are the annotations. If you don’t speak PHP annotation, this stores a document with two properties (ID and title) in the `products` collection of a Mongo database.
  • #75: Did you notice the property named `product`? That’s not just a reference to another document, that’s a reference to an entirely different database paradigm. Check out the setter:
  • #76: This is key: we set both the product id and a reference to the product itself.
  • #77: When this entity is saved, the productId will end up in the SQL database, but the product reference will disappear.
  • #79: This is one of those listeners I was telling you about. At a high level:
      1. Every time an Order is loaded from the database, this listener is called.
      2. The listener gets the Order’s product id, and creates a Doctrine proxy object.
      3. It uses magic (i.e. reflection) to set the product property of the order to this new proxy.
  • #80: Here’s our inter-db relationship in action. Note that the product is lazily loaded from MongoDB. Because $product is a proxy, we don’t actually query Mongo until we try to access a property of $product (in this case the title).