Data Abstraction for
Large Web Applications
       By Brandon Savage
Who Am I?

•   Software developer at Mozilla working on Socorro

•   Author of the PHP Playbook

•   Former frequent blogger on PHP topics

•   Private pilot in my spare time
Data Abstraction For LARGE
      Web Applications
No magic bullets
Once upon a time...

In a galaxy far far away...
Eventually the web grew up.

     And grew larger.
Most webapps still start as
though they’ll always use a
        database.
We need to change our
      thinking.
Socorro
Socorro Data Sources
•   Postgres

•   REST API (Middleware)

•   Elasticsearch

•   HBase

•   Bugzilla REST API

•   Memcache
A database-centric model just
   doesn’t work anymore.
Solving the problem

•   Separate the use of data from the retrieval of data.

•   Think in terms of actions.

•   Build our applications to be storage agnostic.

•   Use the correct data storage medium.
#1 Separate the use of data
 from the retrieval of data
<?php
class MainPage_Controller {
  /* ... */
  public function do_something() {
    /* ... */
    // Anti-pattern: the controller builds and runs SQL itself.
    $sql = 'SELECT * FROM database';
    $results = $this->execute($sql);
    return $this->executeView('index', array('results' => $results));
  }
}
<?php
class Data_Model {
  /* ... */
  public function get_some_data() {
    // Better: SQL lives in the model, but retrieval and processing are still mixed.
    $sql = 'SELECT * FROM database';
    $results = $this->execute($sql);
    /** process results **/
    return $processedResults;
  }
}
Processing the data is a
    separate layer.
<?php
class Data_Model {
  /* ... */
  public function getSomeData() {
    // The model asks the adapter for data and never sees SQL.
    $data = $this->adapter->queryData();
    /** process data here **/
    return $processedData;
  }
}

class Data_Model_Adapter extends MySQL_Adapter implements Adapter
{
  public function queryData() {
      // Storage-specific retrieval stays inside the adapter.
      $sql = 'SELECT * FROM table';
      /** turn into common format **/
      return $commonFormatData;
  }
}
Swapping out data sources
  becomes very simple.
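A minimal sketch of why the swap is cheap, assuming PHP 7.4 or later. The ReportAdapter interface, both adapter classes, ReportModel and the sample rows are hypothetical names invented for illustration, not Socorro code. The model depends only on the interface, so changing the data source means changing one constructor argument.

```php
<?php
// Hypothetical contract: every adapter returns a list of associative arrays.
interface ReportAdapter
{
    public function queryData(): array;
}

// Stand-in for a SQL-backed adapter; rows are hard-coded for illustration.
class ArrayReportAdapter implements ReportAdapter
{
    public function queryData(): array
    {
        return [
            ['signature' => 'js::Foo', 'count' => 3],
            ['signature' => 'nsBar', 'count' => 5],
        ];
    }
}

// Stand-in for a REST-backed adapter; the JSON payload is passed in directly.
class JsonReportAdapter implements ReportAdapter
{
    private string $payload;

    public function __construct(string $payload)
    {
        $this->payload = $payload;
    }

    public function queryData(): array
    {
        // Decode to associative arrays so both adapters share one format.
        return json_decode($this->payload, true);
    }
}

// The model processes data and never learns where it came from.
class ReportModel
{
    private ReportAdapter $adapter;

    public function __construct(ReportAdapter $adapter)
    {
        $this->adapter = $adapter;
    }

    public function totalCrashes(): int
    {
        return array_sum(array_column($this->adapter->queryData(), 'count'));
    }
}

// Same model class, two different sources.
$fromArray = new ReportModel(new ArrayReportAdapter());
$fromJson = new ReportModel(new JsonReportAdapter('[{"signature":"js::Foo","count":8}]'));
```

Calling `totalCrashes()` on either object runs the same processing code; only the adapter behind it differs.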
A cautionary tale
Move to middleware in Socorro
Make life easier on yourself: do
    it right the first time!
#2 Think in terms of actions.
Actions move beyond SELECT,
INSERT, UPDATE and DELETE.
Domain Modeling:
“What are you modeling?”
What do I want?
      What do I need?
What does this data represent?
Django Models:
   One model per table.
All methods relate to SQL.
       That sucks.
<?php
abstract class User_Model {

  // Methods without a body must be declared abstract.
  abstract public function loadUser();

  abstract public function authenticateUser();

  abstract public function showUserPhones();

}
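One way such an action-oriented model could look, as a sketch only and assuming PHP 7.4 or later. CredentialSource, PhoneSource, the in-memory classes and UserModel are hypothetical; in a real application the two sources might wrap a database and a REST API. The caller asks for an action and never learns which store answered.

```php
<?php
// Hypothetical sources; each could be backed by a different storage medium.
interface CredentialSource
{
    public function passwordHashFor(string $username): ?string;
}

interface PhoneSource
{
    public function phonesFor(string $username): array;
}

// In-memory stand-ins so the sketch runs without any external service.
class InMemoryCredentials implements CredentialSource
{
    private array $hashes;

    public function __construct(array $hashes)
    {
        $this->hashes = $hashes;
    }

    public function passwordHashFor(string $username): ?string
    {
        return $this->hashes[$username] ?? null;
    }
}

class InMemoryPhones implements PhoneSource
{
    private array $phones;

    public function __construct(array $phones)
    {
        $this->phones = $phones;
    }

    public function phonesFor(string $username): array
    {
        return $this->phones[$username] ?? [];
    }
}

// The model exposes actions, not SELECT/INSERT/UPDATE/DELETE.
class UserModel
{
    private CredentialSource $credentials;
    private PhoneSource $phones;

    public function __construct(CredentialSource $credentials, PhoneSource $phones)
    {
        $this->credentials = $credentials;
        $this->phones = $phones;
    }

    public function authenticateUser(string $username, string $password): bool
    {
        $hash = $this->credentials->passwordHashFor($username);
        return $hash !== null && password_verify($password, $hash);
    }

    public function showUserPhones(string $username): array
    {
        return $this->phones->phonesFor($username);
    }
}

$users = new UserModel(
    new InMemoryCredentials(['ada' => password_hash('s3cret', PASSWORD_DEFAULT)]),
    new InMemoryPhones(['ada' => ['555-0100']])
);
```

Here `$users->authenticateUser('ada', 's3cret')` and `$users->showUserPhones('ada')` each hit a different source, but the calling code is identical either way.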
#3 Build our applications to be
       storage agnostic
Use a standard data format
stdClass()
Create custom objects for
 typehinting or additional
         methods
Avoid expecting built-ins like
    PDOStatement and
  MongoCursor outside
       retrieval layer
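The points above can be sketched as follows, assuming PHP 7.4 or later. The CrashReport class, both helper functions, the sample rows and the threshold of 100 are all hypothetical, and an ArrayIterator stands in for a driver-specific cursor such as a PDOStatement or MongoCursor. The retrieval layer drains the cursor into stdClass objects, and a typed object is built only where a type hint or extra method is wanted.

```php
<?php
// Custom value object: useful when callers want a type hint or extra methods.
class CrashReport
{
    public string $signature;
    public int $count;

    public function __construct(string $signature, int $count)
    {
        $this->signature = $signature;
        $this->count = $count;
    }

    // The threshold of 100 is an arbitrary value chosen for illustration.
    public function isFrequent(): bool
    {
        return $this->count >= 100;
    }
}

// Retrieval-layer helper: drains any driver-specific iterable into plain
// stdClass objects so nothing outside this layer sees the cursor type.
function toStandardFormat(iterable $cursor): array
{
    $out = [];
    foreach ($cursor as $row) {
        $out[] = (object) $row;
    }
    return $out;
}

// Optional second step: promote a generic object to a typed one.
function toCrashReport(stdClass $row): CrashReport
{
    return new CrashReport((string) $row->signature, (int) $row->count);
}

// ArrayIterator is a stand-in for a database cursor.
$cursor = new ArrayIterator([
    ['signature' => 'js::Foo', 'count' => '150'],
    ['signature' => 'nsBar', 'count' => '2'],
]);

$rows = toStandardFormat($cursor);
$reports = array_map('toCrashReport', $rows);
```

Code that receives `$rows` or `$reports` can be written and tested without any database driver installed.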
#4 Use the correct
 storage medium.
Example: memcache isn’t for
    long-term storage.
Example: MongoDB is not for
   relational data storage.
Relational data goes in relational
           databases!
Choose the correct NoSQL
 database for your needs.
Consistency, availability, and
   partition tolerance.

         Pick two.
Consider data storage that isn’t
      a database at all.
Alternative data options

•   Elasticsearch

•   Redis

•   S3

•   The File System (Yes! It still exists!)
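As an illustration of the file system serving as a data store, here is a rough sketch assuming PHP 7.4 or later. The KeyValueStore interface, the FileStore class and the demo directory name are hypothetical. Because the store sits behind an interface, it could later be replaced by Redis or S3 without touching the calling code.

```php
<?php
// Hypothetical key/value contract that any storage medium could satisfy.
interface KeyValueStore
{
    public function put(string $key, array $value): void;
    public function get(string $key): ?array;
}

// Stores each value as a JSON file inside one directory.
class FileStore implements KeyValueStore
{
    private string $dir;

    public function __construct(string $dir)
    {
        if (!is_dir($dir) && !mkdir($dir, 0777, true)) {
            throw new RuntimeException("Cannot create $dir");
        }
        $this->dir = $dir;
    }

    private function pathFor(string $key): string
    {
        // Hash the key so arbitrary strings map to safe file names.
        return $this->dir . DIRECTORY_SEPARATOR . sha1($key) . '.json';
    }

    public function put(string $key, array $value): void
    {
        // LOCK_EX takes an exclusive lock on the file while writing.
        file_put_contents($this->pathFor($key), json_encode($value), LOCK_EX);
    }

    public function get(string $key): ?array
    {
        $path = $this->pathFor($key);
        if (!is_file($path)) {
            return null;
        }
        return json_decode(file_get_contents($path), true);
    }
}

// Demo: write one record to a fresh temp directory and read it back.
$store = new FileStore(sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'filestore_demo_' . uniqid());
$store->put('report:42', ['signature' => 'js::Foo', 'count' => 3]);
$loaded = $store->get('report:42');
```

A missing key returns `null` rather than raising an error, which keeps the contract easy to mirror in other backends.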
Fix it now or fix it later.

But you will have to fix it.
Question time

Editor's Notes

  • #6: Years and years ago, when the web was young, state was maintained simply by the creation of a database. Web applications were mostly small, and databases could easily handle the traffic that was being sent their way. Most of us learned how to write web applications against a database. Most of us used the “LAMP stack”: Linux, Apache, MySQL, PHP.
  • #7: As the web grew up, and grew bigger, methods for obtaining, storing and using data changed.
    Developers began using data sources provided by others, first over SOAP, then REST. Other data stores like NoSQL databases, Redis, Elasticsearch and Memcache came along to complicate things.
    It was no longer all about the database. The database was just one piece of the puzzle.
  • #8: Yet if we take a good look at most of the frameworks available, they’re database-centric. For a long time, Doctrine support for other data layers was non-existent. Support for something other than a database in Django is non-existent. We still think in a database-centric way. Our data layers are still database-focused.
  • #9: The bottom line: we need to change our thinking.
    Databases are not it. Even for applications that start against a database (and that’s most if not all of them), we need to think about the other ways that we’ll ingest data.
  • #10: This lesson was painful for those of us working on Socorro. Initially built as a database-centric application, we’ve slowly expanded our technology stack as new needs have arisen. While much of our webapp data comes from Postgres, we’ve begun a process of moving our data layer to a more source-agnostic middleware layer.
  • #12: It’s clear to us that a database-centric model doesn’t work anymore. We can’t think of data in terms of rows and columns. It doesn’t work like that.
    So how do we solve this problem?
  • #14: Large web applications don’t pursue abstraction as an art form. They pursue it as a necessity. Failing to properly abstract a large web application can result in catastrophic failure. It is therefore important to separate the layer that gets data from a data store from the layers that use the data.
    Here’s an example...
  • #15: When programmers are in a hurry they often don’t take the time to abstract their code in a way that makes it easy to come along later and make changes. I’ve seen this example hundreds of times in codebases I’ve worked on; many of you probably have too. The problem is that if the data source ever changes from some SQL-based database to something else, a programmer will have to rewrite the logic here and everywhere else all over again. This makes the cost of transition much higher than it has to be.
  • #17: It would make good sense to therefore abstract the process of
  • #18: We should instead use adapters to query the data and return it in an agreed-upon format. The processing takes place elsewhere.
  • #20: NAP story. Data layer Postgres focused.
  • #21: When retrieval and processing are combined, it is that much harder to separate one from the other in the future.
  • #28: When you think in terms of actions, rather than data sources, you don’t care what happens behind the scenes. Instead, you start caring about the finished product. In Socorro, we have reports that use both HBase and Postgres data. If we cared about the data source, we’d have many more calls than we need.
  • #30: If we use JSON as a standard data format throughout our app, we can construct generic objects easily without worrying about what methods are automatically available to us.
  • #32: Rather than relying upon model-constructed or ORM-built objects, we should create our own when and if the need arises.
  • #33: It’s okay to process the results from a database query into some standard format or create an object using the data. But once the data has been retrieved, it should be pushed into a standard format that can be used in the app without caring about what the data source was.
  • #34: Developers are drawn to things that are new, cool, or otherwise unique and special. But it’s important to use the correct storage medium for development.
  • #38: Socorro uses Elasticsearch (not a NoSQL database) and HBase. We should have used Cassandra, but we have HBase instead.
  • #40: External APIs and the file system are all valid data storage mechanisms. Just because we write database-driven applications doesn’t mean our data storage has to be entirely a database. A REST API to an external resource is a valid data storage mechanism that isn’t database-driven (at least as far as your app is concerned).