Building Scalable Applications
    for the Modern Cloud

      Presentation @ VCCF 2012 Tech Labs

                 Fotis Stamatelopoulos
     (http://www.linkedin.com/in/fstamatelopoulos)

                   Christos Stathis
         (http://www.linkedin.com/in/chstath)
About this presentation
● This is not a live demo presentation

● It does not present a cloud product

● It focuses instead on
  ○ designing software specifically for the modern cloud
  ○ building scalable applications

● It discusses actual case study implementations
Intro: Typical Cloud models




       image source: http://blog.nexright.com/?p=1
Why/when deploy to the Cloud?
● Financial reasons: pay only for what you use
● Scalability: fast & easy addition of resources
● Elasticity: adapt to traffic
● Ideal for SaaS offerings

Rule of thumb:
If your traffic/load is constant,
build your own data center instead
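One rough way to check this rule of thumb is to compare pay-per-use cost against fixed capacity sized for the peak hour. The sketch below is illustrative only: the load profiles, the prices (in cents per server-hour) and the function names are all hypothetical and do not come from the presentation. It also assumes that owned hardware is cheaper per server-hour than cloud instances.

```python
def pay_per_use_cost(hourly_servers, cents_per_server_hour):
    """Cost when paying only for the servers actually running each hour."""
    return sum(hourly_servers) * cents_per_server_hour


def fixed_capacity_cost(hourly_servers, cents_per_server_hour):
    """Cost when capacity for the peak hour is provisioned all the time."""
    return max(hourly_servers) * len(hourly_servers) * cents_per_server_hour


# Hypothetical 24-hour load profiles (servers needed in each hour).
constant_load = [10] * 24
spiky_load = [2] * 20 + [10] * 4

# Hypothetical prices in cents per server-hour.
CLOUD_PRICE = 10
OWNED_PRICE = 6

for name, load in (("constant", constant_load), ("spiky", spiky_load)):
    cloud = pay_per_use_cost(load, CLOUD_PRICE)
    owned = fixed_capacity_cost(load, OWNED_PRICE)
    print(f"{name}: cloud={cloud} cents, owned={owned} cents")
```

Under these made-up numbers the constant profile is cheaper on owned capacity, while the spiky profile is cheaper on pay-per-use, because fixed capacity has to be paid for even when it sits idle.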
Designing SW for the Cloud
● The IaaS environment
  ○   monolithic architectures are not an option
  ○   design to scale up / down
  ○   monitor everything! have alerts
  ○   handle latency and replication delays


● The PaaS environment
  ○   the cloud handles scaling
  ○   coding / lib limitations
  ○   modular design
  ○   service oriented approach
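Handling latency and replication delays usually means treating a slow or not-yet-consistent response as a normal case. One common technique (not necessarily the one the authors used) is to retry with exponential backoff and jitter. In the sketch below, `call_with_backoff` and its parameters are hypothetical names chosen for illustration.

```python
import random
import time


def call_with_backoff(operation, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Retry `operation` on failure, waiting longer after each failed attempt."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff plus random jitter, so that many clients
            # do not retry at exactly the same moment.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))
```

The `sleep` parameter is injectable only so the function can be exercised without real waiting; in production code the default `time.sleep` would be used, and the `except` clause would normally be narrowed to the transient errors of the service being called.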
Designing SW for the Cloud
Examples:
  ● break down your app into multiple server components
     hosted in multiple VMs

  ● use scalable back-ends: your own DB/NoSQL cluster
     or services like GAE datastore, AWS SimpleDB

  ● handle PaaS limitations like:
      ○   maximum request processing time
      ○   limitations in Java / Python lib support (GAE)
      ○   no filesystem I/O
      ○   limitations of highly available, distributed services like AWS SQS
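The slide does not say which SQS limitations are meant. One well-known property of standard SQS queues is at-least-once delivery, so a message can occasionally arrive more than once. Assuming that is the limitation in question, a consumer can be made idempotent by remembering the message IDs it has already handled. The class below is a minimal in-memory sketch with hypothetical names; it does not call the AWS API.

```python
class IdempotentConsumer:
    """Handles each message ID at most once, even if it is delivered repeatedly."""

    def __init__(self, handler):
        self.handler = handler
        # In-memory only for this sketch; several consumer VMs would need
        # a shared store for the set of processed IDs.
        self.seen_ids = set()

    def receive(self, message_id, body):
        """Return True if the message was processed, False if it was a duplicate."""
        if message_id in self.seen_ids:
            return False  # duplicate delivery: skip it
        self.handler(body)
        # Mark as seen only after the handler succeeded, so a failed
        # message can be redelivered and retried.
        self.seen_ids.add(message_id)
        return True
```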
Actual Example Case Studies
Case Study 1:
GSS / MyNetworkFolders
● SaaS offering implemented by EBS.gr
● A distributed, scalable file storage platform:
  ○ supports access via multiple interfaces (web
      browser, mobile devices, desktop, WebDAV)
  ○   provides a RESTful API
  ○   is based on our FOSS project gss-project.org (used by
      GRNET Pithos and the Univ of Zagreb)

● It was designed from scratch as a scalable
  cloud application
Design Decisions
● Clustering without session replication
● Stateless, RESTful design
   ○ easier to scale up
   ○ allows building of multiple clients

● No HTTP session / no sticky session
● Multiple levels of caching (client, front-end,
  second-level, etc)
● Use polyglot storage
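A stateless, RESTful design combines well with client-level caching, because any node can answer a conditional request without knowing anything about the client. The sketch below shows the standard HTTP ETag / If-None-Match mechanism in plain Python. The function names are hypothetical and this is not taken from the GSS code base.

```python
import hashlib


def make_etag(content: bytes) -> str:
    """Derive a validator from the content itself, so every node agrees on it."""
    return '"' + hashlib.sha256(content).hexdigest() + '"'


def respond(content: bytes, if_none_match=None):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(content)
    if if_none_match == etag:
        # The client's cached copy is still valid: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, content
```

Because the ETag is computed from the resource content rather than stored in a session, the second request can be served by a different node than the first.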
GSS architecture
Deployment to Amazon AWS

● Components hosted in AWS EC2 instances (VMs)
● AWS S3 handles files (replicated and highly available storage - PaaS)
● EBS volumes, backed up to S3, are used for persistent VM storage
● Will move to a distributed NoSQL store (MongoDB, PaaS AWS service)
Findings
● Administration effort is minimal
● S3 rules! as a reliable file storage back-end
● With an IaaS provider like AWS you can
  achieve real elasticity
● Availability and stability of services &
  resources are excellent, but you still need:
   ○ Monitoring & alerts
   ○ A consistent backup plan
● Fine tuning and good design will save you $$
Case Study 2: EUDOXUS
● Stakeholder: Hellenic Ministry of Education
● Developed, operated and supported by
  GRNET http://grnet.gr/
● A distributed system that supports and
  streamlines processes and operations related
  to textbook distribution for the higher
  education institutions of Greece
● Handles ~ 300K named users per semester
                                      http://eudoxus.gr
Design Goals - Challenges
● Multiple APIs and integration with other ISs
● Tight schedule, fixed milestones - deadlines
  ○ Incrementally release modules in production

● Requirements not finalized
  ○ Be flexible, handle changing requirements

● High availability
  ○ Redundant architecture
  ○ Live application updates – no downtime

● Handle fluctuating load & traffic
Design decisions
● Rich Web-based clients (UIs)
  ○ State is handled by the rich clients

● Support various APIs
  ○ RESTful APIs (JSON / XML payload)
  ○ SOAP-based web services for specific cases

● Back-end storage: RDBMS vs NoSQL
  decision
  ○ NoSQL offers better scalability than a typical RDBMS
  ○ needed at least a minimal transactional core
  ○ dropped the initial hybrid approach in favor of a full
    RDBMS design
Design decisions (cont'd)
● Use caching as much as possible, at multiple
  levels to off-load the back-end storage
● Authentication / Credentials
  ○ Stateless mechanism: eliminated user / HTTP sessions
  ○ Support both
    ■ Shibboleth-based authentication for students
       (identity federation)
    ■ form-based login for other user classes
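The slides do not describe how the stateless authentication mechanism works. One common way to avoid server-side sessions is a signed, self-contained token that any worker can verify on its own. The sketch below uses an HMAC signature from the Python standard library. The token layout, the function names and the shared `SECRET_KEY` are assumptions for illustration, not the EUDOXUS implementation.

```python
import base64
import hashlib
import hmac
import time

# Assumption: the same secret is configured on every worker node.
SECRET_KEY = b"hypothetical-shared-secret"


def issue_token(user_id: str, ttl_seconds: int = 3600, now=None) -> str:
    """Create a token carrying the user ID and an expiry time, plus a signature."""
    issued_at = now if now is not None else time.time()
    payload = f"{user_id}:{int(issued_at + ttl_seconds)}".encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + signature


def verify_token(token: str, now=None):
    """Return the user ID if the token is authentic and unexpired, else None."""
    try:
        encoded, signature = token.split(".")
        payload = base64.urlsafe_b64decode(encoded.encode())
        user_id, expires = payload.decode().rsplit(":", 1)
        expires_at = int(expires)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid leaking signature bytes via timing.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    current = now if now is not None else time.time()
    return user_id if current < expires_at else None
```

Since everything needed to validate a request travels with the request, no session replication or sticky load balancing is required. The trade-off is that a token cannot easily be revoked before it expires.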
High level functional architecture
Technical architecture
The worker (AS) cluster
The data store
● Single RDBMS (PostgreSQL) instance with
  fallback
● “Plan B” alternatives
  ○ Cold replicas for reports
  ○ Sharding / hot replication (one writer / multiple
    readers)
  ○ Moving parts of the data to NoSQL, or even using Solr
    as a NoSQL data store

● Currently handling the load without
  breaking a sweat
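The "one writer / multiple readers" alternative can be pictured as a small routing layer in front of the database connections. The sketch below is a simplified illustration with hypothetical names: it tells reads from writes only by looking for a leading SELECT, and it uses plain strings where real code would hold connection objects.

```python
import itertools


class ReadWriteRouter:
    """Sends writes to the single writer and spreads reads over the replicas."""

    def __init__(self, writer, readers):
        self.writer = writer
        # Fall back to the writer when no read replicas are configured.
        self._readers = itertools.cycle(readers or [writer])

    def route(self, sql: str):
        """Pick the target for a statement: round-robin replica or the writer."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.writer


router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
print(router.route("SELECT * FROM books"))       # replica-1
print(router.route("UPDATE books SET stock=0"))  # primary
```

With asynchronous replication, a read routed to a replica may briefly return stale data, so reads that must see a just-committed write would still need to go to the writer.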
Findings
● If your user base has an upper bound, you
  can make it with an RDBMS

● Stateless is the way to go

● Use caching at multiple levels and save on
  resources
Questions

