Computing Outside The Box (June 2009)
Ian Foster, Computation Institute, Argonne National Laboratory & University of Chicago
Abstract: The past decade has seen increasingly ambitious and successful methods for outsourcing computing. Approaches such as utility computing, on-demand computing, grid computing, software as a service, and cloud computing all seek to free computer applications from the limiting confines of a single computer. Software that thus runs "outside the box" can be more powerful (think Google, TeraGrid), dynamic (think Animoto, caBIG), and collaborative (think Facebook, myExperiment). It can also be cheaper, due to economies of scale in hardware and software. The combination of new functionality and new economics inspires new applications, reduces barriers to entry for application providers, and in general disrupts the computing ecosystem. I discuss the new applications that outside-the-box computing enables, in both business and science, and the hardware and software architectures that make these new applications possible.
 
“I’ve been doing cloud computing since before it was called grid.”
1890
1953
“Computation may someday be organized as a public utility… The computing utility could become the basis for a new and important industry.” John McCarthy (1961)
 
[Figure: connectivity (on a log scale) rising over time, with science driving the emergence of the Grid.] “When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances” (George Gilder, 2001)
Application Infrastructure
Layered grid architecture (“The Anatomy of the Grid,” 2001), set alongside the Internet protocol architecture (Link, Internet, Transport, Application):
Fabric (“Controlling things locally”): access to, and control of, resources
Connectivity (“Talking to things”): communication (Internet protocols) and security
Resource (“Sharing single resources”): negotiating access, controlling use
Collective (“Managing multiple resources”): ubiquitous infrastructure services
User (“Specialized services”): user- or application-specific distributed services
Application (top of the stack)
Application / Infrastructure → service-oriented infrastructure
 
www.opensciencegrid.org
Application / Infrastructure → service-oriented infrastructure
Application → service-oriented applications / Infrastructure → service-oriented infrastructure
 
As of Oct 19, 2008: 122 participants; 105 services (70 data, 35 analytical).
Microarray clustering using Taverna (Wei Tan):
1. Query and retrieve microarray data from a caArray data service: cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/CaArrayScrub
2. Normalize microarray data using the GenePattern analytical service: node255.broad.mit.edu:6060/wsrf/services/cagrid/PreprocessDatasetMAGEService
3. Hierarchical clustering using the geWorkbench analytical service: cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/HierarchicalClusteringMage
The workflow combines its own inputs/outputs, caGrid services, “shim” services, and others.
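To make the dataflow concrete, here is a structural sketch of the same three-step pipeline in Python. This is not a working caGrid client: call_cagrid_service is a hypothetical placeholder for a real WSRF/SOAP invocation, and the operation names are illustrative, not the services' actual interfaces.

```python
# Structural sketch only: call_cagrid_service is a hypothetical stand-in for a
# real WSRF/SOAP client; operation names are illustrative.

CAARRAY = "cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/CaArrayScrub"
GENEPATTERN = "node255.broad.mit.edu:6060/wsrf/services/cagrid/PreprocessDatasetMAGEService"
GEWORKBENCH = "cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/HierarchicalClusteringMage"

def call_cagrid_service(endpoint, operation, payload):
    """Placeholder: a real client would build a SOAP envelope and POST it to the endpoint."""
    print(f"[stub] {operation} -> {endpoint}")
    return {"endpoint": endpoint, "operation": operation, "input": payload}

def microarray_clustering(query):
    data = call_cagrid_service(CAARRAY, "query", query)                # 1. retrieve microarray data
    normalized = call_cagrid_service(GENEPATTERN, "preprocess", data)  # 2. normalize (GenePattern)
    return call_cagrid_service(GEWORKBENCH, "cluster", normalized)     # 3. hierarchical clustering

if __name__ == "__main__":
    microarray_clustering({"experiment": "example-microarray-set"})
```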
Infrastructure Applications
[Figure, built up over three slides: “Energy” vs. “Progress of adoption,” with $$ annotations.]
[Figure: connectivity (on a log scale) rising over time; science drove the Grid, and the enterprise now drives the Cloud.] “When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances” (George Gilder, 2001)
 
 
US$3
[Charts, credit Werner Vogels: Animoto EC2 image usage from Day 1 to Day 8, on a scale of 0 to 4,000 instances.]
Software, platform, infrastructure:
Software: Salesforce.com, Google, Animoto, …, caBIG, TeraGrid gateways
Platform: Amazon, Google, Microsoft, …
Infrastructure: Amazon, GoGrid, Sun, Microsoft, …
 
Dynamo: Amazon’s highly available key-value store (DeCandia et al., SOSP ’07)
Simple query model; weak consistency, no isolation
Stringent SLAs (e.g., 300 ms for 99.9% of requests at a peak of 500 requests/sec)
Incremental scalability, symmetry, decentralization, heterogeneity
Technologies used in Dynamo:
Problem | Technique | Advantage
Partitioning | Consistent hashing | Incremental scalability
High availability for writes | Vector clocks with reconciliation during reads | Version size is decoupled from update rates
Handling temporary failures | Sloppy quorum and hinted handoff | High availability and durability even when some replicas are unavailable
Recovering from permanent failures | Anti-entropy using Merkle trees | Synchronizes divergent replicas in the background
Membership and failure detection | Gossip-based membership protocol and failure detection | Preserves symmetry; avoids a centralized registry of membership and node-liveness information
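To ground the first row of the table, here is a minimal consistent-hashing sketch in Python. It is not Dynamo's implementation (the hash function, virtual-node count, and replica count are simplifying assumptions), but it shows the property the table names: keys and nodes share one hash ring, each key is stored on the first N distinct nodes clockwise from its position, and adding or removing a node remaps only neighboring keys.

```python
import bisect
import hashlib

def ring_hash(value: str) -> int:
    """Map a string onto the hash ring (128-bit MD5 here; a simplifying choice)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class ConsistentHashRing:
    def __init__(self, nodes, vnodes=64, replicas=3):
        self.replicas = replicas              # N distinct nodes per key
        self.ring = []                        # sorted list of (position, node)
        for node in nodes:
            for i in range(vnodes):           # virtual nodes smooth load across heterogeneous nodes
                self.ring.append((ring_hash(f"{node}#{i}"), node))
        self.ring.sort()

    def preference_list(self, key: str):
        """Return the first `replicas` distinct nodes clockwise from the key's position."""
        start = bisect.bisect(self.ring, (ring_hash(key), ""))
        chosen, i = [], 0
        while len(chosen) < self.replicas and i < len(self.ring):
            node = self.ring[(start + i) % len(self.ring)][1]
            if node not in chosen:
                chosen.append(node)
            i += 1
        return chosen

ring = ConsistentHashRing(["node-a", "node-b", "node-c", "node-d"])
print(ring.preference_list("customer:12345"))  # three distinct nodes responsible for this key
```

In Dynamo, a preference list like this is what the sloppy-quorum reads and writes operate against; the sketch stops at placement.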
Application → service-oriented applications / Infrastructure → service-oriented infrastructure
The Globus-based LIGO (gravitational-wave observatory) data grid: replicating >1 terabyte/day to 8 sites, including Birmingham, Cardiff, and AEI/Golm; >100 million replicas so far; MTBF = 1 month.
Pull “missing” files to a storage system with the Data Replication Service: given a list of required files, the service determines data location via the Replica Location Index and Local Replica Catalogs, performs data movement with the Reliable File Transfer Service and GridFTP, and completes data replication by registering the new copies in the Local Replica Catalog. (“Design and Implementation of a Data Replication Service Based on the Lightweight Data Replicator System,” Chervenak et al., 2005.)
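The control flow is simple enough to sketch. The following Python is a toy version of the pull-missing-files pattern, with stand-in callables where the real service uses the Replica Location Index, Local Replica Catalogs, RFT, and GridFTP; none of the names below are actual Globus APIs.

```python
def replicate_missing(required_files, local_catalog, locate_replica, transfer, register):
    """Pull files from `required_files` that are not yet in `local_catalog`.

    local_catalog  -- set of logical file names already stored locally
    locate_replica -- fn: logical name -> source URL (stand-in for the Replica Location Index)
    transfer       -- fn: (source URL, logical name) -> None (stand-in for RFT/GridFTP)
    register       -- fn: logical name -> None (stand-in for updating the Local Replica Catalog)
    """
    missing = [f for f in required_files if f not in local_catalog]
    for name in missing:
        source = locate_replica(name)   # data location
        transfer(source, name)          # data movement
        register(name)                  # data replication: record the new local copy
    return missing

# Toy usage with in-memory stand-ins (file and host names are made up):
catalog = {"frame-001.gwf"}
pulled = replicate_missing(
    ["frame-001.gwf", "frame-002.gwf"],
    catalog,
    locate_replica=lambda n: f"gsiftp://remote.example.org/ligo/{n}",
    transfer=lambda src, n: print(f"transferring {src}"),
    register=catalog.add,
)
print(pulled)  # ['frame-002.gwf']
```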
Specializing further: the user asks a service provider to “provide access to data D at S1, S2, S3 with performance P”; the service provider in turn asks resource providers to “provide storage with performance P1, network with P2, …,” composing building blocks such as a replica catalog and user-level multicast.
Using IaaS in biomedical informatics: [diagram: “my servers” (Chicago), handle.net, and BIRN, and the same arrangement with an IaaS provider in place of my servers.]
Clouds and supercomputers: Conventional wisdom?
                 | Loosely coupled applications | Tightly coupled applications
Clouds/clusters  | ✔                            | Too slow
Supercomputers   | Too expensive                | ✔
Ed Walker, “Benchmarking Amazon EC2 for High-Performance Scientific Computing,” ;login:, October 2008.
D. Nurmi, J. Brevik, and R. Wolski, “QBETS: Queue Bounds Estimation from Time Series,” SIGMETRICS 2007, pp. 379-380.
Clouds and supercomputers: Conventional wisdom?
                 | Loosely coupled applications | Tightly coupled applications
Clouds/clusters  | ✔                            | Good for rapid response
Supercomputers   | Too expensive                | ✔
Loosely coupled problems:
- Ensemble runs to quantify climate model uncertainty
- Identify potential drug targets by screening a database of ligand structures against target proteins
- Study economic model sensitivity to parameters
- Analyze a turbulence dataset from many perspectives
- Perform numerical optimization to determine optimal resource assignment in energy problems
- Mine collections of data from advanced light sources
- Construct databases of computed properties of chemical compounds
- Analyze data from the Large Hadron Collider
- Analyze log data from 100,000-node parallel computations
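What these problems share is that their tasks are independent, so the programming model is a task farm rather than a tightly coupled parallel code. A minimal local sketch in Python (the task function and inputs are placeholders; campaigns at the scales above use dispatchers such as Falkon or Swift, not a single machine):

```python
from concurrent.futures import ProcessPoolExecutor

def score_one(task):
    """Placeholder for one independent task, e.g. docking one ligand against a target."""
    ligand, target = task
    fake_score = sum(map(ord, ligand + target)) % 1000   # stand-in for real work
    return (ligand, target, fake_score)

def run_campaign(tasks, workers=8):
    # Independent tasks: embarrassingly parallel, no communication between workers.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(score_one, tasks, chunksize=64))

if __name__ == "__main__":
    tasks = [(f"ligand-{i}", "target-1") for i in range(1000)]
    results = run_campaign(tasks)
    best = sorted(results, key=lambda r: r[2])[:10]        # keep the best-scoring few
    print(best)
```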
“Many many tasks”: identifying potential drug targets by screening 2M+ ligands against protein target(s) (Mike Kubal, Benoit Roux, and others).
Screening workflow for one protein target (~4 million tasks, ~500,000 CPU-hours, ~50 CPU-years):
Inputs: a PDB protein description (1 protein, ~1 MB), with manually prepared DOCK6 and FRED receptor files (one per protein, defining the pocket to bind to), and ZINC 3-D structures for ~2M ligands (~6 GB).
DOCK6 and FRED docking: ~4M tasks x 60 s x 1 CPU, ~60K CPU-hrs; each selects the best ~5K ligand-receptor complexes.
Amber scoring of those ~10K complexes (1. AmberizeLigand, 2. AmberizeReceptor, 3. AmberizeComplex, 4. generate the NAB script from a template via BuildNABScript/perl, with parameters defining the flexible residues and number of MD steps, 5. RunNABScript): ~10K tasks x 20 min x 1 CPU, ~3K CPU-hrs; select the best ~500.
GCMC on the best ~500: ~500 tasks x 10 hr x 100 CPUs, ~500K CPU-hrs; then report.
 
DOCK on BG/P: ~1M tasks on 118,000 CPU cores (Ioan Raicu, Zhao Zhang, Mike Wilde)
CPU cores: 118,784; tasks: 934,803; elapsed time: 7,257 s; compute time: 21.43 CPU-years; average task time: 667 s
Relative efficiency: 99.7% (scaling from 16 to 32 racks); utilization: 99.6% sustained, 78.3% overall
Per-task I/O against GPFS: 1 script (~5 KB), 2 file reads (~10 KB), 1 file write (~10 KB); cached in RAM from GPFS on the first task per node: 1 binary (~7 MB) and static input data (~45 MB)
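As a quick consistency check using only the figures above: 21.43 CPU-years is about 6.8 x 10^8 CPU-seconds of compute, while 118,784 cores x 7,257 s is about 8.6 x 10^8 available CPU-seconds, so overall utilization comes out to roughly 78%, in line with the reported 78.3%.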
Managing 160,000 cores with Falkon: slower shared storage versus high-speed local “disk.”
Scaling POSIX to petascale: a large dataset on the global file system is staged, over the torus and tree interconnects, to a compute-node-striped intermediate file system (Chirp for multicast, MosaStore for striping) and then to local file systems holding each compute node's local datasets.
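The caching idea behind these tiers (and behind the data-diffusion results that follow) can be sketched as a toy cache-aware dispatcher: send each task to a node that already holds its input if one exists, otherwise pick any node, charge it the one fetch from slow shared storage, and remember where the data now lives. This is a simplification, not Falkon's actual scheduling policy.

```python
import random

class CacheAwareDispatcher:
    """Toy cache-aware scheduler: route tasks toward nodes that already cache their input."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.cache = {}                 # input file -> set of nodes caching it

    def dispatch(self, input_file):
        holders = self.cache.get(input_file)
        if holders:
            return random.choice(sorted(holders))           # cache hit: reuse a local copy
        node = random.choice(self.nodes)                    # cache miss: fetch from shared storage
        self.cache.setdefault(input_file, set()).add(node)  # data "diffuses" onto node-local disk
        return node

dispatcher = CacheAwareDispatcher([f"node-{i}" for i in range(4)])
for task_input in ["a.dat", "b.dat", "a.dat", "a.dat"]:
    print(task_input, "->", dispatcher.dispatch(task_input))
```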
Efficiency for 4-second tasks and varying data sizes (1 KB to 1 MB), for CIO and GPFS, at up to 32K processors.
“Sine” workload: 2M tasks, 10 MB:10 ms ratio, 100 nodes, GCC policy, 50 GB cache per node (Ioan Raicu).
Same scenario, but with dynamic resource provisioning
Data diffusion, sine-wave workload: summary
GPFS: 5.70 hrs, ~8 Gb/s, 1,138 CPU-hrs
DD+SRP (data diffusion, static resource provisioning): 1.80 hrs, ~25 Gb/s, 361 CPU-hrs
DD+DRP (data diffusion, dynamic resource provisioning): 1.86 hrs, ~24 Gb/s, 253 CPU-hrs
Clouds and supercomputers: Conventional wisdom?
                 | Loosely coupled applications | Tightly coupled applications
Clouds/clusters  | ✔                            | Good for rapid response
Supercomputers   | Excellent                    | ✔
“The computer revolution hasn’t happened yet.” Alan Kay, 1997
[Figure: connectivity (on a log scale) rising over time; science drove the Grid, the enterprise drives the Cloud, and the consumer drives whatever comes next (????).] “When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances” (George Gilder, 2001)
The “Energy Internet”: the shape of grids to come?
Thank you! Computation Institute www.ci.uchicago.edu
