XSEDE for Computational Research
Jeff Wereszczynski
What is XSEDE?
• NSF-funded project (current award: 2016-2021, $115 million)
• Provides access to multiple supercomputers with distinct hardware configurations:
• TACC: Stampede, Wrangler, Ranch
• SDSC: Comet, Data Oasis
• PSC: Bridges
• Stanford: XStream
• IU/TACC: Jetstream
Stampede 2 (18 Petaflops)
Stampede is currently undergoing an upgrade and will include:
§ 4200 "Knights Landing" nodes (68 cores each, 4 threads/core, 96 GB RAM)
§ 1736 Intel Xeon nodes
§ Omni-Path interconnect (100 Gbps)
§ Job limits (see the sketch below):
§ 48 hours
§ 80 nodes (5440 cores)
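As a sanity check before queueing work, the limits above are easy to encode. The following is a minimal illustrative sketch using only the figures quoted on this slide; actual queue names and SU charging policies are resource-specific and not modeled here:

```python
# Minimal sketch: check a job request against the Stampede 2 KNL
# limits quoted above (48-hour wallclock, 80 nodes, 68 cores/node).
CORES_PER_NODE = 68  # "Knights Landing" cores per node (from this slide)
MAX_NODES = 80
MAX_HOURS = 48

def fits_stampede2_limits(nodes: int, hours: float) -> bool:
    """Return True if the requested job is within the quoted limits."""
    if nodes > MAX_NODES:
        print(f"too many nodes: {nodes} > {MAX_NODES}")
        return False
    if hours > MAX_HOURS:
        print(f"wallclock too long: {hours} h > {MAX_HOURS} h")
        return False
    print(f"ok: {nodes} nodes = {nodes * CORES_PER_NODE} cores for {hours} h")
    return True

fits_stampede2_limits(80, 48)  # ok: 80 nodes = 5440 cores for 48 h
```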
Other TACC Resources
§ Wrangler:
§ Visualization and data analytics
§ Fast I/O
§ 132 nodes (10 Ivy Bridge CPUs + 1 NVIDIA K40 GPU each)
§ Ranch:
§ Long-term storage
§ Tape archive (currently 61 PB of space)
Comet at SDSC (2 Petaflops)
§ 1984 nodes with Xeon E5-2680v3 processors (24 cores/node), 128 GB DDR4 DRAM, and 320 GB of local SSD scratch storage
§ 36 GPU nodes with 4 NVIDIA K80 GPUs each (soon joined by 36 nodes with 4 P100 GPUs each)
§ Large-memory nodes with 1.5 TB of DRAM
§ 56 Gbps FDR InfiniBand interconnect
§ Gateway hosting nodes and a virtual machine repository
§ Job limits:
§ 48 hours
§ 72 nodes (1728 cores)
§ Long-term storage provided by Data Oasis
Bridges at PSC
Bridges is made up of four types of nodes:
§ Four Integrity Superdome X servers: scale-up machines that let users load data once into 12 terabytes of shared memory and then conduct analytics on it
§ Jobs may run for up to 14 days
§ 42 HPE ProLiant DL580 servers, each with 3 terabytes of shared memory, providing virtualization and remote visualization
§ 800 HPE Apollo 2000 nodes, each with 128 gigabytes of memory, servicing capacity workloads
§ 48 GPU nodes with either K80 or P100 GPUs
XStream at Stanford
GPU computing system:
§ Each of the 65 nodes has 8 NVIDIA K80 cards (16 Kepler GPUs), interconnected through PCI-Express PLX-based switches; each GPU has 12 GB of GDDR5 memory
§ Compute nodes also feature 2 Intel Ivy Bridge 10-core CPUs, 256 GB of DRAM, and 450 GB of local SSD storage
§ The system provides 1.4 PB of Lustre storage (22 GB/s aggregate bandwidth)
Jetstream at IU/TACC
Jetstream offers virtual machines (VMs) in several sizes, charged in service units (SUs) according to how much of the total system is used; the hourly SU rate equals the VM's vCPU count. The table below outlines the available VM sizes (a small cost sketch follows it):

VM Size   vCPUs   RAM (GB)   Local Storage (GB)   SU Cost per Hour
Tiny        1        2              8                     1
Small       2        4             20                     2
Medium      6       16             60                     6
Large      10       30            120                    10
XLarge     22       60            240                    22
XXLarge    44      120            480                    44
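Since the hourly rate is just the vCPU count, estimating a request is straightforward. A minimal illustrative sketch, using only the rates from the table above:

```python
# Illustrative sketch: SU cost of running one Jetstream VM,
# using the rates from the table above (SU/hour = vCPU count).
SU_PER_HOUR = {
    "Tiny": 1, "Small": 2, "Medium": 6,
    "Large": 10, "XLarge": 22, "XXLarge": 44,
}

def su_cost(vm_size: str, hours: float) -> float:
    """SUs charged for running a VM of the given size for `hours`."""
    return SU_PER_HOUR[vm_size] * hours

print(su_cost("Medium", 7 * 24))  # a Medium VM for one week: 1008 SUs
```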
Extended Collaborative Support Service (ECSS)
Support for:
• Performance analysis
• Petascale optimization
• Efficient use of accelerators
• I/O optimization
• Data analytics
• Visualization
• Use of XSEDE by science gateways
• Workflows
Allocation Types
Trial allocation:
• 1000 SUs on Comet (1-day turnaround)
Startup:
• 1k-50k SUs (varies by resource)
• Required material: abstract + CV
• Allocation within a few days
Education:
• 1k-50k SUs (varies by resource, ~1k-1.5k per student)
• Required material: abstract + CV, plus justification for the requested hardware
• Allocation within a few days
Allocation Types: Research
• No limit (millions of SUs); one-year allocation request
• Required material:
  • Main proposal:
    • 10 pages max (15 for requests >10 million SUs)
    • Must justify the calculations to be performed (length/number of runs, expected scientific results, etc.); see the sketch below for a sample SU estimate
    • If not federally funded, should also justify the science itself
    • Be sure to list other available computational resources
  • Scaling document:
    • Benchmarks for all codes on the systems you plan to run on
  • Also justify your storage requests!
• Quarterly submissions; allocations begin ~3 months after submission
• Request and justify ECSS support if needed
[Figure 3: Scaling of octanucleosome simulations on Stampede]
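For instance, a back-of-the-envelope SU estimate of the kind such a justification might contain. This is a hypothetical sketch: the run counts, node counts, and per-core charging convention below are illustrative assumptions, and each resource defines its own SU accounting.

```python
# Hypothetical SU estimate for a research request justification.
# All run parameters are made-up examples; check the target resource's
# actual charging unit (core-hours vs. node-hours) before relying on this.
CORES_PER_NODE = 24  # e.g. Comet, per the slide above

def total_core_hours(n_runs: int, nodes: int, hours_per_run: float) -> float:
    """Total core-hours for a set of identical runs."""
    return n_runs * nodes * CORES_PER_NODE * hours_per_run

# e.g. 10 independent simulations, 4 nodes each, 48 hours per run:
print(total_core_hours(n_runs=10, nodes=4, hours_per_run=48))  # 46080 core-hours
```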
Questions?
Contact: jwereszc@iit.edu