Exascale Computing Project - Driving a HUGE Change in a Changing World
2 Exascale Computing Project, www.exascaleproject.org
Leverages the planned DOE facilities acquisitions
◊ 2017 CORAL (collaboration of ORNL, ANL, LLNL)
◊ 2020 APEX (collaboration of LANL/SNL, NERSC)
◊ 2022 CORAL
◊ 2024 APEX
Exascale Computing Project - Lift the entire HPC ecosystem and
enable continued U.S. leadership in HPC
[Figure: Capability versus Time (CY), 2017 to 2027, with 5X and 10X capability markers]
Reaching the Elevated Trajectory will require
solving key exascale challenges
• Extreme Parallelism
– For example, an Exaflop @ 1 GHz requires a billion threads executing
• Memory and Storage
– BW, latency, and capacity are not scaling with flops
• Reliability
– Energy saving techniques and number of components drive MTBF down
• Energy Consumption
– 20MW per Exaflop has been a target since 2009
The exascale advanced architecture will also need to solve emerging
data science and machine learning problems, in addition to the
traditional modeling and simulation applications.
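The parallelism and energy figures in the bullets above follow from simple arithmetic. A minimal sketch, assuming each thread retires one floating-point operation per clock cycle (an assumption the slide implies but does not state); the function names are illustrative and do not come from the ECP material:

```python
EXAFLOP = 1e18  # floating-point operations per second


def threads_needed(target_flops: float, clock_hz: float,
                   flops_per_cycle: float = 1.0) -> float:
    """Concurrent threads needed to sustain target_flops.

    Assumes every thread retires flops_per_cycle operations each cycle.
    """
    return target_flops / (clock_hz * flops_per_cycle)


def gflops_per_watt(target_flops: float, power_watts: float) -> float:
    """Energy efficiency implied by a power budget, in GFLOPS per watt."""
    return target_flops / power_watts / 1e9


if __name__ == "__main__":
    print(f"threads at 1 GHz: {threads_needed(EXAFLOP, 1e9):.0e}")
    print(f"efficiency at 20 MW: {gflops_per_watt(EXAFLOP, 20e6):.0f} GFLOPS/W")
```

Under these assumptions, one exaflop at 1 GHz works out to a billion concurrent threads, and the 20 MW target corresponds to 50 GFLOPS per watt.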
Radical, Novel, Advanced solutions are
not a Requirement but may be needed
We want the vendors to propose what they see as being needed to meet
performance, reliability, programmability, data science convergence, and
power requirements.
• If vendors can meet the requirements without needing new radical
solutions, that is fine and likely preferred.
• If meeting them involves radical new concepts, we are interested in
hearing about these solutions.
• We want to encourage vendors to propose new ideas where they
provide a path for addressing our requirements, but we don’t need
novelty or “advancedness” just so we can claim things are “advanced”.
Goals of the Exascale Computing Project
• Foster application development: Develop scientific, engineering, and large-data applications that exploit the emerging, exascale-era computational trends caused by the end of Dennard scaling and Moore’s law
• Ease of use: Create software that makes exascale systems usable by a wide variety of scientists and engineers across a range of applications
• Rich exascale ecosystem: Enable exascale by 2021 and by 2023 at least two diverse computing platforms with up to 50× more computational capability than today’s 20 PF systems, within a similar size, cost, and power footprint
• US HPC leadership: Help ensure continued U.S. leadership in architecture, software and applications to support scientific discovery, energy assurance, stockpile stewardship, and nonproliferation programs and policies
The ECP Plan
• Use a holistic/co-design approach across four focus areas:
– Application Development
– Software Technology
– Hardware Technology
– Exascale Systems
• Enable an initial exascale system to be delivered in 2021 (power
consumption and reliability requirements may be relaxed)
• Enable capable exascale systems to be delivered in 2022 as part of
the CORAL NNSA and SC facility upgrades
• System acquisitions and costs are outside of the ECP plan, and will
be carried out by DOE-SC and NNSA-ASC facilities
ECP Timeline has Two Phases – and ends 2022
• R&D before facilities select exascale systems
• Targeted development for known exascale architectures
[Figure: timeline by fiscal year, FY 2016 to FY 2026. Rows: Testbeds; Hardware Technology; Software Technology; Application Development; NRE System #1; NRE System #2; Site Prep #1; Exascale System #1; Site Prep #2; Exascale System #2. Annotation: Facilities activities outside ECP.]
What about the 2021 System?
• The site of the 2021 system is TBD and will be decided by DOE around
June 2017.
• If the site is one of the CORAL labs, then the CORAL RFP will state:
“Within the goal of having three capable exascale systems by 2022-
2023, if an early exascale system can be delivered in 2021 and
upgraded to a capable exascale system by 2023, then provide the
upgrade as an option.”
• If the site of the 2021 system is outside the CORAL labs then (in
addition to the CORAL RFP for three 2022 systems) a separate RFP
for a single 2021 system will be released in 2018 by the chosen lab.
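The two cases above amount to a simple branching rule. A minimal sketch, assuming the CORAL labs are ORNL, ANL, and LLNL as listed earlier in the deck; the function name and the returned strings are illustrative only and are not RFP language:

```python
from typing import List

# CORAL is described earlier in the deck as a collaboration of these labs.
CORAL_LABS = {"ORNL", "ANL", "LLNL"}


def rfp_plan(site_2021: str) -> List[str]:
    """Return the RFPs implied by the choice of site for the 2021 system."""
    if site_2021 in CORAL_LABS:
        # The early system is handled inside the CORAL RFP as an option.
        return [
            "CORAL RFP (early 2021 system, upgrade to capable exascale "
            "by 2023 offered as an option)"
        ]
    # Otherwise the chosen lab releases its own RFP in 2018.
    return [
        "CORAL RFP (three 2022 systems)",
        f"Separate RFP for a single 2021 system, released in 2018 by {site_2021}",
    ]


if __name__ == "__main__":
    for site in ("ANL", "LANL"):
        print(site, "->", rfp_plan(site))
```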
What is a capable exascale computing system?
ECP defines a capable exascale system as a supercomputer that
• Can solve science problems 50x faster (or solve more complex problems,
for example with more physics or higher fidelity) than today's 20 PF
systems can solve comparable problems
• Uses a software stack that meets the needs of a broad spectrum of
applications and workloads
• Has a power envelope of 20-30 MW
• Is sufficiently resilient that user intervention due to hardware
or system faults is required only on the order of once a week
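The criteria above can be read as a checklist. A minimal sketch, assuming a hypothetical `SystemSpec` record and reading "on the order of a week" as at least 7 days between required user interventions; the threshold constants are interpretations of the slide, not ECP definitions, and the software-stack criterion is qualitative and is not modeled:

```python
from dataclasses import dataclass
from typing import List

REQUIRED_SPEEDUP = 50.0   # "50x faster" than a 20 PF system
MAX_POWER_MW = 30.0       # upper end of the 20-30 MW envelope
MIN_DAYS_BETWEEN_INTERVENTIONS = 7.0  # assumed reading of "order of a week"


@dataclass
class SystemSpec:
    """Hypothetical summary of a candidate system (illustrative fields)."""
    speedup_vs_20pf: float             # science-problem speedup vs a 20 PF system
    power_mw: float                    # system power draw in megawatts
    days_between_interventions: float  # mean days between user interventions


def unmet_criteria(spec: SystemSpec) -> List[str]:
    """List the quantitative criteria that spec fails to meet."""
    unmet = []
    if spec.speedup_vs_20pf < REQUIRED_SPEEDUP:
        unmet.append("performance")
    if spec.power_mw > MAX_POWER_MW:
        unmet.append("power")
    if spec.days_between_interventions < MIN_DAYS_BETWEEN_INTERVENTIONS:
        unmet.append("resilience")
    return unmet


if __name__ == "__main__":
    print(unmet_criteria(SystemSpec(50.0, 25.0, 7.0)))
    print(unmet_criteria(SystemSpec(40.0, 35.0, 2.0)))
```

Note that the performance criterion is stated in terms of science problems solved, so the speedup field here stands for measured application speedup rather than peak flops.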
Diversity is Very Important to DOE
• In 2018 a single CORAL RFP will be released for delivery of three capable
exascale systems by the 2022-2023 timeframe. The RFP will also include NRE
for the systems.
• These systems will have to be designed to solve emerging data science and
machine learning problems in addition to the traditional modeling and
simulation applications.
• The DOE Leadership Computing Facility has a requirement that the ANL and
ORNL systems must have diverse architectures.
• Given the ECP goal of fostering a rich exascale ecosystem, LLNL has the
option to choose a system that is diverse from both the ANL and ORNL
systems.
There are Many Types of System Diversity
• Systems can vary from one another in many different dimensions
– System (architecture, interconnect, IO subsystem, density, resilience, etc.)
– Node (heterogeneous, homogeneous, memory and processor architectures, etc.)
– Software (HPC stack, OS, IO, file system, prog environment, admin tools, etc.)
– Hardware (e.g., memory, processor, network)
• Ways Systems can be diverse
– Few big differences
– Many little differences
– Different technologies
– Different ecosystems, i.e., vendors involved
[Diagram: hardware diversity examples. Memory: type (DDR, NV, PIM), size, DIMM, on die, stacked. Processor: fat, thin, accel, #cores, homo, hetero. Network: topologies, perf, optical, copper. Other labels: technology, scale.]
How Diverse is Enough?
How diverse is enough? There is no hard metric; the Labs will evaluate
diversity by how much it benefits the exascale ecosystem.
Having system diversity provides many advantages.
• It promotes price competition, which increases the value to DOE.
• It promotes a competition of ideas and technologies, which helps
provide more capable systems for DOE’s mission needs.
• It reduces risk that may be caused by delays or failure of a particular
technology or shifts in vendor business focus, staff or financial health.
• It helps promote a rich and healthy high performance computing
ecosystem, which is important for national competitiveness and
DOE’s strategic plan.
The ECP holistic approach
uses co-design and integration to achieve capable exascale
• Application Development: Science and mission applications
• Software Technology: Scalable and productive software stack
• Hardware Technology: Hardware technology elements
• Exascale Systems: Integrated exascale supercomputers
[Diagram labels: Correctness; Visualization; Data Analysis; Applications; Co-Design; Programming models, development environment, and runtimes; Tools; Math libraries and Frameworks; System Software, resource management, threading, scheduling, monitoring, and control; Memory and Burst buffer; Data management; I/O and file system; Node OS, runtimes; Resilience; Workflows; Hardware interface; Co-design centers; Proxy apps; Integration of NNSA and Office of Science SW efforts; PathForward; Design Space Evaluation; Testbeds; NRE]
Application Scope determined by Mission Needs
• Key science and technology challenges to be addressed with exascale
– Materials discovery and design
– Climate science
– Nuclear energy
– Combustion science
– Large-data applications
– Fusion energy
– National security
– Additive manufacturing
– Many others!
• Meet national security needs
– Stockpile Stewardship Annual Assessment and Significant Finding Investigations
– Robust uncertainty quantification (UQ) techniques in support of lifetime extension programs
– Understanding evolving nuclear threats posed by adversaries and in developing policies to mitigate these threats
• Support DOE science and energy missions
– Discover and characterize next-generation materials
– Systematically understand and improve chemical processes
– Analyze the extremely large datasets resulting from the next generation of particle physics experiments
– Extract knowledge from systems-biology studies of the microbiome
– Advance applied energy technologies (e.g., whole-device models of plasma-based fusion systems)
ECP Application Development – (1/3)
• Climate (BER): Accurate regional impact assessment of climate change*
• Combustion (BES): Design high-efficiency, low-emission combustion engines and gas turbines*
• Chemical Science (BES, BER): Biofuel catalysts design; stress-resistant crops
• Fundamental Laws (NP): QCD-based elucidation of fundamental laws of nature: Standard Model validation and beyond SM discoveries
• Materials Science (BES): Find, predict, and control materials and properties:
Applications chosen based on National impact and DOE Offices priorities
ECP Application Development – (2/3)
• Genomics (BES): Protein structure and dynamics; 3D molecular structure design of engineering functional properties*
• Precision Medicine for Cancer (NIH): Accelerate and translate cancer research in RAS pathways, drug responses, and treatment strategies*
• Seismic (EERE, NE, NNSA): Reliable earthquake hazard and risk assessment in relevant frequency ranges*
treaty verification
 
[Embedded excerpt, truncated in the source:]
assembled within the limitations of shared memory hardware, in addition to making feasible the assembly of several thousand metagenomic samples of DOE relevance available at NCBI [40].
Figure 1: NCBI Short Read Archive (SRA) and HipMer capability growth over time, based on rough order-of-magnitude estimates for 1% annual compute allocation (terabases, log scale).
Figure 2. Current (green area) and projected (pink area) scale of metagenomics data and exascale-enabled analysis.
Furthermore, the need for efficient and scalable de novo metagenome sequencing and analysis will only become greater as these datasets continue to grow both in volume and number, and will require exascale level computational resources to handle the roughly doubling of metagenomic samples/experiments every
• Metagenomic (BER): Leveraging microbial diversity in metagenomic datasets for new products and life forms*
• Chemical Science (BES): Design catalysts for conversion of cellulosic-based chemicals into fuels, bioproducts
* Some applications also include a significant machine learning component
ECP Application Development – (3/3)
• Astrophysics (NP): Demystify origin of universe and nuclear matter in universe*
• Cosmology (HEP): Cosmological probe of standard model (SM) of particle physics: inflation, dark matter, dark energy*
• Magnetic Fusion Energy (FES): Predict and guide stable ITER operational performance with an integrated whole device model*
• Nuclear Energy (NE): Accelerate design and commercialization of next-generation small modular reactors*
• Wind Energy (EERE): Increase efficiency and reduce cost of turbine wind plants sited in complex terrains*
* Scope includes a significant data science component
ECP Application Development Co-Design Centers
• Center for Online Data Analysis and Reduction at the Exascale (CODAR)
• Block-Structured AMR Co-Design Center (AMReX)
• Center for Efficient Exascale Discretizations (CEED)
• Co-Design Center for Particle Applications (CoPA)
• Graph and Combinatorial Methods for Enabling Exascale Applications
(GraphEx)
ECP Software Technology Summary
• ECP will build a comprehensive and coherent software stack that will enable
application developers to productively write highly parallel applications
that can portably target diverse exascale architectures
• ECP will accomplish this by extending current technologies to exascale
where possible, performing R&D required to conceive of new approaches where
necessary, coordinating with vendor efforts, and developing and deploying
high-quality and robust software products
ECP Hardware Technology Summary
Objective: Fund R&D to design hardware that meets ECP’s targets
for application performance, power efficiency, and resilience
• Issue PathForward and PathForward-II Hardware Architecture R&D contracts
• Participate in evaluation and review of PathForward and LeapForward
deliverables
• Lead Design Space Evaluation through Architectural Analysis, and Abstract
Machine Models of PathForward/PathForward-II designs for ECP’s holistic
co-design
Goals for PathForward (issued last year – 6 vendor awards pending)
• Improve the quality and number of competitive offeror responses to the
Capable Exascale Systems RFP
• Improve the offeror’s confidence in the value and feasibility of aggressive
advanced technology options that would be bid in response to the Capable
Exascale Systems RFP
• Improve DOE confidence in technology performance benefit,
programmability and ability to integrate into a credible system platform
acquisition
Goals of PathForward-II (planned for issue in 2017)
• Support high payoff, innovative hardware technologies and systems
technologies that may have higher risk. It is focused on component, node, and
system architecture designs that will intersect with the 2021 exascale system.
• Also of interest to the PathForward-II RFP team:
– Innovations that may enable dramatic acceleration of certain applications,
for example, delivering a 100x increase in 2021 on some classes of
applications while still being able to solve the full range of DOE applications
– Developments that promote wider diversity in the exascale ecosystem
– Innovations in power consumption, performance, programmability, reliability,
data science, machine learning, or portability
– Reducing total cost of ownership
ECP Exascale Systems Summary
• Funds Non-Recurring Engineering (NRE)
– Brings to the product stage promising hardware and software research and
integrates it into a system
– Includes application readiness R&D efforts
– Must start early enough to impact the system - more than two full years of lead time
are necessary to maximize impact
• Funds Testbeds
– ECP testbeds will be deployed each year throughout the project
– FY17 testbeds will be acquired through options on existing contracts at Argonne and
ORNL
– Testbed architectures will track SC/NNSA system acquisitions and other promising
architectures
This is a very exciting time for computing in the US
• Unique opportunity to do something HUGE for the nation in HPC
• The exascale systems in 2021 and 2022 afford the opportunity for
– More rapid advancement and scaling of mission and science applications
– More rapid advancement and scaling of an exascale software stack
– Rapid investments in vendor technologies and software needed for 2021 and 2022 systems
– More rapid progress in numerical methods and algorithms for advanced architectures
– Strong leveraging of and broader engagement with US computing capability
• When ECP ends, we will have
– Prepared industry and critical applications for a more diverse and sophisticated set of
computing technologies, carrying US supercomputing well into the future
– Demonstrated integrated software stack components at exascale
– Invested in the engineering and development, and participated in acquisition and testing of
capable exascale systems
www.ExascaleProject.org
Thank you!
www.ExascaleProject.org

More Related Content

What's hot (20)

PDF
Trends in Systems and How to Get Efficient Performance
inside-BigData.com
 
PDF
How HPC and large-scale data analytics are transforming experimental science
inside-BigData.com
 
PDF
IBM Data Centric Systems & OpenPOWER
inside-BigData.com
 
PDF
FPGA-accelerated High-Performance Computing – Close to Breakthrough or Pipedr...
Christian Plessl
 
PDF
10 Abundant-Data Computing
RCCSRENKEI
 
PDF
13 Supercomputer-Scale AI with Cerebras Systems
RCCSRENKEI
 
PDF
ExaLearn Overview - ECP Co-Design Center for Machine Learning
inside-BigData.com
 
PPTX
OpenACC Monthly Highlights Summer 2019
OpenACC
 
PPTX
Designing HPC, Deep Learning, and Cloud Middleware for Exascale Systems
inside-BigData.com
 
PDF
The Coming Age of Extreme Heterogeneity in HPC
inside-BigData.com
 
PPTX
Grid'5000: Running a Large Instrument for Parallel and Distributed Computing ...
Frederic Desprez
 
PPTX
OpenACC Monthly Highlights: July 2021
OpenACC
 
PDF
OpenPOWER System Marconi100
Ganesan Narayanasamy
 
PPTX
Introducing the TPCx-HS Benchmark for Big Data
inside-BigData.com
 
PDF
ExtremeEarth Data Science Pipeline for Linked Earth Observation Data
ExtremeEarth
 
PDF
Artificial Intelligence and Big Data Technologies for Copernicus Data: the Ex...
ExtremeEarth
 
PPTX
OpenACC Monthly Highlights: June 2021
OpenACC
 
PDF
Geofizyka Krakow Selects Panasas for Simplicity and Performance
Panasas
 
PPTX
OpenACC Monthly Highlights: January 2021
OpenACC
 
PDF
ARM HPC Ecosystem
inside-BigData.com
 
Trends in Systems and How to Get Efficient Performance
inside-BigData.com
 
How HPC and large-scale data analytics are transforming experimental science
inside-BigData.com
 
IBM Data Centric Systems & OpenPOWER
inside-BigData.com
 
FPGA-accelerated High-Performance Computing – Close to Breakthrough or Pipedr...
Christian Plessl
 
10 Abundant-Data Computing
RCCSRENKEI
 
13 Supercomputer-Scale AI with Cerebras Systems
RCCSRENKEI
 
ExaLearn Overview - ECP Co-Design Center for Machine Learning
inside-BigData.com
 
OpenACC Monthly Highlights Summer 2019
OpenACC
 
Designing HPC, Deep Learning, and Cloud Middleware for Exascale Systems
inside-BigData.com
 
The Coming Age of Extreme Heterogeneity in HPC
inside-BigData.com
 
Grid'5000: Running a Large Instrument for Parallel and Distributed Computing ...
Frederic Desprez
 
OpenACC Monthly Highlights: July 2021
OpenACC
 
OpenPOWER System Marconi100
Ganesan Narayanasamy
 
Introducing the TPCx-HS Benchmark for Big Data
inside-BigData.com
 
ExtremeEarth Data Science Pipeline for Linked Earth Observation Data
ExtremeEarth
 
Artificial Intelligence and Big Data Technologies for Copernicus Data: the Ex...
ExtremeEarth
 
OpenACC Monthly Highlights: June 2021
OpenACC
 
Geofizyka Krakow Selects Panasas for Simplicity and Performance
Panasas
 
OpenACC Monthly Highlights: January 2021
OpenACC
 
ARM HPC Ecosystem
inside-BigData.com
 

Viewers also liked (20)

PDF
RDMA on ARM
inside-BigData.com
 
PDF
Accelerating Hadoop, Spark, and Memcached with HPC Technologies
inside-BigData.com
 
PDF
6 GigaSpaces Principles to Survive Black Friday
Ali Hodroj
 
PPTX
E-Commerce and In-Memory Computing: Crossing the Scalability Chasm
Ali Hodroj
 
PPTX
Hybrid Transactional/Analytics Processing with Spark and IMDGs
Ali Hodroj
 
PPTX
Geo-Analytics with Apache Spark and In-Memory Data Grids
Ali Hodroj
 
PPTX
Spark DC Interactive Meetup: HTAP with Spark and In-Memory Data Grids
Ali Hodroj
 
PDF
Анализа на оддалечена експлоатациjа во Linux кернел
Zero Science Lab
 
PPTX
Application-level Disaster Recovery on OpenStack
Ali Hodroj
 
PPTX
Secrets of building a debuggable runtime: Learn how language implementors sol...
Dev_Events
 
PDF
Evaluation of Container Virtualized MEGADOCK System in Distributed Computing ...
Kento Aoyama
 
PPTX
Linux device drivers
Abhishek Sagar
 
PDF
Ceph Object Store
Daniel Schneller
 
PDF
TMPA-2017: Dl-Check: Dynamic Potential Deadlock Detection Tool for Java Programs
Iosif Itkin
 
PPTX
From Data to Insights to Action: When Transactions and Analytics Converge
Ali Hodroj
 
PDF
Disaster Recovery and Ceph Block Storage: Introducing Multi-Site Mirroring
Jason Dillaman
 
PDF
Intersect360 Top of All Things in HPC Snapshot Analysis
inside-BigData.com
 
PDF
Towards Exascale Computing with Fortran 2015
inside-BigData.com
 
PDF
State of Linux Containers for HPC
inside-BigData.com
 
PDF
Application Profiling at the HPCAC High Performance Center
inside-BigData.com
 
RDMA on ARM
inside-BigData.com
 
Accelerating Hadoop, Spark, and Memcached with HPC Technologies
inside-BigData.com
 
6 GigaSpaces Principles to Survive Black Friday
Ali Hodroj
 
E-Commerce and In-Memory Computing: Crossing the Scalability Chasm
Ali Hodroj
 
Hybrid Transactional/Analytics Processing with Spark and IMDGs
Ali Hodroj
 
Geo-Analytics with Apache Spark and In-Memory Data Grids
Ali Hodroj
 
Spark DC Interactive Meetup: HTAP with Spark and In-Memory Data Grids
Ali Hodroj
 
Анализа на оддалечена експлоатациjа во Linux кернел
Zero Science Lab
 
Application-level Disaster Recovery on OpenStack
Ali Hodroj
 
Secrets of building a debuggable runtime: Learn how language implementors sol...
Dev_Events
 
Evaluation of Container Virtualized MEGADOCK System in Distributed Computing ...
Kento Aoyama
 
Linux device drivers
Abhishek Sagar
 
Ceph Object Store
Daniel Schneller
 
TMPA-2017: Dl-Check: Dynamic Potential Deadlock Detection Tool for Java Programs
Iosif Itkin
 
From Data to Insights to Action: When Transactions and Analytics Converge
Ali Hodroj
 
Disaster Recovery and Ceph Block Storage: Introducing Multi-Site Mirroring
Jason Dillaman
 
Intersect360 Top of All Things in HPC Snapshot Analysis
inside-BigData.com
 
Towards Exascale Computing with Fortran 2015
inside-BigData.com
 
State of Linux Containers for HPC
inside-BigData.com
 
Application Profiling at the HPCAC High Performance Center
inside-BigData.com
 
Ad

Similar to Exascale Computing Project - Driving a HUGE Change in a Changing World (20)

PDF
The U.S. Exascale Computing Project: Status and Plans
inside-BigData.com
 
PDF
ECP Application Development
inside-BigData.com
 
PDF
CSC2013: Exascale in the US
John Towns
 
PDF
Programming Models for Exascale Systems
inside-BigData.com
 
PDF
Future of hpc
Putchong Uthayopas
 
PDF
Designing Software Libraries and Middleware for Exascale Systems: Opportuniti...
inside-BigData.com
 
PDF
Nikravesh australia long_versionkeynote2012
Masoud Nikravesh
 
PPTX
EXASXALE COMPUTING
balakrishnan Bsk
 
PPTX
Sc10 slide share
Guy Tel-Zur
 
PDF
05 Preparing for Extreme Geterogeneity in HPC
RCCSRENKEI
 
PDF
Exascale Update from Hyperion Research
inside-BigData.com
 
PPTX
Technological forecasting of supercomputer development: The march to exascale...
dongjoon
 
PDF
Panda scalable hpc_bestpractices_tue100418
inside-BigData.com
 
PDF
Designing Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systems
inside-BigData.com
 
PDF
Nikravesh big datafeb2013bt
Masoud Nikravesh
 
PDF
Mauricio breteernitiz hpc-exascale-iscte
mbreternitz
 
PPTX
Supporting Research Communities with XSEDE
John Towns
 
PPTX
Introduction to heterogeneous_computing_for_hpc
Supasit Kajkamhaeng
 
PPTX
High-Performance Computing Research in Europe
Govnet Events
 
PPTX
Communication Frameworks for HPC and Big Data
inside-BigData.com
 
The U.S. Exascale Computing Project: Status and Plans
inside-BigData.com
 
ECP Application Development
inside-BigData.com
 
CSC2013: Exascale in the US
John Towns
 
Programming Models for Exascale Systems
inside-BigData.com
 
Future of hpc
Putchong Uthayopas
 
Designing Software Libraries and Middleware for Exascale Systems: Opportuniti...
inside-BigData.com
 
Nikravesh australia long_versionkeynote2012
Masoud Nikravesh
 
EXASXALE COMPUTING
balakrishnan Bsk
 
Sc10 slide share
Guy Tel-Zur
 
05 Preparing for Extreme Geterogeneity in HPC
RCCSRENKEI
 
Exascale Update from Hyperion Research
inside-BigData.com
 
Technological forecasting of supercomputer development: The march to exascale...
dongjoon
 
Panda scalable hpc_bestpractices_tue100418
inside-BigData.com
 
Designing Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systems
inside-BigData.com
 
Nikravesh big datafeb2013bt
Masoud Nikravesh
 
Mauricio breteernitiz hpc-exascale-iscte
mbreternitz
 
Supporting Research Communities with XSEDE
John Towns
 
Introduction to heterogeneous_computing_for_hpc
Supasit Kajkamhaeng
 
High-Performance Computing Research in Europe
Govnet Events
 
Communication Frameworks for HPC and Big Data
inside-BigData.com
 
Ad

More from inside-BigData.com (20)

PDF
Major Market Shifts in IT
inside-BigData.com
 
PDF
Preparing to program Aurora at Exascale - Early experiences and future direct...
inside-BigData.com
 
PPTX
Transforming Private 5G Networks
inside-BigData.com
 
PDF
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
inside-BigData.com
 
PDF
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
inside-BigData.com
 
PDF
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
inside-BigData.com
 
PDF
HPC Impact: EDA Telemetry Neural Networks
inside-BigData.com
 
PDF
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
inside-BigData.com
 
PDF
Machine Learning for Weather Forecasts
inside-BigData.com
 
PPTX
HPC AI Advisory Council Update
inside-BigData.com
 
PDF
Fugaku Supercomputer joins fight against COVID-19
inside-BigData.com
 
PDF
Energy Efficient Computing using Dynamic Tuning
inside-BigData.com
 
PDF
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
inside-BigData.com
 
PDF
State of ARM-based HPC
inside-BigData.com
 
PDF
Versal Premium ACAP for Network and Cloud Acceleration
inside-BigData.com
 
PDF
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
inside-BigData.com
 
PDF
Scaling TCO in a Post Moore's Era
inside-BigData.com
 
PDF
CUDA-Python and RAPIDS for blazing fast scientific computing
inside-BigData.com
 
PDF
Introducing HPC with a Raspberry Pi Cluster
inside-BigData.com
 
PDF
Overview of HPC Interconnects
inside-BigData.com
 
Major Market Shifts in IT
inside-BigData.com
 
Preparing to program Aurora at Exascale - Early experiences and future direct...
inside-BigData.com
 
Transforming Private 5G Networks
inside-BigData.com
 
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
inside-BigData.com
 
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
inside-BigData.com
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
inside-BigData.com
 
HPC Impact: EDA Telemetry Neural Networks
inside-BigData.com
 
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
inside-BigData.com
 
Machine Learning for Weather Forecasts
inside-BigData.com
 
HPC AI Advisory Council Update
inside-BigData.com
 
Fugaku Supercomputer joins fight against COVID-19
inside-BigData.com
 
Energy Efficient Computing using Dynamic Tuning
inside-BigData.com
 
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
inside-BigData.com
 
State of ARM-based HPC
inside-BigData.com
 
Versal Premium ACAP for Network and Cloud Acceleration
inside-BigData.com
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
inside-BigData.com
 
Scaling TCO in a Post Moore's Era
inside-BigData.com
 
CUDA-Python and RAPIDS for blazing fast scientific computing
inside-BigData.com
 
Introducing HPC with a Raspberry Pi Cluster
inside-BigData.com
 
Overview of HPC Interconnects
inside-BigData.com
 

Recently uploaded (20)

PDF
NewMind AI - Journal 100 Insights After The 100th Issue
NewMind AI
 
PDF
CIFDAQ Market Insights for July 7th 2025
CIFDAQ
 
PDF
[Newgen] NewgenONE Marvin Brochure 1.pdf
darshakparmar
 
PDF
Chris Elwell Woburn, MA - Passionate About IT Innovation
Chris Elwell Woburn, MA
 
PDF
POV_ Why Enterprises Need to Find Value in ZERO.pdf
darshakparmar
 
PDF
Achieving Consistent and Reliable AI Code Generation - Medusa AI
medusaaico
 
PDF
Biography of Daniel Podor.pdf
Daniel Podor
 
PDF
Newgen Beyond Frankenstein_Build vs Buy_Digital_version.pdf
darshakparmar
 
PPTX
AI Penetration Testing Essentials: A Cybersecurity Guide for 2025
defencerabbit Team
 
PDF
New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
PDF
Mastering Financial Management in Direct Selling
Epixel MLM Software
 
PDF
Reverse Engineering of Security Products: Developing an Advanced Microsoft De...
nwbxhhcyjv
 
PPTX
OpenID AuthZEN - Analyst Briefing July 2025
David Brossard
 
PPTX
Building Search Using OpenSearch: Limitations and Workarounds
Sease
 
PDF
CIFDAQ Token Spotlight for 9th July 2025
CIFDAQ
 
PDF
Exolore The Essential AI Tools in 2025.pdf
Srinivasan M
 
PPTX
AUTOMATION AND ROBOTICS IN PHARMA INDUSTRY.pptx
sameeraaabegumm
 
PDF
Blockchain Transactions Explained For Everyone
CIFDAQ
 
PDF
From Code to Challenge: Crafting Skill-Based Games That Engage and Reward
aiyshauae
 
PDF
Transcript: New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
NewMind AI - Journal 100 Insights After The 100th Issue
NewMind AI
 
CIFDAQ Market Insights for July 7th 2025
CIFDAQ
 
[Newgen] NewgenONE Marvin Brochure 1.pdf
darshakparmar
 
Chris Elwell Woburn, MA - Passionate About IT Innovation
Chris Elwell Woburn, MA
 
POV_ Why Enterprises Need to Find Value in ZERO.pdf
darshakparmar
 
Achieving Consistent and Reliable AI Code Generation - Medusa AI
medusaaico
 
Biography of Daniel Podor.pdf
Daniel Podor
 
Newgen Beyond Frankenstein_Build vs Buy_Digital_version.pdf
darshakparmar
 
AI Penetration Testing Essentials: A Cybersecurity Guide for 2025
defencerabbit Team
 
New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
Mastering Financial Management in Direct Selling
Epixel MLM Software
 
Reverse Engineering of Security Products: Developing an Advanced Microsoft De...
nwbxhhcyjv
 
OpenID AuthZEN - Analyst Briefing July 2025
David Brossard
 
Building Search Using OpenSearch: Limitations and Workarounds
Sease
 
CIFDAQ Token Spotlight for 9th July 2025
CIFDAQ
 
Exolore The Essential AI Tools in 2025.pdf
Srinivasan M
 
AUTOMATION AND ROBOTICS IN PHARMA INDUSTRY.pptx
sameeraaabegumm
 
Blockchain Transactions Explained For Everyone
CIFDAQ
 
From Code to Challenge: Crafting Skill-Based Games That Engage and Reward
aiyshauae
 
Transcript: New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 

Exascale Computing Project - Driving a HUGE Change in a Changing World

  • 2. 2 Exascale Computing Project, www.exascaleproject.org Leverages the planned DOE facilities acquisitions ◊ 2017 CORAL (collaboration of ORNL, ANL, LLNL) ◊ 2020 APEX (collaboration of LANL/SNL, NERSC) ◊ 2022 CORAL ◊ 2024 APEX Exascale Computing Project - Lift the entire HPC ecosystem and enable continued U.S. leadership in HPC Time (CY) Capability 2017 2021 2022 2023 2024 2025 2026 2027 10X 5X
  • 3. 3 Exascale Computing Project, www.exascaleproject.org Reaching the Elevated Trajectory will require solving key exascale challenges • Extreme Parallelism – For example, an Exaflop @ 1 GHz requires a billion threads executing • Memory and Storage – BW, latency, and capacity are not scaling with flops • Reliability – Energy saving techniques and number of components drive MTBF down • Energy Consumption – 20MW per Exaflop has been a target since 2009 In addition, the exascale advanced architecture will need to solve emerging data science and machine learning problems in addition to the traditional modeling and simulations applications.
  • 4. Radical, Novel, Advanced solutions are not a Requirement but may be needed
      We want the vendors to propose what they see as being needed to meet performance, reliability, programmability, data science convergence, and power requirements.
      • If vendors can meet the requirements without needing new radical solutions that is fine and likely preferred.
      • If it involves radical new concepts, we are interested in hearing about these solutions.
      • We want to encourage vendors to propose new ideas where they provide a path for addressing our requirements but we don’t need novelty or “advancedness” just so we can claim things are “advanced”.
  • 5. Goals of the Exascale Computing Project
      • Foster application development: Develop scientific, engineering, and large-data applications that exploit the emerging, exascale-era computational trends caused by the end of Dennard scaling and Moore’s law
      • Ease of use: Create software that makes exascale systems usable by a wide variety of scientists and engineers across a range of applications
      • Rich exascale ecosystem: Enable exascale by 2021 and by 2023 at least two diverse computing platforms with up to 50× more computational capability than today’s 20 PF systems, within a similar size, cost, and power footprint
      • US HPC leadership: Help ensure continued U.S. leadership in architecture, software and applications to support scientific discovery, energy assurance, stockpile stewardship, and nonproliferation programs and policies
  • 6. The ECP Plan
      • Use a holistic/co-design approach across four focus areas:
        – Application Development
        – Software Technology
        – Hardware Technology
        – Exascale Systems
      • Enable an initial exascale system to be delivered in 2021 (power consumption and reliability requirements may be relaxed)
      • Enable capable exascale systems to be delivered in 2022 as part of the CORAL NNSA and SC facility upgrades
      • System acquisitions and costs are outside of the ECP plan, and will be carried out by DOE-SC and NNSA-ASC facilities
  • 7. ECP Timeline has Two Phases – and ends 2022
      • Phase 1: R&D before facilities select exascale systems
      • Phase 2: Targeted development for known exascale architectures
      [Timeline chart, FY 2016 through 2026. Labels: Hardware Technology, Software Technology, Application Development, Testbeds, NRE System #1, NRE System #2, Site Prep #1, Exascale System #1, Site Prep #2, Exascale System #2, Facilities activities outside ECP]
  • 8. What about the 2021 System?
      • The site of the 2021 system is TBD and will be decided by DOE around June 2017.
      • If the site is one of the CORAL labs, then the CORAL RFP will state: “Within the goal of having three capable exascale systems by 2022-2023, if an early exascale system can be delivered in 2021 and upgraded to a capable exascale system by 2023, then provide the upgrade as an option.”
      • If the site of the 2021 system is outside the CORAL labs then (in addition to the CORAL RFP for three 2022 systems) a separate RFP for a single 2021 system will be released in 2018 by the chosen lab.
  • 9. What is a capable exascale computing system?
      ECP defines a capable exascale system as a supercomputer that:
      • Can solve science problems 50x faster (or more complex, for example, more physics, higher fidelity) than the 20 PF systems of today can solve comparable problems
      • Uses a software stack that meets the needs of a broad spectrum of applications and workloads
      • Has a power envelope of 20-30 MW
      • Is sufficiently resilient that user intervention due to hardware or system faults is required on the order of once a week
  • 10. Diversity is Very Important to DOE
      • In 2018 a single CORAL RFP will be released for delivery of three capable exascale systems by the 2022-2023 timeframe. The RFP will also include NRE for the systems.
      • These systems will have to be designed to solve emerging data science and machine learning problems in addition to the traditional modeling and simulation applications.
      • The DOE Leadership Computing Facility has a requirement that the ANL and ORNL systems must have diverse architectures.
      • Given the ECP goal of fostering a rich exascale ecosystem, LLNL has the option to choose a system that is diverse from both the ANL and ORNL systems.
  • 11. There are Many Types of System Diversity
      • Systems can vary from one another in many different dimensions
        – System (architecture, interconnect, IO subsystem, density, resilience, etc.)
        – Node (heterogeneous, homogeneous, memory and processor architectures, etc.)
        – Software (HPC stack, OS, IO, file system, prog environment, admin tools, etc.)
        – Hardware (see diagram)
      • Ways systems can be diverse
        – Few big differences
        – Many little differences
        – Different technologies
        – Different ecosystems, i.e., vendors involved
      [Diagram of example hardware dimensions (technology, scale, type). Memory: DDR, NV, PIM; size; DIMM, on die, stacked. Processor: fat, thin, accel; #cores; homogeneous, heterogeneous. Network: topologies, perf; optical, copper]
  • 12. How Diverse is Enough?
      There is no hard metric. Labs will evaluate diversity by how much it will benefit the exascale ecosystem.
      Having system diversity provides many advantages.
      • It promotes price competition, which increases the value to DOE.
      • It promotes a competition of ideas and technologies, which helps provide more capable systems for DOE’s mission needs.
      • It reduces risk that may be caused by delays or failure of a particular technology or shifts in vendor business focus, staff or financial health.
      • It helps promote a rich and healthy high performance computing ecosystem, which is important for national competitiveness and DOE’s strategic plan.
  • 13. The ECP holistic approach uses co-design and integration to achieve capable exascale
      • Application Development: Science and mission applications
      • Software Technology: Scalable and productive software stack
      • Hardware Technology: Hardware technology elements
      • Exascale Systems: Integrated exascale supercomputers
      [Software stack diagram. Labels: Applications; Co-Design; Correctness; Visualization; Data Analysis; Programming models, development environment, and runtimes; Tools; Math libraries and Frameworks; System Software, resource management, threading, scheduling, monitoring, and control; Memory and Burst buffer; Data management, I/O and file system; Node OS, runtimes; Resilience; Workflows; Hardware interface]
      [Other labels: Co-design centers; Proxy apps; Integration of NNSA and Office of Science SW efforts; PathForward; Design Space Evaluation; Testbeds; NRE]
  • 14. Application Scope determined by Mission Needs
      Key science and technology challenges to be addressed with exascale:
      • Materials discovery and design
      • Climate science
      • Nuclear energy
      • Combustion science
      • Large-data applications
      • Fusion energy
      • National security
      • Additive manufacturing
      • Many others!
      Meet national security needs:
      • Stockpile Stewardship Annual Assessment and Significant Finding Investigations
      • Robust uncertainty quantification (UQ) techniques in support of lifetime extension programs
      • Understanding evolving nuclear threats posed by adversaries and in developing policies to mitigate these threats
      Support DOE science and energy missions:
      • Discover and characterize next-generation materials
      • Systematically understand and improve chemical processes
      • Analyze the extremely large datasets resulting from the next generation of particle physics experiments
      • Extract knowledge from systems-biology studies of the microbiome
      • Advance applied energy technologies (e.g., whole-device models of plasma-based fusion systems)
  • 15. ECP Application Development – (1/3)
      Applications chosen based on National impact and DOE Offices priorities
      • Climate (BER): Accurate regional impact assessment of climate change*
      • Combustion (BES): Design high-efficiency, low-emission combustion engines and gas turbines*
      • Chemical Science (BES, BER): Biofuel catalysts design; stress-resistant crops
      • Fundamental Laws (NP): QCD-based elucidation of fundamental laws of nature: Standard Model validation and beyond SM discoveries
      • Materials Science (BES): Find, predict, and control materials and properties:
  • 16. ECP Application Development – (2/3)
      • Genomics (BES): Protein structure and dynamics; 3D molecular structure design of engineering functional properties*
      • Precision Medicine for Cancer (NIH): Accelerate and translate cancer research in RAS pathways, drug responses, and treatment strategies*
      • Seismic (EERE, NE, NNSA): Reliable earthquake hazard and risk assessment in relevant frequency ranges*; treaty verification
      • Metagenomic (BER): Leveraging microbial diversity in metagenomic datasets for new products and life forms*
      • Chemical Science (BES): Design catalysts for conversion of cellulosic-based chemicals into fuels, bioproducts
      * Some applications also include a significant machine learning component
      [Embedded document excerpt, truncated at both ends: "assembled within the limitations of shared memory hardware, in addition to making feasible the assembly of several thousand metagenomic samples of DOE relevance available at NCBI [40]. Furthermore, the need for efficient and scalable de novo metagenome sequencing and analysis will only become greater as these datasets continue to grow both in volume and number, and will require exascale level computational resources to handle the roughly doubling of metagenomic samples/experiments every"]
      [Figure 1: NCBI Short Read Archive (SRA) and HipMer capability growth over time, based on rough order-of-magnitude estimates for 1% annual compute allocation (terabases, log scale).]
      [Figure 2: Current (green area) and projected (pink area) scale of metagenomics data and exascale-enabled analysis.]
  • 17. ECP Applications Development – (3/3)
      • Astrophysics (NP): Demystify origin of universe and nuclear matter in universe*
      • Cosmology (HEP): Cosmological probe of standard model (SM) of particle physics: Inflation, dark matter, dark energy*
      • Magnetic Fusion Energy (FES): Predict and guide stable ITER operational performance with an integrated whole device model*
      • Nuclear Energy (NE): Accelerate design and commercialization of next-generation small modular reactors*
      • Wind Energy (EERE): Increase efficiency and reduce cost of turbine wind plants sited in complex terrains*
      * Scope includes a significant data science component
  • 18. ECP Application Development Co-Design Centers
      • Center for Online Data Analysis and Reduction at the Exascale (CODAR)
      • Block-Structured AMR Co-Design Center (AMReX)
      • Center for Efficient Exascale Discretizations (CEED)
      • Co-Design Center for Particle Applications (CoPA)
      • Graph and Combinatorial Methods for Enabling Exascale Applications (GraphEx)
  • 19. ECP Software Technology Summary
      • ECP will build a comprehensive and coherent software stack that will enable application developers to productively write highly parallel applications that can portably target diverse exascale architectures
      • ECP will accomplish this by extending current technologies to exascale where possible, performing R&D required to conceive of new approaches where necessary, coordinating with vendor efforts, and developing and deploying high-quality and robust software products
  • 20. ECP Hardware Technology Summary
      Objective: Fund R&D to design hardware that meets ECP’s targets for application performance, power efficiency, and resilience
      • Issue PathForward and PathForward-II Hardware Architecture R&D contracts
      • Participate in evaluation and review of PathForward and LeapForward deliverables
      • Lead Design Space Evaluation through Architectural Analysis, and Abstract Machine Models of PathForward/PathForward-II designs for ECP’s holistic co-design
  • 21. Goals for PathForward (issued last year – 6 vendor awards pending)
      • Improve the quality and number of competitive offeror responses to the Capable Exascale Systems RFP
      • Improve the offeror’s confidence in the value and feasibility of aggressive advanced technology options that would be bid in response to the Capable Exascale Systems RFP
      • Improve DOE confidence in technology performance benefit, programmability and ability to integrate into a credible system platform acquisition
  • 22. Goals of PathForward-II (planned for issue in 2017)
      • Support high payoff, innovative hardware technologies and systems technologies that may have higher risk. It is focused on component, node, and system architecture designs that will intersect with the 2021 exascale system.
      • Also of interest to the PathForward-II RFP team:
        – Innovations that may enable dramatic acceleration of certain applications, for example, delivering a 100x increase in 2021 on some classes of applications while still being able to solve the full range of DOE applications
        – Developments that promote wider diversity in the exascale ecosystem
        – Innovations in power consumption, performance, programmability, reliability, data science, machine learning, or portability
        – Reducing total cost of ownership
  • 23. ECP Exascale Systems Summary
      • Funds Non-Recurring Engineering (NRE)
        – Brings to the product stage promising hardware and software research and integrates it into a system
        – Includes application readiness R&D efforts
        – Must start early enough to impact the system; more than two full years of lead time are necessary to maximize impact
      • Funds Testbeds
        – ECP testbeds will be deployed each year throughout the project
        – FY17 testbeds will be acquired through options on existing contracts at Argonne and ORNL
        – Testbed architectures will track SC/NNSA system acquisitions and other promising architectures
  • 24. This is a very exciting time for computing in the US
      • Unique opportunity to do something HUGE for the nation in HPC
      • The exascale systems in 2021 and 2022 afford the opportunity for
        – More rapid advancement and scaling of mission and science applications
        – More rapid advancement and scaling of an exascale software stack
        – Rapid investments in vendor technologies and software needed for 2021 and 2022 systems
        – More rapid progress in numerical methods and algorithms for advanced architectures
        – Strong leveraging of and broader engagement with US computing capability
      • When ECP ends, we will have
        – Prepared industry and critical applications for a more diverse and sophisticated set of computing technologies, carrying US supercomputing well into the future
        – Demonstrated integrated software stack components at exascale
        – Invested in the engineering and development, and participated in acquisition and testing of capable exascale systems
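
A few figures quoted in slides 3, 5, and 9 follow from simple arithmetic. The sketch below is not part of the deck. It reproduces them under one stated assumption: each thread retires exactly one floating-point operation per clock cycle. The function names are illustrative only.

```python
# Back-of-envelope checks for figures quoted on slides 3, 5, and 9.
# Assumption (not from the deck): one flop per thread per clock cycle.

EXAFLOP = 10**18   # floating-point operations per second
PETAFLOP = 10**15


def threads_required(target_flops, clock_hz, flops_per_thread_per_cycle=1):
    """Concurrent threads needed to sustain target_flops at clock_hz."""
    return target_flops / (clock_hz * flops_per_thread_per_cycle)


def gflops_per_watt(flops, watts):
    """Energy efficiency implied by a performance and power target."""
    return flops / watts / 1e9


def capability_ratio(target_flops, baseline_flops):
    """How many times more capable the target is than the baseline."""
    return target_flops / baseline_flops


# Slide 3: "an Exaflop @ 1 GHz requires a billion threads executing"
threads = threads_required(EXAFLOP, clock_hz=10**9)

# Slide 3: "20MW per Exaflop"; slide 9 widens the envelope to 20-30 MW
efficiency_20mw = gflops_per_watt(EXAFLOP, watts=20 * 10**6)
efficiency_30mw = gflops_per_watt(EXAFLOP, watts=30 * 10**6)

# Slide 5: "up to 50x more computational capability than today's 20 PF systems"
ratio = capability_ratio(EXAFLOP, 20 * PETAFLOP)

print(f"threads needed at 1 GHz: {threads:.3g}")
print(f"efficiency at 20 MW: {efficiency_20mw:.1f} GFLOPS/W")
print(f"efficiency at 30 MW: {efficiency_30mw:.1f} GFLOPS/W")
print(f"exaflop vs 20 PF: {ratio:.0f}x")
```

Raising `flops_per_thread_per_cycle` (for example, to model wide vector units) lowers the thread count proportionally, which is why the billion-thread figure is best read as an order-of-magnitude illustration.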