Secrets of Supercomputing: The Conservation Laws
Supercomputing Challenge Kickoff, October 21-23, 2007
  • I. Background to Supercomputing
  • II. Get Wet! With the Shallow Water Equations
Bob Robey, Los Alamos National Laboratory
Randy Roberts, Los Alamos National Laboratory
Cleve Moler, MathWorks
LA-UR-07-6793. Approved for public release; distribution is unlimited.
Introductions
  • Bob Robey, Los Alamos National Lab, X Division (665-9052; home: 662-2018). 3D hydrocodes and parallel numerical software. Helped found the UNM and Maui High Performance Computing Centers and the Supercomputing Tutorials.
  • Randy Roberts, Los Alamos National Lab, D Division. Java, C++, numerical and agent-based modeling.
  • Cleve Moler. Matlab founder, former UNM CS Dept chair, SIAM president. Author of “Numerical Computing with Matlab” and “Experiments with Matlab”.
Conservation Laws
  • Formulated in terms of a conserved quantity: mass, momentum, energy
  • Good references are Leveque’s books and his freely available software package CLAWPACK (Fortran/MPI), plus a 2D shallow water version, Tsunamiclaw
  • Leveque, Randall, Numerical Methods for Conservation Laws
  • Leveque, Randall, Finite Volume Methods for Hyperbolic Problems
  • CLAWPACK: http://www.amath.washington.edu/~claw/
  • Tsunamiclaw: http://www.math.utah.edu/~george/tsunamiclaw.html
  • [Equation graphic; labels: "Conserved variable", "Change"]
I. Intro to Supercomputing
  • Classical definition of supercomputing: harnessing lots of processors to do lots of small calculations
  • Many other definitions exist; they usually include any computing beyond the norm
  • Includes new techniques in modeling, visualization, and higher-level languages
  • Question for thought: with greater CPU resources, is it better to save programmer work or to make the computer solve bigger problems?
II. Calculus Quickstart: Decoding the Language of Wizards
Calculus Quickstart Goals
  • Calculus is the language of mathematical wizards. It is a convenient shorthand, but not easy to understand until you learn the secrets of the code.
  • Our goal is for you to be able to READ calculus and TALK calculus.
  • The goal is not to ANALYTICALLY SOLVE calculus problems using traditional methods. In supercomputing we generally solve problems by brute force.
Calculus Terminology
  • Two branches of calculus: integral calculus and derivative calculus
  • P = f(x, y, t): population is a function of x, y, and t
  • ∫ f(x)dx: definite integral, area under the curve, or summation
  • dP/dx: derivative, instantaneous rate of change, or slope of a function
  • ∂P/∂x: partial derivative, implying that P is a function of more than one variable
Matrix Notation
  • The first set of terms is the state variables at time t, usually called U.
  • The second set of terms is the flux variables in space x, usually referred to as F.
  • This is just a system of equations: a + c = 0 and b + d = 0
  • [Equation graphic; labels: U, F]
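The equation graphic for this slide did not survive, so the following is only a sketch of what the matrix shorthand most likely stands for. It assumes that a and b are the entries of U, that c and d are the entries of F, and that the slide's "a + c = 0" abbreviates "time change of a plus space change of c equals zero".

```latex
\frac{\partial}{\partial t}
\underbrace{\begin{bmatrix} a \\ b \end{bmatrix}}_{U}
+ \frac{\partial}{\partial x}
\underbrace{\begin{bmatrix} c \\ d \end{bmatrix}}_{F}
= 0
\quad\Longleftrightarrow\quad
\begin{cases}
\dfrac{\partial a}{\partial t} + \dfrac{\partial c}{\partial x} = 0 \\[2ex]
\dfrac{\partial b}{\partial t} + \dfrac{\partial d}{\partial x} = 0
\end{cases}
```

Read row by row, the matrix form is nothing more than the two scalar equations stacked on top of each other.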
Parallel Algorithms
  • Data Parallel: most common with MPI
  • Master/Worker: one process hands out the work to the other processes; great load balance, good with threads
  • Pipeline: bucket brigade
Implementation Patterns
  • Message Passing
  • Threads
  • Shared Memory
  • Distributed Arrays, Global Arrays
Reference: Patterns for Parallel Programming, Mattson, Sanders, and Massingill, 2005
Writing a Program: Data Parallel Model
  • Serial operations are done on every processor so that replicated data is the same on every processor. This may seem like a waste of work, but it is easier than synchronizing data values.
  • Sections of distributed data are “owned” by each processor. This is where the parallel speedups occur. Ghost cells around each processor’s data are often used to handle communication.
  • Example: P(400) is distributed, Ptot is replicated
      Proc 1: P(1-100), Ptot
      Proc 2: P(101-200), Ptot
      Proc 3: P(201-300), Ptot
      Proc 4: P(301-400), Ptot
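The ownership table above can be computed rather than hard-coded. The sketch below is one common way to do it, not necessarily how the sample programs do it; the names `cell_range` and `owned_range` are hypothetical, ranks and cell indices are 0-based (so Proc 1's P(1-100) becomes rank 0 owning cells 0 to 99), and leftover cells are handed to the lowest ranks.

```c
#include <stddef.h>

/* Half-open range [begin, end) of global cell indices owned by one process. */
typedef struct {
    size_t begin;
    size_t end;
} cell_range;

/* Split ncells as evenly as possible over nprocs processes.
 * The first (ncells % nprocs) ranks each receive one extra cell.
 * rank is 0-based and must be smaller than nprocs. */
cell_range owned_range(size_t ncells, size_t nprocs, size_t rank)
{
    size_t base = ncells / nprocs;   /* cells every rank gets */
    size_t extra = ncells % nprocs;  /* leftover cells to spread out */
    cell_range r;
    r.begin = rank * base + (rank < extra ? rank : extra);
    r.end = r.begin + base + (rank < extra ? 1 : 0);
    return r;
}
```

With 400 cells and 4 processes every rank receives exactly 100 cells, matching the table; with 10 cells and 4 processes the ranks receive 3, 3, 2, and 2 cells.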
2007-2008 Sample Supercomputing Project
Evaluation criteria for the Expo (the report criteria are slightly different). Use these to evaluate the following project.
  • 15% Problem Statement
  • 25% Mathematical/Algorithmic Model
  • 25% Computational Model
  • 15% Results and Conclusions
  • 10% Code
  • 10% Display
Evaluate Us!!
Get Wet! With the Shallow Water Equations
  • The shallow water model for wave motion is important for water flow, seashore waves, and flooding
  • The goal of this project is to model the wave motion in a shallow water tank
  • With slight modifications this model can be applied to ocean or lake currents, weather, and glacial movement
Output from a shallow water equation model of water in a bathtub. The water experiences 5 splashes, which generate surface gravity waves that propagate away from the splash locations and reflect off the bathtub walls. Source: Wikimedia Commons, author Dan Copsey. Go to the shallow water movie: http://en.wikipedia.org/wiki/Image:Shallow_water_waves.gif
Mathematical Equations
  • Mathematical model: conservation of mass, conservation of momentum, shallow water equations [equation graphics]
  • Notes: mass equals height because width, depth and density are all constant
  • h -> height; u -> velocity; g -> gravity
  • Note: force term, pressure P = ½gh²
  • Reference: Leveque, Randall, Finite Volume Methods for Hyperbolic Problems, p. 254
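The equations themselves were images and are missing from this transcript. As a reference, here is the standard conservation form of the 1D shallow water equations, written with the slide's symbols (h, u, g) and its pressure term P = ½gh². This is the textbook form and is assumed, not confirmed, to match the slide.

```latex
% Conservation of mass
\frac{\partial h}{\partial t} + \frac{\partial (hu)}{\partial x} = 0

% Conservation of momentum (pressure term P = \tfrac{1}{2} g h^2)
\frac{\partial (hu)}{\partial t}
+ \frac{\partial}{\partial x}\left( hu^2 + \tfrac{1}{2} g h^2 \right) = 0

% The same system in the U, F shorthand
U = \begin{bmatrix} h \\ hu \end{bmatrix}, \qquad
F = \begin{bmatrix} hu \\ hu^2 + \tfrac{1}{2} g h^2 \end{bmatrix}, \qquad
\frac{\partial U}{\partial t} + \frac{\partial F}{\partial x} = 0
```

Each row has the same shape: the time change of a conserved quantity plus the space change of its flux is zero.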
Shallow Water Equations: Matrix Notation
  • [Equation graphic]
  • The maximum time step is calculated so as to keep a wave from completely crossing a cell.
Numerical Model
  • Lax-Wendroff two-step, a predictor-corrector method
  • The predictor step estimates the values at the zone boundaries, advanced half a time step in time
  • The corrector step fluxes the variables using the predictor step values
Mathematical notes for the next slide:
  • U is a state variable such as mass or height
  • F is a flux term: the velocity times the state variable at the interface
  • Superscripts are time; subscripts are space
The Lax-Wendroff Method
  • Half step [equation graphic]
  • Whole step [equation graphic]
Explanation graphic courtesy of Jon Robey and Dov Shlacter, 2006-2007 Supercomputing Challenge
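The half-step and whole-step formulas were images and are not in this transcript. The standard two-step (Richtmyer) form of Lax-Wendroff is given below as a reference, using the notation from the notes: superscripts are time levels, subscripts are space indices. It is assumed, not confirmed, to match the slide.

```latex
% Half step (predictor): values at the zone boundary i+1/2, half a time step ahead
U_{i+1/2}^{\,n+1/2} = \tfrac{1}{2}\left( U_{i+1}^{\,n} + U_{i}^{\,n} \right)
  - \frac{\Delta t}{2\,\Delta x}\left( F_{i+1}^{\,n} - F_{i}^{\,n} \right)

% Whole step (corrector): flux the cell values using the half-step fluxes
U_{i}^{\,n+1} = U_{i}^{\,n}
  - \frac{\Delta t}{\Delta x}\left( F_{i+1/2}^{\,n+1/2} - F_{i-1/2}^{\,n+1/2} \right)
```

Here the half-step flux F at i+1/2 is evaluated from the half-step state U at i+1/2.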
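Explanation of Lax-Wendroff Model
[Figure: physical model shown at three stages (original, half-step, full step). Space index on the horizontal axis; values are labeled at time t and index i, time t+.5 and index i+.5, and time t+1 and index i. A ghost cell is shown at the boundary. Data are assumed to be at the center of each cell.]
Explanation graphic courtesy of Jon Robey and Dov Shlacter, 2006-2007 Supercomputing Challenge. See the appendix for the 2D index explanation.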
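A minimal 1D sketch of the two steps is shown here to make the half-step indexing concrete. It is not the sample program: the function names, the value of g, and the array layout (one ghost cell at each end, half-step value i stored for the interface between cells i and i+1) are all assumptions chosen for illustration.

```c
#include <stddef.h>

#define GRAVITY 9.8 /* assumed value of g */

/* Momentum flux hu*u + (1/2) g h^2 of the shallow water equations. */
static double momentum_flux(double h, double hu)
{
    return hu * hu / h + 0.5 * GRAVITY * h * h;
}

/* One Lax-Wendroff two-step update in 1D.
 * h[] and hu[] hold n cells (n >= 3), with ghost cells at 0 and n-1.
 * h_half[] and hu_half[] are scratch arrays of n-1 entries; entry i
 * is the half-step value at the interface between cells i and i+1. */
void lax_wendroff_step(double *h, double *hu,
                       double *h_half, double *hu_half,
                       size_t n, double dx, double dt)
{
    /* Reflective boundaries: copy height, negate momentum in the ghosts. */
    h[0] = h[1];
    hu[0] = -hu[1];
    h[n - 1] = h[n - 2];
    hu[n - 1] = -hu[n - 2];

    /* Predictor: interface values half a time step ahead. */
    for (size_t i = 0; i + 1 < n; i++) {
        h_half[i] = 0.5 * (h[i] + h[i + 1])
                  - 0.5 * dt / dx * (hu[i + 1] - hu[i]);
        hu_half[i] = 0.5 * (hu[i] + hu[i + 1])
                   - 0.5 * dt / dx * (momentum_flux(h[i + 1], hu[i + 1])
                                      - momentum_flux(h[i], hu[i]));
    }

    /* Corrector: update interior cells from the half-step fluxes. */
    for (size_t i = 1; i + 1 < n; i++) {
        h[i] -= dt / dx * (hu_half[i] - hu_half[i - 1]);
        hu[i] -= dt / dx * (momentum_flux(h_half[i], hu_half[i])
                            - momentum_flux(h_half[i - 1], hu_half[i - 1]));
    }
}
```

Because the half-step arrays are offset by half a cell, cell i uses `h_half[i - 1]` on its left and `h_half[i]` on its right. Still water stays still under this update, and the reflective ghost cells keep the total interior mass constant.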
Extension to 2D
  • The extension of the shallow water equations to 2D is shown in the following slides
  • The first slide shows the matrix form of the 2D shallow water equations
  • The second slide shows the 2D form of the Lax-Wendroff numerical method
2D Shallow Water Equations
  • Note the addition of fluxes in the y direction and a flux cross term in the momentum equation
  • U, F, and G are shorthand for the numerical equations on the next slide
  • The U terms are the state variables; F and G are the flux terms in x and y
  • [Equation graphic; labels: U, F, G]
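The matrix graphic is missing from this transcript. The standard 2D conservation form is given below for reference, with v as the y velocity; it shows the y-direction fluxes and the cross term huv that the slide mentions, and is assumed to match the slide.

```latex
\frac{\partial U}{\partial t} + \frac{\partial F}{\partial x}
  + \frac{\partial G}{\partial y} = 0,
\qquad
U = \begin{bmatrix} h \\ hu \\ hv \end{bmatrix},\quad
F = \begin{bmatrix} hu \\ hu^2 + \tfrac{1}{2} g h^2 \\ huv \end{bmatrix},\quad
G = \begin{bmatrix} hv \\ huv \\ hv^2 + \tfrac{1}{2} g h^2 \end{bmatrix}
```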
The Lax-Wendroff Method (2D)
  • Half step [equation graphic]
  • Whole step [equation graphic]
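The 2D formulas were also images. A common 2D form of the two-step method is sketched below: one half step in x, one half step in y, then a whole step that uses both. The subscript convention (i in x, j in y) is an assumption and may differ from the slide.

```latex
% Half step in x
U_{i+1/2,\,j}^{\,n+1/2} = \tfrac{1}{2}\left( U_{i+1,j}^{\,n} + U_{i,j}^{\,n} \right)
  - \frac{\Delta t}{2\,\Delta x}\left( F_{i+1,j}^{\,n} - F_{i,j}^{\,n} \right)

% Half step in y
U_{i,\,j+1/2}^{\,n+1/2} = \tfrac{1}{2}\left( U_{i,j+1}^{\,n} + U_{i,j}^{\,n} \right)
  - \frac{\Delta t}{2\,\Delta y}\left( G_{i,j+1}^{\,n} - G_{i,j}^{\,n} \right)

% Whole step
U_{i,j}^{\,n+1} = U_{i,j}^{\,n}
  - \frac{\Delta t}{\Delta x}\left( F_{i+1/2,\,j}^{\,n+1/2} - F_{i-1/2,\,j}^{\,n+1/2} \right)
  - \frac{\Delta t}{\Delta y}\left( G_{i,\,j+1/2}^{\,n+1/2} - G_{i,\,j-1/2}^{\,n+1/2} \right)
```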
2D Shallow Water Equations Transformed for Programming
Letting H = h, U = hu and V = hv, so that our main variables are the state variables in the first column, gives the following set of equations. [Equation graphic]
  • H is height (same as mass for constant width, depth and density)
  • U is x momentum (x velocity times mass)
  • V is y momentum (y velocity times mass)
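Substituting H = h, U = hu, and V = hv into the standard 2D shallow water equations gives the set below. This is a reconstruction from the stated substitution, not a copy of the missing graphic. Note that U here means x momentum, not the state vector of the earlier slides.

```latex
\frac{\partial H}{\partial t} + \frac{\partial U}{\partial x}
  + \frac{\partial V}{\partial y} = 0

\frac{\partial U}{\partial t}
  + \frac{\partial}{\partial x}\left( \frac{U^2}{H} + \tfrac{1}{2} g H^2 \right)
  + \frac{\partial}{\partial y}\left( \frac{UV}{H} \right) = 0

\frac{\partial V}{\partial t}
  + \frac{\partial}{\partial x}\left( \frac{UV}{H} \right)
  + \frac{\partial}{\partial y}\left( \frac{V^2}{H} + \tfrac{1}{2} g H^2 \right) = 0
```

Written this way, every term is built from the three arrays a program stores (H, U, V), so no separate velocity arrays are needed.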
Sample Programs
  • The numerical method was extracted from the McCurdy team’s model (team 62) from last year and reprogrammed from serial Fortran to C/MPI, using the programming style from one of the Los Alamos teams’ projects (team 51), with permission from both teams.
  • Additional versions of the program were made in Java/Threads and Matlab.
Programming Tools: Three Options
  • Matlab: computation and graphics integrated into the Matlab desktop
  • Java/Threads: Eclipse or Netbeans workbench; graphics via Java 2D and Java Free Chart
  • C/MPI:
      Eclipse workbench: an open-source programmer's workbench, http://www.eclipse.org
      PTP (Parallel Tools Plug-in): adds MPI support to Eclipse (developed partly at LANL)
      OpenMPI: an MPI implementation (developed partly at LANL)
      MPE: graphics calls that come with MPICH. Graphics calls are done in parallel from each processor!
Initial Conditions and Boundary Conditions
Initial conditions
  • Velocity (u and v) is 0 throughout the mesh
  • Height is 2, with a ramp up to a height of 10 at the right-hand boundary, starting at the mid-point in the x dimension
Boundary conditions are reflective, slip
  • x boundaries: h_bound = h_interior; u_xbound = 0; v_xbound = v_interior
  • y boundaries: h_bound = h_interior; u_ybound = u_interior; v_ybound = 0
  • If using ghost cells, force zero velocity at the boundary by setting U_xghost = -U_interior
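The conditions above can be sketched as follows. This is an illustration rather than the sample program's code: the function names, the row-major layout with one ghost layer on every side, the linear shape of the ramp, and the requirement that nx be at least 4 are all assumptions.

```c
#include <stddef.h>

/* Row-major index into an ny-by-nx array that includes the ghost layer. */
#define IDX(j, i, nx) ((j) * (nx) + (i))

/* Height 2 on the left half, then a linear ramp to 10 at the right-hand
 * boundary; zero momentum everywhere. Assumes nx >= 4. */
void set_initial_conditions(double *H, double *U, double *V,
                            size_t ny, size_t nx)
{
    size_t last = nx - 2;        /* last interior column */
    size_t mid = (1 + last) / 2; /* mid-point of the interior columns */
    for (size_t j = 0; j < ny; j++) {
        for (size_t i = 0; i < nx; i++) {
            double height = 2.0;
            if (i >= last) {
                height = 10.0;
            } else if (i > mid) {
                height = 2.0 + 8.0 * (double)(i - mid) / (double)(last - mid);
            }
            H[IDX(j, i, nx)] = height;
            U[IDX(j, i, nx)] = 0.0;
            V[IDX(j, i, nx)] = 0.0;
        }
    }
}

/* Reflective, slip boundaries through the ghost cells. */
void apply_reflective_boundaries(double *H, double *U, double *V,
                                 size_t ny, size_t nx)
{
    for (size_t j = 0; j < ny; j++) { /* left and right walls */
        H[IDX(j, 0, nx)] = H[IDX(j, 1, nx)];
        U[IDX(j, 0, nx)] = -U[IDX(j, 1, nx)]; /* zero x velocity at wall */
        V[IDX(j, 0, nx)] = V[IDX(j, 1, nx)];
        H[IDX(j, nx - 1, nx)] = H[IDX(j, nx - 2, nx)];
        U[IDX(j, nx - 1, nx)] = -U[IDX(j, nx - 2, nx)];
        V[IDX(j, nx - 1, nx)] = V[IDX(j, nx - 2, nx)];
    }
    for (size_t i = 0; i < nx; i++) { /* bottom and top walls */
        H[IDX(0, i, nx)] = H[IDX(1, i, nx)];
        U[IDX(0, i, nx)] = U[IDX(1, i, nx)];
        V[IDX(0, i, nx)] = -V[IDX(1, i, nx)]; /* zero y velocity at wall */
        H[IDX(ny - 1, i, nx)] = H[IDX(ny - 2, i, nx)];
        U[IDX(ny - 1, i, nx)] = U[IDX(ny - 2, i, nx)];
        V[IDX(ny - 1, i, nx)] = -V[IDX(ny - 2, i, nx)];
    }
}
```

Negating the momentum in the ghost cell makes the average across the wall zero, which is how the ghost-cell version enforces zero velocity at the boundary.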
Results/Conclusions
  • The Lax-Wendroff model accurately models the experimental wave tank: it matches the wave speed across the tank
  • Some of the oscillations in the simulation are an artifact of the numerical model
  • This is OK as long as the initial wave is not too steep
  • A numerical damping technique could be added but is beyond the scope of this effort
Acknowledgements
Work used by permission:
  • Awash: Modeling Wave Movement in a Ripple Tank, Team 62, McCurdy High School, 2006-2007 Supercomputing Challenge
  • A Lot of Hot Air: Modeling Compressible Fluid Dynamics, Team 51, Los Alamos High School, 2006-2007 Supercomputing Challenge
We all have bugs, and thanks to those who found mine:
  • Randy Roberts and Jon Robey for finding and fixing a bug in the second pass
  • Randy Leveque for finding a missing square in the gravity forcing term
Lab Exercises
  • TsunamiClaw
  • Matlab
  • Experimental demonstration
  • Java Serial
  • Java Parallel
  • C/MPI
Java Wave Structure
  • The Wave class does most of the work
  • main(String[] args) calls start()
  • start() creates a WaveProblemSetup
  • start() calls methods to do initialization and boundary conditions
  • start() calls methods to iterate and update the display
Java Wave Structure (continued)
  • WaveProblemSetup stores the new and old arrays
  • It swaps the new and old arrays when asked to by Wave
Java Wave Program Flow
  • Create arrays for new, old, and temporary data
  • Initialize data
  • Set boundary data to represent correct boundary conditions
  • Iterate for the given number of iterations
Java Wave Iteration Flow
  • Update physics into new arrays from data in old arrays
  • Set boundary data to represent correct boundary conditions with updated arrays
  • Update display
  • Swap new arrays with old arrays
Java Threads
  • How do you take advantage of new multi-core processors?
  • Run parts of the problem on different cores at the same time!
Java Threads (continued)
The WaveThreaded program:
  • partitions the problem into domains using SubWaveProblemSetup objects
  • runs calculations on each domain in separate threads using WaveWorker objects
  • adds complexity, because the threads' access to data must be synchronized
C/MPI Program Diagram
[Flow chart, reconstructed from its box labels]
  • Allocate memory
  • Set Initial Conditions
  • Initial Display
  • Repeat:
      Update Boundary Cells (MPI Communication, External Boundaries)
      First Pass (x half step, y half step)
      Second Pass
      Swap new/old
      Graphics Output
      Conservation Check
  • Calculate Runtime
  • Close Display, MPI & exit
MPI Quick Start
    #include <mpi.h>
    MPI_Init(&argc, &argv)
    MPI_Comm_size(Comm, &nprocs)   // get number of processors
    MPI_Comm_rank(Comm, &myrank)   // get processor rank, 0 to nprocs-1
    // Broadcast from source processor to all processors
    MPI_Bcast(buffer, count, MPI_type, source, Comm)
    // Used to update ghost cells
    MPI_Isend(buffer, count, MPI_type, dest, tag, Comm, req)
    MPI_Irecv(buffer, count, MPI_type, source, tag, Comm, req+1)
    MPI_Waitall(num, req, status)
    // Used for sum, max, and min such as total mass or minimum timestep
    MPI_Allreduce(&num_local, &num_global, count, MPI_type, MPI_op, Comm)
    MPI_Finalize()
Web pages for MPI and MPE at Argonne National Lab (ANL): http://www-unix.mcs.anl.gov/mpi/www/
Setup
  • The software is already set up on the lab computers
  • Setup on home computers has two parts, described on the next slide
  • First download the files for the C/MPI lab from the Supercomputing Challenge website, if you haven’t already done so
  • Untar the lab files with “tar -xzvf Wave_Lab.tgz”
Setting up Software
Instructions are in the README file.
Setting up system software
  • Need Java, OpenMPI, and the MPE package from MPICH
  • Download and install according to the instructions in openmpi_setup.sh
  • Can be installed in the user’s directory with some modifications
Setting up the user’s workspace
  • Download the Eclipse software, including Eclipse, PTP and PLDT
  • Install according to the instructions in eclipse_setup.sh
  • Import the wave source files and set up Eclipse according to the instructions in eclipse_setup.sh
Lab Exercises
Try modifying the sample program (Java and/or C versions)
  • Change the initial distribution. How sharp can it be before it goes unstable?
  • Change the number of cells
  • Change the graphics output
  • Try running 1, 2, or 4 processes and time the runs. Note that you can run 4 processes even if you are on a one-processor system.
  • Switch to the PTP debug or Java debug perspective and try stepping through the program
Comparing to data is critical
  • Are there other unrealistic behaviors of the model?
  • Design an experiment to isolate variable effects. This can greatly improve your model.
Appendix A. Calculus and Supercomputing
Calculus and supercomputing are intertwined. Why?
  • Here is a simple problem: add up the volume of earth above sea level for an island 500 ft high by half a mile wide and twenty miles long
  • A typical science homework problem using simple algebra. Can be done by hand.
  • Not appropriate for supercomputing. Not enough complexity.
Add Complexity
  • The island profile is a jagged mountainous terrain cut by deep canyons. How do we add up the volume?
  • Calculus: the language of complexity
  • Addition: summing numbers
  • Multiplication: summing numbers with a constant magnitude
  • Integration: summing numbers with an irregular magnitude
Divide and Conquer: In Discrete Form
  • Divide the island into small pieces and sum up the volume of each piece
  • For a jagged profile, the sum approaches the solution as the size of the intervals grows smaller
  • ∑: summation symbol
  • ∆: delta symbol, or x₂ - x₁
Divide and Conquer: In Continuous Form (Integration)
  • Think of the integral symbol as describing a shape that is continuously varying
  • The accuracy of the solution can be improved by summing over smaller increments
  • Lots of arithmetic operations: now you have a “computing” problem. Add more work and you have a “supercomputing” problem.
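The divide-and-conquer sum can be written in a few lines. The sketch below uses the midpoint rule, which is one of several ways to pick the height of each piece; the function names and the two sample profiles (a flat island and a single triangular ridge) are hypothetical. Dimensions are in feet: 500 ft high, 2640 ft wide, 105600 ft long.

```c
#include <stddef.h>

/* A height profile: height in feet at distance x feet along the island. */
typedef double (*profile_fn)(double x);

/* Sum of (height at the middle of each piece) * (piece length) * width. */
double island_volume(profile_fn height, double length, double width,
                     size_t pieces)
{
    double dx = length / (double)pieces; /* the slide's delta x */
    double sum = 0.0;
    for (size_t k = 0; k < pieces; k++) {
        double x_mid = ((double)k + 0.5) * dx;
        sum += height(x_mid) * dx;
    }
    return sum * width;
}

/* The simple homework island: 500 ft high everywhere. */
double flat_profile(double x)
{
    (void)x; /* height does not depend on position */
    return 500.0;
}

/* A made-up ridge: rises linearly to 500 ft at mid-length, then falls. */
double ridge_profile(double x)
{
    const double half = 0.5 * 105600.0;
    return x < half ? 500.0 * x / half
                    : 500.0 * (105600.0 - x) / half;
}
```

For the flat island the sum reproduces the hand calculation, 500 × 2640 × 105600 cubic feet, no matter how many pieces are used. For a jagged profile the answer depends on the number of pieces, and that is where the arithmetic starts to add up.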
Derivative Calculus: Describing Change
  • Derivatives describe the change in one variable (the numerator, or top variable) relative to another variable (the denominator, or bottom variable)
  • The three derivatives ∂P/∂t, ∂P/∂x, and ∂P/∂y describe the change in population versus time, x-direction, and y-direction
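On a computer these derivatives are estimated by brute force: evaluate P a small step to either side and divide the difference by the distance. The sketch below uses central differences; the function names and the sample population field are made up for illustration.

```c
/* A population field P(x, y, t). */
typedef double (*field_fn)(double x, double y, double t);

/* Change in P per unit x, holding y and t fixed. */
double dP_dx(field_fn P, double x, double y, double t, double dx)
{
    return (P(x + dx, y, t) - P(x - dx, y, t)) / (2.0 * dx);
}

/* Change in P per unit y, holding x and t fixed. */
double dP_dy(field_fn P, double x, double y, double t, double dy)
{
    return (P(x, y + dy, t) - P(x, y - dy, t)) / (2.0 * dy);
}

/* Change in P per unit time, holding x and y fixed. */
double dP_dt(field_fn P, double x, double y, double t, double dt)
{
    return (P(x, y, t + dt) - P(x, y, t - dt)) / (2.0 * dt);
}

/* Hypothetical population field: P = x^2 + 3y + 2t. */
double sample_population(double x, double y, double t)
{
    return x * x + 3.0 * y + 2.0 * t;
}
```

For the sample field at x = 3 the estimate of ∂P/∂x is 6, matching the slope 2x from calculus, while ∂P/∂y and ∂P/∂t come out as the constants 3 and 2. Each partial derivative moves one variable and leaves the other two alone.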
Appendix B. Computational Methods
  • Eulerian and Lagrangian
  • Explicit and Implicit
Two Main Approaches to Divide up the Problem
Eulerian: divide up by spatial coordinates
  • Track populations in a location
  • Observer frame of reference
Lagrangian: divide up by objects
  • Object frame of reference
  • Easier to track attributes of the population since they travel with the objects
  • The agent-based modeling of StarLogo uses this approach
  • The mesh can tangle in 2 and 3 dimensions
Eulerian versus Lagrangian
  • Eulerian: the area stays fixed and has a population per area. We observe the change in population across the boundaries of the area.
  • Lagrangian: the population stays constant. The population moves with velocity vx and vy, and we move with it. The size of the area will change if the four vertices of the rectangle move at different velocities. Changes in area will result in different densities.
Explicit versus Implicit
  • Explicit: in mathematical shorthand, U^(n+1) = f(U^n). The next timestep values can be expressed entirely in terms of the previous timestep values.
  • Implicit: U^(n+1) = f(U^(n+1), U^n). The next timestep values appear on both sides, so they must be solved for, often with a matrix or iterative solver.
  • We will stick with explicit methods here. You need more math to attempt implicit methods.
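A tiny example can show the difference. The model problem below, dU/dt = -kU (a quantity that decays at rate k), is an assumption chosen for simplicity and is not from the slides; the function names are hypothetical. The implicit step uses a plain fixed-point iteration, which converges only when k·dt is less than 1.

```c
/* Explicit: the new value is computed directly from the old one,
 * U_new = U_old - dt * k * U_old. */
double explicit_step(double u_old, double k, double dt)
{
    return u_old - dt * k * u_old;
}

/* Implicit: the new value appears on both sides,
 * U_new = U_old - dt * k * U_new, so we iterate until it stops changing.
 * Fixed-point iteration; assumes k * dt < 1 for convergence. */
double implicit_step(double u_old, double k, double dt,
                     int max_iter, double tol)
{
    double u_new = u_old; /* initial guess */
    for (int m = 0; m < max_iter; m++) {
        double next = u_old - dt * k * u_new;
        double change = next - u_new;
        u_new = next;
        if (change < tol && change > -tol) {
            break; /* converged */
        }
    }
    return u_new;
}
```

With U = 1, k = 0.5, and dt = 0.5, the explicit step gives 0.75 in one evaluation, while the implicit step iterates toward 0.8. The extra work per step is the price of an implicit method, which is why the sample programs stay explicit.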
Appendix C. Index Explanation for 2D Lax-Wendroff
Programming
  • The most difficult part of programming this method is keeping track of indices. Half-step grid indices cannot be represented by ½ in the code, so they have to be offset one way or the other.
  • Errors are very difficult to find, so it is important to be very methodical in the coding.
  • The next two slides show the different sizes of the staggered half-step grids and the relationships between the indices in the calculation (courtesy Jon Robey).
[Figure: 1st Pass. The X step grid and the Y step grid are each drawn over the main grid, with axes i and j numbered 0 to 4. Index relationships as labeled in the graphic: 0,0 -- 1,0 | 1,1 (in general j,i -- j+1,i | j+1,i+1) and 0,0 -- 0,1 | 1,1 (in general j,i -- j,i+1 | j+1,i+1).]
[Figure: 2nd Pass. The main grid is drawn over the X step grid and over the Y step grid, with axes i and j numbered 0 to 4. Index relationships as labeled in the graphic: 1,1 -- 0,0 | 1,0 (in general j,i -- j-1,i-1 | j,i-1) and 1,1 -- 0,0 | 0,1 (in general j,i -- j-1,i-1 | j-1,i).]

Secrets of supercomputing

  • 1. Secrets of Supercomputing The Conservation Laws Supercomputing Challenge Kickoff October 21-23, 2007 I. Background to Supercomputing II. Get Wet! With the Shallow Water Equations Bob Robey - Los Alamos National Laboratory Randy Roberts – Los Alamos National Laboratory Cleve Moler -- Mathworks LA-UR-07-6793 Approved for public release; distribution is unlimited
  • 2. Introductions Bob Robey -- Los Alamos National Lab, X division [email_address] , 665-9052 or home: [email_address] , 662-2018 3D Hydrocodes and parallel numerical software Helped found UNM and Maui High Performance Computing Centers and Supercomputing Tutorials Randy Roberts -- Los Alamos National Lab, D Division Java, C++, Numerical and Agent Based Modeling [email_address] Cleve Moler Matlab Founder Former UNM CS Dept Chair SIAM President Author of “Numerical Computing with Matlab” and “Experiments with Matlab”
  • 3. Conservation Laws Formulated as a conserved quantity mass momentum energy Good reference is Leveque’s book and his freely available software package CLAWPACK (Fortran/MPI) and a 2D shallow water version Tsunamiclaw Leveque, Randall, Numerical Methods for Conservation Laws Leveque, Randall, Finite Volume Methods for Hyperbolic Problems CLAWPACK http://www.amath.washington.edu/~claw/ Tsunamiclaw http://www.math.utah.edu/~george/tsunamiclaw.html Conserved variable Change
  • 4. I. Intro to Supercomputing Classical Definition of Supercomputing Harnessing lots of processors to do lots of small calculations There are many other definitions which usually include any computing beyond the norm Includes new techniques in modeling, visualization, and higher level languages. Question for thought: With greater CPU resources is it better to save programmer work or to make the computer do bigger problems?
  • 5. II. Calculus Quickstart Decoding the Language of Wizards
  • 6. Calculus Quickstart Goals Calculus is a language of mathematical wizards. It is convenient shorthand, but not easy to understand until you learn the secrets to the code. Our goal is for you to be able to READ calculus and TALK calculus. Goal is not to ANALYTICALLY SOLVE calculus using traditional methods. In supercomputing we generally solve problems by brute force.
  • 7. Calculus Terminology Two branches of Calculus Integral Calculus Derivative Calculus P = f(x, y, t) Population is a function of x, y, and t ∫ f(x)dx – definite integral, area under the curve, or summation dP/dx – derivative, instantaneous rate of change, or slope of a function ∂ P/∂x – partial derivative implying that P is a function of more than one variable
  • 8. Matrix Notation The first set of terms are state variables at time t and usually called U. The second set of terms are the flux variables in space x and usually referred to as F. This is just a system of equations a + c = 0 b + d = 0 U F
  • 9. Parallel Algorithms Data Parallel -– most common with MPI Master/Worker – one process hands out the work to the other processes – great load balance, good with threads Pipeline – bucket brigade Implementation Patterns Message Passing Threads Shared Memory Distributed Arrays, Global Arrays Patterns for Parallel Programming Patterns for Parallel Programming, Mattson, Sanders, and Massingill, 2005
  • 10. Writing a Program Data Parallel Model Serial operations are done on every processor so that replicated data is the same on every processor. This may seem like a waste of work, but it is easier than synchronizing data values. Sections of distributed data are “owned” by each processor. This is where the parallel speedups occur. Often ghost cells around each processor’s data is a way to handle communication. P(400) – distributed Ptot -- replicated Proc 1 P(1-100) Ptot Proc 2 P(101-200) Ptot Proc 3 P(201–300) Ptot Proc 4 P(301-400) Ptot
  • 11. 2007-2008 Sample Supercomputing Project Evaluation Criteria – Expo (Report slightly different). Use these to evaluate the following project. 15% Problem Statement 25% Mathematical/Algorithmic Model 25% Computational Model 15% Results and Conclusions 10% Code 10% Display Evaluate Us!!
  • 12. Get Wet! With the Shallow Water Equations The shallow water model for wave motion is important for water flow, seashore waves, and flooding Goal of this project is to model the wave motion in the shallow water tank With slight modifications this model can be applied to: ocean or lake currents weather glacial movement
  • 13. Output from a shallow water equation model of water in a bathtub. The water experiences 5 splashes which generate surface gravity waves that propagate away from the splash locations and reflect off of the bathtub walls. Wikipedia commons, Author Dan Copsey Go to shallow water movie. http:// en.wikipedia.org/wiki/Image:Shallow_water_waves.gif
  • 14. Mathematical Equations Mathematical Model Conservation of Mass Conservation of Momentum Shallow Water Equations Notes: mass equals height because width, depth and density are all constant h -> height u -> velocity g -> gravity References: Leveque, Randall, Finite Volume Methods for Hyperbolic Problems, p. 254 Note: Force term, Pressure P=½gh 2
  • 15. Shallow Water Equations Matrix Notation The maximum time step is calculated so as to keep a wave from completely crossing a cell.
  • 16. Numerical Model Lax-Wendroff two-step, a predictor-corrector method Predictor step estimates the values at the zone boundaries at half a time step advanced in time Corrector step fluxes the variables using the predictor step values Mathematical Notes for next slide: U is a state variable such as mass or height. F is a flux term – the velocity times the state variable at the interface superscripts are time subscripts are space
  • 17. The Lax-Wendroff Method Half Step Whole Step Explanation graphic courtesy of Jon Robey and Dov Shlacter, 2006-2007 Supercomputing Challenge
  • 18. Explanation of Lax-Wendroff Model Physical model Original Half-step Full step t i t+1 i t+.5 i+.5 Explanation graphic courtesy of Jon Robey and Dov Shlacter, 2006-2007 Supercomputing Challenge. See appendix for 2D index explanation. Ghost cell Data assumed to be at the center of cell. Space index
  • 19. Extension to 2D The extension of the shallow water equations to 2D is shown in the following slides. First slide shows the matrix form of the 2D shallow water equations Second slide shows the 2D form of the Lax-Wendroff numerical method
  • 20. 2D Shallow Water Equations Note the addition of fluxes in the y direction and a flux cross term in the momentum equation. The U, F, and G are shorthand for the numerical equations on the next slide. The U terms are the state variables. F and G are the flux terms in x and y. U F G
  • 21. The Lax-Wendroff Method Half Step Whole Step
  • 22. 2D Shallow Water Equations Transformed for Programming Letting H = h, U = hu and V = hv, so that our main variables are the state variables in the first column, gives the following set of equations. H is height (same as mass for constant width, depth and density). U is x momentum (x velocity times mass). V is y momentum (y velocity times mass).
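With that substitution the 2D system can be written out as below. This is a sketch obtained by substituting H, U and V into the standard conservative 2D shallow water equations, so it should be checked against the original slide; the UV/H terms are the flux cross terms mentioned earlier.

```latex
\begin{aligned}
\frac{\partial H}{\partial t} + \frac{\partial U}{\partial x}
  + \frac{\partial V}{\partial y} &= 0 \\
\frac{\partial U}{\partial t}
  + \frac{\partial}{\partial x}\left(\frac{U^2}{H} + \tfrac{1}{2} g H^2\right)
  + \frac{\partial}{\partial y}\left(\frac{UV}{H}\right) &= 0 \\
\frac{\partial V}{\partial t}
  + \frac{\partial}{\partial x}\left(\frac{UV}{H}\right)
  + \frac{\partial}{\partial y}\left(\frac{V^2}{H} + \tfrac{1}{2} g H^2\right) &= 0
\end{aligned}
```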
  • 23. Sample Programs The numerical method was extracted from the McCurdy team's model (team 62) from last year and reprogrammed from serial Fortran to C/MPI using the programming style from one of the Los Alamos teams' projects (team 51), with permission from both teams. Additional versions of the program were made in Java/Threads and Matlab.
  • 24. Programming Tools Three options: (1) Matlab: computation and graphics integrated into the Matlab desktop. (2) Java/Threads: Eclipse or Netbeans workbench; graphics via Java 2D and Java Free Chart. (3) C/MPI: Eclipse workbench, an open-source programmer's workbench (http://www.eclipse.org); PTP (parallel tools plug-in), which adds MPI support to Eclipse (developed partly at LANL); OpenMPI, an MPI implementation (developed partly at LANL); MPE, graphics calls that come with MPICH. Graphics calls are done in parallel from each processor!
  • 25. Initial Conditions and Boundary Conditions Initial conditions: velocity (u and v) is 0 throughout the mesh; height is 2, with a ramp to a height of 10 at the right-hand boundary starting at the midpoint in the x dimension. Boundary conditions are reflective, slip. At x boundaries: h_bound = h_interior; u_xbound = 0; v_xbound = v_interior. At y boundaries: h_bound = h_interior; u_ybound = u_interior; v_ybound = 0. If using ghost cells, force zero velocity at the boundary by setting U_xghost = -U_interior.
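The conditions above might look like this in C for one row of cells along x. Both function names are hypothetical, and the linear shape of the ramp is an assumption (the slide only says "ramp"). The boundary routine assumes n interior cells at indices 1..n with ghost cells at 0 and n+1.

```c
/* Initial height profile along x for nx cells (requires nx >= 3).
 * The left half is flat at 2; from the midpoint the height ramps up to 10
 * at the last cell. The linear ramp is an assumption of this sketch. */
void init_height(double *h, int nx)
{
    int mid = nx / 2;
    for (int i = 0; i < nx; i++) {
        if (i < mid) {
            h[i] = 2.0;
        } else {
            h[i] = 2.0 + 8.0 * (double)(i - mid) / (double)(nx - 1 - mid);
        }
    }
}

/* Reflective, slip boundary in x using ghost cells at indices 0 and n+1.
 * H and V copy the neighboring interior value; U flips sign so that the
 * average across the wall, and hence the velocity at the wall, is zero. */
void reflect_x(double *H, double *U, double *V, int n)
{
    H[0] = H[1];
    H[n + 1] = H[n];
    U[0] = -U[1];
    U[n + 1] = -U[n];
    V[0] = V[1];
    V[n + 1] = V[n];
}
```

The y boundaries would be handled the same way with the roles of U and V exchanged.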
  • 26. Results/Conclusions The Lax-Wendroff model accurately models the experimental wave tank: it matches the wave speed across the tank. Some of the oscillations in the simulation are an artifact of the numerical model. This is OK as long as the initial wave is not too steep. A numerical damping technique could be added but is beyond the scope of this effort.
  • 27. Acknowledgements Work used by permission: Awash: Modeling Wave Movement in a Ripple Tank, Team 62, McCurdy High School, 2006-2007 Supercomputing Challenge; A Lot of Hot Air: Modeling Compressible Fluid Dynamics, Team 51, Los Alamos High School, 2006-2007 Supercomputing Challenge. We all have bugs, and thanks to those who found mine: Randy Roberts and Jon Robey for finding and fixing a bug in the second pass; Randy Leveque for finding a missing square in the gravity forcing term.
  • 28. Lab Exercises TsunamiClaw Matlab Experimental demonstration Java Serial Java Parallel C/MPI
  • 29. Java Wave Structure The Wave class does most of the work: main(String[] args) calls start(); start() creates a WaveProblemSetup; start() calls methods to do initialization and boundary conditions; start() calls methods to iterate and update the display.
  • 30. Java Wave Structure (continued) WaveProblemSetup stores the new and old arrays and swaps them when asked to by Wave.
  • 31. Java Wave Program Flow Create arrays for new, old, and temporary data. Initialize data. Set boundary data to represent correct boundary conditions. Iterate for the given number of iterations.
  • 32. Java Wave Iteration Flow Update physics into new arrays from data in old arrays. Set boundary data to represent correct boundary conditions with updated arrays. Update display. Swap new arrays with old arrays.
  • 33. Java Threads How do you take advantage of new Multi-Core processors? Run parts of the problem on different cores at the same time!
  • 34. Java Threads (continued) The WaveThreaded program partitions the problem into domains using SubWaveProblemSetup objects, runs calculations on each domain in separate threads using WaveWorker objects, and adds complexity with synchronization of the threads' access to data.
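The partitioning idea is the same whether the workers are Java threads or MPI processes. A minimal C sketch of splitting a row of cells among workers is shown below; the function `partition` is hypothetical and is not how SubWaveProblemSetup is actually implemented.

```c
/* Compute the half-open range [*start, *end) of cells owned by one worker.
 * Cells are dealt out as evenly as possible; when the division is not exact,
 * the first (ncells % nworkers) workers each take one extra cell. */
void partition(int ncells, int nworkers, int worker, int *start, int *end)
{
    int base = ncells / nworkers;
    int extra = ncells % nworkers;

    *start = worker * base + (worker < extra ? worker : extra);
    *end = *start + base + (worker < extra ? 1 : 0);
}
```

Each worker then updates only its own range, and neighboring workers must agree on the values along their shared edges, which is where the synchronization cost comes from.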
  • 35. C/MPI Program Diagram Allocate memory -> Set Initial Conditions -> Initial Display -> Repeat: [Update Boundary Cells (MPI Communication, External Boundaries) -> First Pass (x half step, y half step) -> Second Pass -> Swap new/old -> Graphics Output -> Conservation Check] -> Calculate Runtime -> Close Display, MPI & exit
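The conservation check in the diagram can be as simple as summing the height over the interior cells and comparing the total from one iteration to the next. The sketch below assumes a row-major array of (ny+2) by (nx+2) values with one layer of ghost cells; that layout and the name `total_mass` are illustrative assumptions, not taken from the sample program.

```c
/* Sum H over the interior cells only (ghost cells are skipped).
 * With equal-sized cells and constant density, this sum is proportional
 * to the total mass, so it should stay constant from step to step. */
double total_mass(const double *H, int nx, int ny)
{
    double sum = 0.0;
    for (int j = 1; j <= ny; j++) {
        for (int i = 1; i <= nx; i++) {
            sum += H[j * (nx + 2) + i];
        }
    }
    return sum;
}
```

In the parallel version each process would sum its own cells, and the local sums would then be combined across processes with a sum reduction.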
  • 36. MPI Quick Start
    #include <mpi.h>
    MPI_Init(&argc, &argv)
    MPI_Comm_size(Comm, &nprocs) // get number of processors
    MPI_Comm_rank(Comm, &myrank) // get processor rank, 0 to nprocs-1
    // Broadcast from source processor to all processors
    MPI_Bcast(buffer, count, MPI_type, source, Comm)
    // Used to update ghost cells
    MPI_Isend(buffer, count, MPI_type, dest, tag, Comm, req)
    MPI_Irecv(buffer, count, MPI_type, source, tag, Comm, req+1)
    MPI_Waitall(num, req, status)
    // Used for sum, max, and min such as total mass or minimum timestep
    MPI_Allreduce(&num_local, &num_global, count, MPI_type, MPI_op, Comm)
    MPI_Finalize()
    Web pages for MPI and MPE at Argonne National Lab (ANL): http://www-unix.mcs.anl.gov/mpi/www/
  • 37. Setup The software is already set up on the computers. For setup on home computers, there are two parts. First download the files from the Supercomputing Challenge website for the lab in C/MPI if you haven't already done that. Untar the lab files with "tar -xzvf Wave_Lab.tgz"
  • 38. Setting up Software Instructions are in the README file. Setting up system software: you need Java, OpenMPI and the MPE package from MPICH; download and install according to the instructions in openmpi_setup.sh; these can be installed in the user's directory with some modifications. Setting up the user's workspace: download the Eclipse software, including Eclipse, PTP and PLDT; install according to the instructions in eclipse_setup.sh; import the wave source files and set up Eclipse according to the instructions in eclipse_setup.sh.
  • 39. Lab Exercises Try modifying the sample program (Java and/or C versions): Change the initial distribution. How sharp can it be before it goes unstable? Change the number of cells. Change the graphics output. Try running 1, 2, or 4 processes and time the runs. Note that you can run 4 processes even if you are on a one-processor system. Switch to the PTP debug or Java debug perspective and try stepping through the program. Comparing to data is critical: are there other unrealistic behaviors of the model? Design an experiment to isolate variable effects. This can greatly improve your model.
  • 40. Appendix A. Calculus and Supercomputing Calculus and Supercomputing are intertwined. Why? Here is a simple problem – Add up the volume of earth above sea-level for an island 500 ft high by half a mile wide and twenty miles long. Typical science homework problem using simple algebra. Can be done by hand. Not appropriate for supercomputing. Not enough complexity.
  • 41. Add Complexity The island profile is a jagged mountainous terrain cut by deep canyons. How do we add up the volume? Calculus: the language of complexity. Addition: summing numbers. Multiplication: summing numbers with a constant magnitude. Integration: summing numbers with an irregular magnitude.
  • 42. Divide and Conquer In discrete form: divide the island into small pieces and sum up the volume of each piece. For a jagged profile, the sum approaches the solution as the size of the intervals grows smaller. ∑: summation symbol. ∆: delta symbol, the difference x_2 - x_1.
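The discrete sum can be sketched in C as below. The island profile used here is a made-up parabolic ridge, and both function names are hypothetical; the point is only that cutting the interval into more, smaller slices moves the sum closer to the exact area under the profile.

```c
/* Hypothetical island cross-section: a parabolic ridge, 500 ft high at
 * the center, over -1 <= x <= 1. Exact area under it is 2000/3. */
double example_profile(double x)
{
    return 500.0 * (1.0 - x * x);
}

/* Sum of f over [a, b] using n equal slices of width dx,
 * sampling the height at the center of each slice. */
double riemann_sum(double (*f)(double), double a, double b, int n)
{
    double dx = (b - a) / n;
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        double x_mid = a + (i + 0.5) * dx;
        sum += f(x_mid) * dx;
    }
    return sum;
}
```

Calling `riemann_sum(example_profile, -1.0, 1.0, n)` with n = 100 and then n = 1000 gives totals that close in on 2000/3; multiplying by the island's width would turn the area into a volume.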
  • 43. Divide and Conquer In Continuous Form – Integration Think of the integral symbols as describing a shape that is continuously varying The accuracy of the solution can be improved by summing over smaller increments Lots of arithmetic operations – now you have a “computing” problem. Add more work and you have a “supercomputing” problem.
  • 44. Derivative Calculus Describing Change Derivatives describe the change in a variable (numerator or top variable) relative to another variable (denominator or bottom). These three derivatives describe the change in population versus time, x-direction and y-direction.
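On a computer a derivative becomes a difference between nearby values divided by the spacing between them. The sketch below shows one common choice for space (a centered difference) and one for time (a forward difference); the function names and the choice of difference formulas are assumptions for illustration.

```c
/* Centered difference: change in p per unit x at point i.
 * Requires valid neighbors at i-1 and i+1. */
double ddx_centered(const double *p, int i, double dx)
{
    return (p[i + 1] - p[i - 1]) / (2.0 * dx);
}

/* Forward difference in time: change per unit time between two snapshots */
double ddt_forward(double p_old, double p_new, double dt)
{
    return (p_new - p_old) / dt;
}
```

The derivative in the y direction has the same form as the x version, applied along the other index of a 2D array.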
  • 45. Appendix B. Computational Methods Eulerian and Lagrangian Explicit and Implicit
  • 46. Two Main Approaches to Divide up Problem Eulerian: divide up by spatial coordinates. Track populations in a location. Observer frame of reference. Lagrangian: divide up by objects. Object frame of reference. Easier to track attributes of the population since they travel with the objects. Agent-based modeling in StarLogo uses this approach. Can tangle the mesh in 2 and 3 dimensions.
  • 47. Eulerian Eulerian – The area stays fixed and has a Population per area. We observe the change in population across the boundaries of the area. Lagrangian – The population stays constant. The population moves with velocity vx and vy and we move with them. The size of the area will change if the four vertexes of the rectangle move at different velocities. Changes in area will result in different densities.
  • 48. Explicit versus Implicit Explicit: in mathematical shorthand, U^(n+1) = f(U^n). This means that the next timestep values can be expressed entirely in terms of the previous timestep values. Implicit: U^(n+1) = f(U^(n+1), U^n). The next timestep values appear on both sides and must be solved for, often with a matrix or iterative solver. We will stick with explicit methods here. You need more math to attempt implicit methods.
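The difference can be seen on a small model problem, dU/dt = -k*U, which is chosen here purely for illustration and does not appear in the slides. The two hypothetical functions below each advance U by one time step.

```c
/* Explicit Euler for dU/dt = -k*U: the new value uses only the old value */
double step_explicit(double u_old, double k, double dt)
{
    return u_old - dt * k * u_old;
}

/* Implicit Euler: u_new = u_old - dt * k * u_new, solved for u_new.
 * This linear case has a closed-form solution; nonlinear problems
 * generally need an iterative or matrix solver instead. */
double step_implicit(double u_old, double k, double dt)
{
    return u_old / (1.0 + dt * k);
}
```

With k = 2 and a large step dt = 2, the explicit update jumps from 1 to -3 and will keep growing, while the implicit update decays to 0.2. The explicit method is simpler but needs a small enough time step to stay stable.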
  • 49. Appendix C Index Explanation for 2D Lax-Wendroff
  • 50. Programming Most difficult part of programming this method is to keep track of indices – half step grid indices cannot be represented by ½ in the code so they have to be offset one way or the other. Errors are very difficult to find so it is important to be very methodical in the coding. Next two slides show the different sizes of the staggered half-step grid and the relationships between the indices in the calculation (courtesy Jon Robey).
  • 51. 1st Pass (figure: X step grid and Y step grid, each drawn over the main grid, with axes indexed by j and i from 0 to 4). Index relationships shown: 0,0 -- 1,0 | 1,1, in general j,i -- j+1,i | j+1,i+1; and 0,0 -- 0,1 | 1,1, in general j,i -- j,i+1 | j+1,i+1.
  • 52. 2nd Pass (figure: main grid drawn over the X step grid and over the Y step grid, with axes indexed by j and i from 0 to 4). Index relationships shown: 1,1 -- 0,0 | 1,0, in general j,i -- j-1,i-1 | j,i-1; and 1,1 -- 0,0 | 0,1, in general j,i -- j-1,i-1 | j-1,i.