Introduction to LAVA Workload Scheduler. High Performance Computing and Networking Center (HPCNC), Kasetsart University, in collaboration with Innovative Extremist (INOX) Co., Ltd. and Platform Computing Inc.
Outline Introduction to HPC, cluster and workload scheduler
LAVA workload scheduler
Installing and configuring LAVA Cluster
Workshop : Using LAVA
Introduction to HPC, Cluster and Workload Scheduler
Cluster Computing: Cluster computing is a technology for building a scalable, high-performance computing system from a collection of small computing systems and a high-speed interconnection network
Why now? Maturity of many enabling technologies: low- to medium-cost high-speed networks (Gigabit Ethernet, Myrinet, InfiniBand); powerful operating systems such as Windows, Linux, UNIX
Parallel programming systems that are portable and efficient
MPI (LAM, MPICH)
Software libraries that ease application development, e.g. ScaLAPACK, PLAPACK, PETSc
PCs also rule the world. Impact of PC technology: Intel Pentium can deliver supercomputing performance at low cost
The mass-market nature of the PC drives prices down while performance increases rapidly
The nature of a cluster makes it easy to capitalize on new PC technology right away
Why? Price/performance!
Goal of Clustering: High-performance clustering. Link many computers together so that they team up and finish a problem faster by having multiple computers work on the same problem independently
Goal of Clustering: High-availability clustering. Make a more reliable computer system by having many computers work together and take over when any of them fails
Applications Scientific computing CAD/CAM
Bioinformatics
Large scale financial analysis
Simulation
Drug Design
Automobile Design (Crash Simulation). IT infrastructure: Scalable web server, Search engine
(Google uses more than 10,000 server nodes). Entertainment: Rendering, On-line Gaming
Molecular Dynamics Simulation: Drug discovery using molecular docking (Avian Flu, HIV)
Analyzing properties of chemical compounds
Graphics Rendering and Special Effects: Rendering is generating a 3D image from a model. Problem: Rendering is a time-consuming process, especially for complex and realistic scenes
A massive number of rendering jobs must be done to create a movie
Cluster Software Architecture HTC HPC HPTC
High Throughput Computing: High throughput, not high performance. Complete the largest number of jobs in the shortest amount of time. Serial, (usually) parametric, non-parallelized code. Solve them on multiple processors at the same time, varying input parameters. Examples: BLAST, Monte Carlo simulation. Use of load schedulers: Condor, Codine, LSF, Sun Grid Engine, SQMS
High Throughput Computing (cont.): Pros and Cons. Easy to get started: use the sequential code in C or Fortran.
Excellent for many types of applications, such as parametric computing: running the same computation with multiple data sets
Distributed applications such as massive rendering in the animation industry. Excellent when the model fits well in the memory of a single computer. No communication at all
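The parametric pattern above can be sketched in a few lines of Python. This is only an illustration, not LAVA usage: `estimate_pi` and `run_sweep` are hypothetical names, and a local thread pool stands in for the job slots that a workload scheduler would provide on cluster nodes. Each job is ordinary serial code, takes one input parameter (a random seed), and never communicates with the other jobs.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def estimate_pi(seed, samples=20_000):
    """One independent serial job: Monte Carlo estimate of pi for one seed."""
    rng = random.Random(seed)  # private generator, so jobs share no state
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return 4.0 * hits / samples


def run_sweep(seeds, workers=4):
    """Run the same job once per input parameter; jobs never communicate."""
    # Assumption: the thread pool is a local stand-in for scheduler job slots.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(estimate_pi, seeds))


if __name__ == "__main__":
    seeds = range(8)
    for seed, value in zip(seeds, run_sweep(seeds)):
        print(f"seed={seed} pi~{value:.4f}")
```

Because every job depends only on its own parameter, the results are the same whether the jobs run one after another or all at once, which is what makes this kind of workload easy to hand to a scheduler.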
High Performance Computing: Maximum performance, not maximum throughput
Use of specialised codes and libraries: MPI (Message Passing Interface)
Parallel maths libraries (ScaLAPACK). Solve a large problem by breaking it into a number of small problems (data or task partitioning), then solving them on multiple distributed processors at the same time.
Pros and Cons: Difficult, since a parallel program must be developed
Good when the problem is larger than the memory size of a single machine
Speedup for a single instance of the problem is needed
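Data partitioning can be illustrated without any parallel library. The sketch below is a plain Python illustration under stated assumptions: the function names are hypothetical, and the chunks are processed one after another in a single process, whereas an MPI program would give each chunk to a different processor and combine the partial results with a reduction.

```python
def partition(data, parts):
    """Split data into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for rank in range(parts):
        size = base + (1 if rank < extra else 0)
        chunks.append(data[start:start + size])
        start += size
    return chunks


def partial_sum_of_squares(chunk):
    """The small problem that one processor would solve on its own chunk."""
    return sum(x * x for x in chunk)


def partitioned_sum_of_squares(data, parts):
    """Partition the data, solve each piece, then combine the partial results."""
    # Assumption: pieces are solved sequentially here, purely for illustration.
    partials = [partial_sum_of_squares(c) for c in partition(data, parts)]
    return sum(partials)


if __name__ == "__main__":
    data = list(range(10))
    print(partition(data, 3))
    print(partitioned_sum_of_squares(data, 3))
```

The combine step is where a real parallel program needs communication, which is the main source of the extra development effort noted in the pros and cons.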
Advantages and Challenges. Advantages: Highly scalable, lightweight, easy setup
Plenty of free software. Challenges: Requires highly trained, skilled personnel to maintain the system
No powerful software development environment
Low compatibility with many enterprise computing environments
Parallel Application Development Shared memory – data is exchanged using memory reference
Message passing – data is exchanged by sending/receiving messages between processors
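The two styles can be contrasted in a small Python sketch. It is an illustration only: both versions run as threads inside one process, and a `queue.Queue` stands in for the network channel that a message-passing library such as MPI would use between processors. The function names are hypothetical.

```python
import queue
import threading


def shared_memory_sum(values, workers=2):
    """Workers exchange data by updating one shared memory location."""
    total = [0]               # memory visible to every worker
    lock = threading.Lock()   # protects the shared location

    def work(chunk):
        local = sum(chunk)
        with lock:
            total[0] += local  # direct memory reference

    threads = [threading.Thread(target=work, args=(values[i::workers],))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


def message_passing_sum(values, workers=2):
    """Workers share no variables; each sends its partial result as a message."""
    mailbox = queue.Queue()   # assumption: stand-in for a network channel

    def work(chunk):
        mailbox.put(sum(chunk))  # send

    threads = [threading.Thread(target=work, args=(values[i::workers],))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(mailbox.get() for _ in range(workers))  # receive


if __name__ == "__main__":
    data = list(range(100))
    print(shared_memory_sum(data), message_passing_sum(data))
```

The shared-memory version needs a lock because several workers write to the same location, while the message-passing version needs no lock but must send and receive every piece of data explicitly.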
Workload Scheduler: Also called “job scheduler” or “load scheduler”
Plays a main role in distributed computing
Allows users to share computing resources and time. Unifies the resources in the cluster into a shared resource pool
Controls shared resource usage for multiple users: job queue
Scheduling policies. Utilizes resources efficiently
Hides the complexity of using the cluster's computing resources: users submit jobs to the scheduler
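A toy model can make the ideas of a shared resource pool, a job queue, and a scheduling policy concrete. The sketch below is not how LAVA is implemented: `Job`, `ToyScheduler`, the host names, and the strict first-come-first-served policy are all assumptions chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    user: str
    name: str
    slots: int  # hypothetical: number of CPU slots the job requests


class ToyScheduler:
    """First-come-first-served scheduler over one shared pool of slots."""

    def __init__(self, hosts):
        # Unify the per-host slots into a single shared resource pool.
        self.free_slots = sum(hosts.values())
        self.pending = deque()  # the job queue
        self.running = []

    def submit(self, job):
        """Users only submit; they never pick a host themselves."""
        self.pending.append(job)

    def dispatch(self):
        """Start queued jobs in arrival order while the head job fits."""
        started = []
        while self.pending and self.pending[0].slots <= self.free_slots:
            job = self.pending.popleft()
            self.free_slots -= job.slots
            self.running.append(job)
            started.append(job.name)
        return started

    def finish(self, name):
        """Return the slots of a finished job to the pool."""
        for job in self.running:
            if job.name == name:
                self.running.remove(job)
                self.free_slots += job.slots
                return
        raise KeyError(name)


if __name__ == "__main__":
    sched = ToyScheduler({"node1": 2, "node2": 2})
    sched.submit(Job("alice", "render", 3))
    sched.submit(Job("bob", "blast", 2))
    print(sched.dispatch())   # only jobs that fit in the pool start
    sched.finish("render")
    print(sched.dispatch())
```

Under this strict policy a small job waits behind a large one at the head of the queue even when slots are free; other scheduling policies trade that simplicity for better utilization.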
Key Features: Resource control. Where are the resources?
How many can we use?
