Research Computing at ILRI
Alan Orth
ICT Managers Meeting, ILRI, Kenya, 5 March 2014
Where we came from (2003)
- 32 dual-core compute nodes
- 32 * 2 != 64
- Writing MPI code is hard!
- Data storage over NFS to “master” node
- “Rocks” cluster distro
- Revolutionary at the time!
Where we came from (2010)
- Most of the original cluster removed
- Replaced with a single Dell PowerEdge R910
- 64 cores, 8 TB storage, 128 GB RAM
- Threading is easier* than MPI!
- Data is local
- Easier to manage!
To infinity and beyond (2013)
- A little bit back to the “old” model
- Mixture of “thin” and “thick” nodes
- Networked storage
- Pure CentOS
- Supermicro boxen
- Pretty exciting! --->
Primary characteristics
- Computational capacity
- Data storage
Platform
- 152 compute cores
- 32* TB storage
- 700 GB RAM
- 10 GbE interconnects
- LTO-4 tape backups (LOL?)
Homogeneous computing environment

User IDs, applications, and data are available everywhere.
Scaling out storage with GlusterFS
- Developed by Red Hat
- Abstracts backend storage (file systems, technology, etc)
- Can do replicate, distribute, replicate+distribute, geo-replication (off site!), etc
- Scales “out”, not “up”
How we use GlusterFS
[aorth@hpc: ~]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
...
wingu1:/homes    31T  9.5T   21T  32% /home
wingu0:/apps     31T  9.5T   21T  32% /export/apps
wingu1:/data     31T  9.5T   21T  32% /export/data

- Persistent paths for homes, data, and applications across the cluster.
- These volumes are replicated, so essentially application-layer RAID1
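As a rough sketch of what "replicated" means in practice, the script below shows the kind of commands that create, start, and mount a two-way replicated volume. The servers (wingu0, wingu1), the volume name (homes), and the mount point (/home) come from the df output above; the brick path /bricks/homes is a hypothetical placeholder, and the actual layout at ILRI may well differ. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: create and mount a 2-way replicated GlusterFS volume.
# Assumption: the brick path /bricks/homes is hypothetical.
# With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = volume name, $2 = brick path on each server
create_replicated_volume() {
    # "replica 2" keeps a full copy of every file on both bricks,
    # which is what makes the volume behave like RAID1.
    run gluster volume create "$1" replica 2 "wingu0:$2" "wingu1:$2"
    run gluster volume start "$1"
}

# $1 = server, $2 = volume name, $3 = mount point
mount_volume() {
    run mount -t glusterfs "$1:/$2" "$3"
}

create_replicated_volume homes /bricks/homes
mount_volume wingu1 homes /home
```

Clients can name either server when mounting, which is consistent with the mix of wingu0 and wingu1 in the df listing.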
GlusterFS <3 10GbE
SLURM
- Project from Lawrence Livermore National Laboratory (LLNL)
- Manages resources
- Users request CPU, memory, and node allocations
- Queues / prioritizes jobs, logs usage, etc
- More like an accountant than a bouncer
Topology
How we use SLURM
- Can submit “batch” jobs (long-running jobs, invoke program many times with different variables, etc)
- Can run “interactively” (something that needs keyboard interaction)
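A minimal sketch of the "invoke program many times with different variables" case, written as a SLURM job array. The job name, CPU count, array range, and input file names are all hypothetical, and job arrays need a SLURM release that supports --array. The #SBATCH lines are ordinary comments to the shell, so the script also runs outside SLURM, where it falls back to task 1.

```shell
#!/bin/sh
#SBATCH --job-name=blast-sweep
#SBATCH --cpus-per-task=4
#SBATCH --array=1-3
#SBATCH --output=blast-sweep-%A_%a.out

# Map an array index to one input file (file names are hypothetical).
input_for_task() {
    case "$1" in
        1) echo "sample_a.fasta" ;;
        2) echo "sample_b.fasta" ;;
        3) echo "sample_c.fasta" ;;
        *) echo "unknown task id: $1" >&2; return 1 ;;
    esac
}

# SLURM sets SLURM_ARRAY_TASK_ID for each array element; default to 1
# so the script can be tried without a scheduler.
task_id="${SLURM_ARRAY_TASK_ID:-1}"
input="$(input_for_task "$task_id")"

# A real job would start the program here; this sketch only reports
# what it would do.
echo "task ${task_id}: would process ${input} with ${SLURM_CPUS_PER_TASK:-1} CPUs"
```

Saved as, say, blast-sweep.sh, it would be submitted with `sbatch blast-sweep.sh`, and the scheduler would then run one copy per array index.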
Make it easy for users to do the “right thing”:
[aorth@hpc: ~]$ interactive -c 10
salloc: Granted job allocation 1080
[aorth@compute0: ~]$
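The `interactive` command above is a site-specific wrapper, and its contents are not shown in the slides. The following is only a guess at the general shape of such a wrapper: it parses -c and builds a salloc command line that starts a shell on the allocated node through srun. The function name and the exact salloc/srun combination are assumptions, not the ILRI script.

```shell
#!/bin/sh
# Hypothetical sketch of an "interactive"-style wrapper around salloc.
# It only builds and prints the command line; a real wrapper would exec it.
build_interactive_cmd() {
    cpus=1
    OPTIND=1
    while getopts "c:" opt "$@"; do
        case "$opt" in
            c) cpus="$OPTARG" ;;
            *) return 1 ;;
        esac
    done
    # salloc reserves the CPUs; srun --pty attaches a terminal to a shell
    # running inside that allocation.
    echo "salloc --cpus-per-task=${cpus} srun --pty bash -i"
}

build_interactive_cmd -c 10
```

Hiding the scheduler options behind one short command means users get a proper allocation without having to remember them.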
Managing applications
- Environment modules - http://modules.sourceforge.net
- Dynamically load support for packages in a user’s environment
- Makes it easy to support multiple versions, complicated packages with $PERL5LIB, package dependencies, etc
Managing applications
Install once, use everywhere...
[aorth@hpc: ~]$ module avail blast
blast/2.2.25+ blast/2.2.26 blast/2.2.26+ blast/2.2.28+
[aorth@hpc: ~]$ module load blast/2.2.28+
[aorth@hpc: ~]$ which blastn
/export/apps/blast/2.2.28+/bin/blastn
Works anywhere on the cluster!
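To make the effect of `module load` concrete, here is a rough shell equivalent of its most visible part: putting the package's bin directory at the front of PATH. The /export/apps/<name>/<version>/bin layout is inferred from the `which blastn` output above. The helper names are hypothetical, and real modulefiles are written in Tcl and can set many more variables (PERL5LIB, for example).

```shell
#!/bin/sh
# Sketch only: mimics the PATH change a module load performs.
APPS_ROOT="${APPS_ROOT:-/export/apps}"

# $1 = current PATH-like value, $2 = directory to put in front
prepend_path() {
    case ":$1:" in
        *":$2:"*) echo "$1" ;;        # already present, leave unchanged
        *) echo "$2${1:+:$1}" ;;      # otherwise put it first
    esac
}

# $1 = name/version, e.g. blast/2.2.28+
load_app() {
    PATH="$(prepend_path "$PATH" "${APPS_ROOT}/$1/bin")"
    export PATH
}

load_app "blast/2.2.28+"
# Show the first PATH entry after loading.
echo "$PATH" | cut -d: -f1
```

Because /export/apps is the same shared volume on every node, the same path resolves everywhere, which is what makes "install once, use everywhere" work.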
Users and Groups
- Consistent UID/GIDs across systems
- LDAP + SSSD (also from Red Hat) is a great match
- 389 LDAP works great with CentOS
- SSSD is simpler than pam_ldap and does caching
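One way to spot-check the "consistent UID/GIDs" claim is to ask each node's name service for the same account and compare the answers. The sketch below assumes nothing about the ILRI setup beyond the host names hpc and compute0 seen in the prompts above; the helper names are hypothetical and the UID values in the demo are made up.

```shell
#!/bin/sh
# Print the numeric UID the local name service (files, LDAP via SSSD, ...)
# reports for a user.
uid_of() {
    getent passwd "$1" | cut -d: -f3
}

# Read "host uid" lines on stdin; succeed only if every uid is identical.
uids_consistent() {
    awk '{ seen[$2] = 1 }
         END { n = 0; for (u in seen) n++; exit (n == 1 ? 0 : 1) }'
}

# Demo with made-up values. On a real cluster each line could come from
# something like: echo "$host $(ssh "$host" getent passwd aorth | cut -d: -f3)"
if printf 'hpc 1001\ncompute0 1001\n' | uids_consistent; then
    echo "consistent"
else
    echo "MISMATCH"
fi
```

A mismatch here would show up in practice as files on the shared volumes appearing to belong to the wrong user on some nodes.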
More information and contact

a.orth@cgiar.org
http://hpc.ilri.cgiar.org/





