Federated Galaxy: Biomedical
Computing at the Frontier
Enis Afgan, Vahid Jalili,
Nuwan Goonasekera,
James Taylor, Jeremy Goecks
IEEE Cloud 2018
San Francisco, CA ~ Jul 2018
We are hiring, so if you like what you hear, come talk to me.
What is Galaxy?
Describe analysis tool
behavior abstractly
Analysis environment automatically
and transparently tracks details
Workflow system for complex analysis,
constructed explicitly or automatically
Pervasive sharing and publication
of documents with integrated analysis
Galaxy Main (usegalaxy.org) is popular
125,000
registered users
2PB
user data
19M
jobs run
100
training events
(2017 & 2018)
Stats for Galaxy Main (usegalaxy.org) in May 2018
Galaxy is Popular
Globally, and
Growing
Afgan, E., et al., “The Galaxy platform for accessible,
reproducible and collaborative biomedical analyses: 2018
update”, Nucleic Acids Research (NAR),
DOI: 10.1093/nar/gky379, May 2018.
18%
Servers
(99)
Availability of Galaxy
Usability
Flexibility
Public
servers
Cloud
servers
Local
installations
Production systems require a sophisticated software stack
Robust
database
Job
manager
FTP
server
Shared data
External resources
Galaxy-as-a-Service
Front-end: attach compute resources to a session
Architecting GaaS
- CloudBridge - abstracts multiple IaaS cloud providers behind a consistent API
- https://cloudbridge.cloudve.org
- CloudLaunch - a configurable launcher with a UI, API, and CLI
- https://launch.usegalaxy.org → for a preview of the stack, try launching the ‘CloudMan 2.0’ app
- CloudMan - a runtime manager for the underlying infrastructure
- https://github.com/galaxyproject/cloudman/tree/v2.0
- HelmsMan - an application deployment manager for Helm
- https://github.com/galaxyproject/cloudman/tree/v2.0
- CloudAuthz - OIDC token broker for multiple clouds
- https://github.com/galaxyproject/cloudauthz
Principled approach via abstractions: none are specific to Galaxy
Participation and contributions are welcome!
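The idea of putting several IaaS providers behind one consistent API can be sketched in a few lines. This is a minimal illustration of the abstraction pattern only, not the CloudBridge API: the class names, the `launch_vm` method, and the `PROVIDERS` registry are all hypothetical, and the provider classes return placeholder strings instead of calling a real cloud SDK.

```python
from abc import ABC, abstractmethod


class CloudProvider(ABC):
    """Hypothetical provider-neutral interface (not the CloudBridge API)."""

    @abstractmethod
    def launch_vm(self, name: str) -> str:
        """Start a VM and return an identifier for it."""


class FakeAWS(CloudProvider):
    def launch_vm(self, name: str) -> str:
        # A real implementation would call the AWS SDK here.
        return f"aws:{name}"


class FakeOpenStack(CloudProvider):
    def launch_vm(self, name: str) -> str:
        # A real implementation would call the OpenStack SDK here.
        return f"openstack:{name}"


# Hypothetical registry mapping a provider name to its implementation.
PROVIDERS = {"aws": FakeAWS, "openstack": FakeOpenStack}


def get_provider(kind: str) -> CloudProvider:
    """Factory: callers select a cloud by name and never import an SDK."""
    try:
        return PROVIDERS[kind]()
    except KeyError:
        raise ValueError(f"unsupported provider: {kind}") from None


def bootstrap(kind: str, name: str) -> str:
    # The calling code is identical for every provider.
    return get_provider(kind).launch_vm(name)
```

With this shape, a launcher only needs the provider name as configuration, for example `bootstrap("openstack", "galaxy")`, and supporting another cloud means adding one class and one registry entry.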
Services in action
Bootstrap via CloudLaunch
>_ run
VM IP
CloudBridge: AWS, Azure, GCE, OpenStack
CloudLaunch-plugin
galaxy/cloudman-boot
cloudman-boot → Rancher, K8S, Helm
CloudMan chart
CloudBridge, CloudLaunch, CloudMan, HelmsMan
Multi-cloud, Infrastructure, Coordination, Applications
VM ...
Pulsar chart
Remote object store(s)
Local cache
Authn / authz
Containerized jobs
Assembling a global infrastructure
● Stratum 0 servers ● Stratum 1 servers
- XSEDE, Indiana University: cvmfs1-iu0
- XSEDE & CyVerse, TACC, Austin: cvmfs0-tacc0 (test.galaxyproject.org, main.galaxyproject.org), cvmfs1-tacc0
- Penn State: cvmfs0-psu0 (singularity.galaxyproject.org, data.galaxyproject.org), cvmfs1-psu0
- EU JRC, Ispra: galaxy.jrc.ec.europa.eu
- de.NBI, RZ Freiburg: cvmfs1-ufr0.usegalaxy.eu
- Galaxy Australia, Melbourne: cvmfs1-mel0.gvl.org.au
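In CVMFS, each repository is written on a Stratum 0 server and served to clients from read-only Stratum 1 replicas, so a client can fall back to another replica when one is unavailable. The sketch below illustrates that failover idea under stated assumptions: `pick_stratum1` and the `is_reachable` callback are hypothetical names, the replica labels are the short names from the slide and not full hostnames, and the real CVMFS client has its own selection logic.

```python
from typing import Callable, Iterable, Optional


def pick_stratum1(
    replicas: Iterable[str],
    is_reachable: Callable[[str], bool],
) -> Optional[str]:
    """Return the first reachable replica in preference order, or None."""
    for replica in replicas:
        if is_reachable(replica):
            return replica
    return None


# Short labels taken from the slide, listed in an assumed preference order.
REPLICAS = ["cvmfs1-tacc0", "cvmfs1-iu0", "cvmfs1-psu0"]

# Simulated outage of the first replica; a real check would probe the server.
down = {"cvmfs1-tacc0"}
chosen = pick_stratum1(REPLICAS, lambda r: r not in down)
```

Here `chosen` is the second replica in the list because the first is marked as down; if every replica is unreachable the function returns `None` and the caller decides how to proceed.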
Acknowledgments
