Thoughts on Cybersecurity
informed by large international
science & the Open Science Grid
Frank Würthwein
OSG Executive Director
UCSD/SDSC
Let’s reset your perception first
Jensen Huang keynote @ SC19
The Largest Cloud Simulation in History
50k NVIDIA GPUs in the Cloud
350 Petaflops for 2 hours
Distributed across US, Europe & Asia
Saturday morning before SC19 we bought all GPU capacity that was for sale in
Amazon Web Services, Microsoft Azure, and Google Cloud Platform worldwide
Science with 51,000 GPUs
achieved as peak performance
[Figure: x-axis is time in minutes; each color is a different cloud region in the US, EU, or Asia.]
Total of 28 Regions in use.
Peaked at 51,500 GPUs
~380 Petaflops of fp32
I can purchase one hour of 300 PFLOPS (fp32) in the cloud for $15k today
and nobody asks me any questions about cybersecurity.
• Nothing about my nationality or visa or …
• Nothing about two-factor authentication or my software
• Everything is wide open on the internet
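The price point above can be unpacked with a little arithmetic. This is a minimal sketch that uses only the two figures quoted on the slide (300 PFLOPS of fp32 for one hour, $15k). The variable names are illustrative, and actual cloud pricing varies by provider, region, and market conditions.

```python
# Figures quoted on the slide; everything below is derived arithmetic.
PFLOPS_FP32 = 300      # sustained fp32 rate, in petaflops
HOURS = 1              # duration purchased
PRICE_USD = 15_000     # quoted price for that hour

# Dollars per petaflop-hour of fp32 compute.
usd_per_pflop_hour = PRICE_USD / (PFLOPS_FP32 * HOURS)

# Total floating-point operations bought: rate (ops/s) times seconds.
total_flop = PFLOPS_FP32 * 10**15 * HOURS * 3600

print(f"${usd_per_pflop_hour:.0f} per PFLOP32-hour")
print(f"{total_flop:.2e} fp32 operations in total")
```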
Should cybersecurity requirements imposed
on open academic research executed at
on-prem resources be adjusted to the realities of
executing the same research on cloud
resources?
Science is an International Team Sport
Science is a Team Sport
The ATLAS Collaboration
∼200 institutions across ∼40 countries
Cybersecurity enabling Science
• Humanity has built extraordinary instruments by
pooling human and financial resources globally.
• To derive science from the data and simulations
for those instruments requires globally
integrated Cyberinfrastructure.
• Cybersecurity is enabling this science.
  - Policy framework
  - Operational security
  - Infrastructure software
Disk space use per site by CMS
XENON Collaboration as
a “Midscale” Instrument Example
XENON1T Storage & Processing
Challenge
• Experiment in Gran Sasso, Italy
• Tape Archive in Sweden
• Disk storage in 7 locations across Holland, Italy,
Israel, France, USA
  - Petabyte of data divided into 20k datasets
• Compute sites on EGI, OSG, and NSF HPC
allocation
OSG took on the integration challenge
via “embedded” technical support.
XENON1T Globally
Integrated Infrastructure
[Map: NIKHEF (Amsterdam), SURFSara (Amsterdam), Comet (XD Allocation), IN2P3 (Lyon), Weizmann (Tel Aviv)]
OSG integrates HPC allocations, contributions from collaborators,
and opportunistic capacity into a single platform to do science on.
Resource Federation
OSG Compute Federation
OSG federates ~200 clusters worldwide.
Owners determine policy of use.
Many allow opportunistic use of spare capacity.
> 2 Billion CPU core hours per year
Federation Principle
• Any provider can bring their resources to the
table.
• Truth in advertising:
  - Resource providers accurately specify (some)
details about the resource.
• Any consumer can decide which of the
available resources they are willing to use.
OSG matches consumers to providers globally
following policies expressed locally.
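The federation principle can be sketched as two-sided matchmaking: a provider advertises attributes and states whom it will serve, a consumer states which advertised resources it will use, and a match requires both to agree. This is a toy illustration only, not OSG's actual matchmaking software; the class names, attribute keys, and example policies are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Provider:
    name: str
    # "Truth in advertising": attributes the provider publishes about itself.
    advertised: dict
    # Local policy: which consumers this provider is willing to serve.
    accepts: Callable[["Consumer"], bool] = lambda consumer: True

@dataclass
class Consumer:
    name: str
    project: str
    # Consumer-side choice: which advertised resources are acceptable.
    willing_to_use: Callable[[dict], bool] = lambda advertised: True

def match(consumer, providers):
    """Return names of providers where both sides' local policies agree."""
    return [p.name for p in providers
            if p.accepts(consumer) and consumer.willing_to_use(p.advertised)]

# Hypothetical sites and policies, for illustration only.
providers = [
    Provider("campus-a", {"gpus": 0, "region": "US"}),
    Provider("lab-b", {"gpus": 8, "region": "EU"},
             accepts=lambda c: c.project == "proj-a"),
    Provider("cluster-c", {"gpus": 4, "region": "US"}),
]
job_a = Consumer("user-1", "proj-a", willing_to_use=lambda ad: ad["gpus"] > 0)
job_b = Consumer("user-2", "proj-b", willing_to_use=lambda ad: ad["gpus"] > 0)

print(match(job_a, providers))  # ['lab-b', 'cluster-c']
print(match(job_b, providers))  # ['cluster-c']
```

Neither side needs to know the other's policy in advance; the matchmaker only evaluates what each side has expressed locally.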
“NETFLIX” for Open Science
• NETFLIX operates a CDN, providing streaming access to
searchable curated data from anywhere at any time to any
subscriber.
• For open science, the CDN additionally needs to be federated.
  - Anybody can share their data from their locally owned data origin into the
CDN.
  - Data access is mediated via caches in the network and at endpoints. This
minimizes requirements on origins, which maximally stimulates sharing.
  - Performance of data access is determined by the location and performance of
the closest cache rather than the data’s origin.
  - Locally defined and managed groups of users share data securely with
each other globally. Data access is global.
Locally defined policies are enforced globally by the CDN
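The cache-mediated access described above can be sketched as a source-selection step: a client reads from the nearest available cache and touches the origin only when no cache is reachable. This is a minimal sketch under stated assumptions, not the data federation's real selection logic. Planar coordinates stand in for whatever locality metric a real deployment uses, and all endpoint names are hypothetical.

```python
from dataclasses import dataclass
from math import hypot

@dataclass
class Endpoint:
    name: str
    x: float  # stand-in coordinates; a real system would use network metrics
    y: float
    online: bool = True

def pick_source(client_xy, caches, origin):
    """Prefer the nearest online cache; fall back to the origin if none is up."""
    cx, cy = client_xy
    live = [c for c in caches if c.online]
    if not live:
        return origin
    return min(live, key=lambda c: hypot(c.x - cx, c.y - cy))

# Hypothetical endpoints, for illustration only.
caches = [Endpoint("cache-west", 0, 0), Endpoint("cache-east", 10, 0)]
origin = Endpoint("origin-lab", 50, 50)

print(pick_source((8, 1), caches, origin).name)  # cache-east
caches[1].online = False
print(pick_source((8, 1), caches, origin).name)  # cache-west
```

Because the origin is consulted only as a last resort, a small data origin can serve a large audience without needing matching bandwidth.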
The OSG Data Federation
Current StashCache infrastructure (US)
GaTech
We operate a production “prototype” of such a CDN
Two Challenges to think about
Authz: Person vs Capability
• Operations teams are a mix of “permanent”
staff and transients.
  - E.g. CERN pays for “Operators” funded via
“authorship fees”.
• Delegating a person’s identity to a computing
activity in order to authenticate the activity at a
remote server makes little sense.
• Delegating a capability to a computing activity
in order to authenticate it at a remote server
makes a lot of sense.
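The difference between delegating an identity and delegating a capability can be made concrete with a toy token that states what may be done and until when, with no personal identity inside it. This is a hedged sketch built on a standard-library HMAC. The shared secret, the scope syntax, and the function names are assumptions made for illustration and do not represent the token format used in production.

```python
import hashlib
import hmac
import json
import time

SECRET = b"demo-issuer-key"  # assumption: shared by issuer and server in this toy

def issue_capability(scope, lifetime_s, now=None):
    """Issue a token that names what may be done, not who is doing it."""
    now = time.time() if now is None else now
    payload = json.dumps({"scope": scope, "exp": now + lifetime_s},
                         sort_keys=True).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload, sig

def authorize(payload, sig, requested, now=None):
    """Server-side check: valid signature, not expired, scope covers the request."""
    now = time.time() if now is None else now
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, sig):
        return False
    claims = json.loads(payload)
    return now < claims["exp"] and requested.startswith(claims["scope"])

# A job is handed a one-hour, read-only capability for one (hypothetical) path.
payload, sig = issue_capability("read:/demo/raw/", 3600, now=0)
print(authorize(payload, sig, "read:/demo/raw/file1", now=10))    # True
print(authorize(payload, sig, "write:/demo/raw/file1", now=10))   # False
print(authorize(payload, sig, "read:/demo/raw/file1", now=7200))  # False
```

If such a token leaks, the damage is bounded by its scope and lifetime, and staff turnover does not invalidate running workflows because no person is named in it.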
Division of Responsibility
• To maximize the capacity provided we need
to minimize the effort required to provide it.
• The services required for the CDN and/or
compute federations are specialized and
non-trivial.
  - Large learning curve to achieve low cost
operations.
Service Operations is most (cost) effective
when separated from hardware operations
Network Cache Ops Model
• OSG supports the researchers
using the Data Federation.
• OSG deploys & operates the
caching middleware.
• PRP, TNRP, I2, Regionals, …
are responsible for network
performance.
• Hardware owners operate the
hardware, install the OS, and join
K8S for container orchestration.
Science Applications
Data Federation Services
Network Performance
Hardware & OS
A layered approach to distributed DevOps Responsibility
Cybersecurity Issues (I)
• Hardware owners only provide hardware
  - Deploy OS and Kubernetes.
• Service Operators (I)
  - A team that operates the K8S cluster.
• Service Operators (II)
  - A team that deploys and operates the CDN service as
containers inside (and generally across multiple) K8S
clusters.
• Software Operations
  - A team that provides the container images.
How do you design a security model that supports this structure?
Cybersecurity Issues (II)
• Container Security Model
• Security Model that allows hardware owners to give service
responsibility to service operators.
  - Diverse requirements
  - Some institutions will want to operate their own K8S simply because of the
level of control that implies.
  - Others won’t because of the level of effort it requires.
  - How do DOE and other National Labs fit into this?
  - How can a service provider in the US operate a service on hardware in
the EU and Asia? Or vice versa?
  - What about India, Pakistan, China, Iran, … pick your favorite country …
  - How to deal with institutions that require US citizenship even for sudo
access?
The set of issues and diversity of constraints seems endless
And now think back to the beginning: All of this is trivial in the cloud!!!
Summary & Conclusions
• Humanity has built extraordinary instruments by
pooling human and financial resources globally.
• To derive science from the data and simulations
for those instruments requires globally
integrated Cyberinfrastructure.
• Cybersecurity is enabling this science.
  - Policy framework
  - Operational security
  - Infrastructure software
Contact us at: help@opensciencegrid.org
Or me personally at: fkw@ucsd.edu
Acknowledgements
• This work was partially supported by
NSF grants OAC-1941481, MPS-1148698,
OAC-1841530, OAC-1904444, and
OAC-1826967.