Barts Health Data Platform
Authors: Tony Wildish, Idowu Samuel Bioku, Evan Hann, Steven Newhouse, Benjamin Eaton, Ruzena Uddin, Francene Clarke-Walden.
Building a Production-Ready Barts Health Secure Data Environment: Tooling, Access Control, and Cost Governance
Presenter: Idowu Samuel Bioku.
STEP-UP RS London 2025
Background
What is the Barts Health Data Platform?
• The Barts Health Data Platform (BHDP) is an integrated health data analytics platform.
• It provides a Data Portal where researchers apply for access to research-ready NHS patient data.
• It provides a Secure Data Environment (SDE) where researchers derive insights from the data they have been authorised to access.
• It gives researchers and clinicians access to the data we hold on patients in East London.
Why build a Secure Data Environment?
An SDE is crucial to health research with sensitive data. It meets the obligation to keep that data safe, protected, and legally compliant, while providing the right tooling.
✓ Security and integrity of data.
✓ Public trust in how data is handled.
✓ Compliance with legal requirements.
✓ Tooling that helps researchers get started with significantly reduced overhead.
✓ Freedom for researchers to use a wide variety of tools (VMs, SQL, AI/ML).
The SDE is built on Microsoft's Azure Trusted Research Environment (Azure TRE)
Azure TRE is an open-source accelerator that provides the core architecture for a secure data environment.
Benefits of Azure TRE
✓ Integration with Microsoft Entra ID.
✓ Core infrastructure templates (Terraform).
✓ Core security architecture.
✓ Flexibility and ease of integration with the latest technologies.
Need for production-level enhancements
✓ Support for large-scale research: Barts Health needs to operate at a scale of up to 150 concurrent projects to accommodate wider-scope research.
✓ Data security and compliance: for a leader in secure research environments, ensuring data security and compliance is critical.
✓ Operational efficiency and cost control: enhanced tooling for monitoring and resource management allows better cost optimization and resource allocation, keeping the platform sustainable as usage grows.
✓ Timely support for researchers: technical support must scale to resolve issues quickly and minimize downtime.
What we have done
Research use cases demand secure, scalable, and customised environments.
Custom VM images
✓ Pre-installed software and configurations tailored to research needs.
✓ Consistency and faster provisioning across multiple projects and OS flavours.
✓ Regular updates, plus easy and fast deployment of new tools.
Secure data connection to research workspaces
• A secure Azure Data Factory linked service.
• Enables secure transfer of approved research data.
Cost transparency and management
• Granular project-by-project cost tracking over the full project lifetime (up to 3 years or more).
• Highly visible and transparent costing.
Customised VM Images (1)
The customised VM images include a pre-installed, pre-configured set of software tailored to research needs.
Custom images:
✓ Windows 10
✓ Windows 11
✓ Windows Server 2019
✓ Ubuntu 22.04
✓ Ubuntu 24.04
Pre-installed software, configuration and setup
• Dev tools: Python, Anaconda, R, RStudio, VS Code, .NET, Git.
• Machine learning tools: Jupyter Notebooks, Azure Data Studio.
• DB tools: MySQL, PostgreSQL, DBeaver, SSMS, Storage Explorer.
• DICOM viewers: RadiAnt, Spyder.
• General: Google Chrome, Firefox, LibreOffice.
Customised VM Images (2)
Tailored for:
• Machine learning
  • Pre-installed ML frameworks optimized for GPU and CPU workloads.
  • Integrated libraries for data preprocessing, feature engineering, and model deployment.
  • Configured with GPU for accelerated training.
• Medical imaging
  • Includes specialised imaging software and toolkits, e.g. DICOM viewers.
Why build custom images?
Default Images | Our Solution: Custom Images
Deployment of new VMs takes time | Faster deployment of new VMs
Lack of research-dependent software | Research-dependent software is available
Inability to control updates to images | Updates to images and pre-installed software can be controlled
Impact
✓ Faster deployment.
✓ Consistent environments.
✓ Faster onboarding.
✓ Access to a wide variety of tools.
✓ Faster research.
Secure Data Connection to Research Workspace
Each research workspace is configured for secure data access and transfer from the Analysis Data Core.
Azure Data Factory:
✓ Enables automated and monitored data pipelines that transfer requested research data into the workspace.
✓ Uses encrypted connections and strict access controls to maintain data confidentiality and integrity in transit.
✓ Supports complex data transformation, validation, and orchestration workflows tailored to research needs.
Dedicated private endpoint:
✓ Each workspace is provisioned with a dedicated private endpoint that provides a secure, private network connection between Azure Data Factory (ADF) and the workspace’s data storage.
✓ The private endpoint keeps data transferred by ADF pipelines entirely within Azure’s private network, eliminating exposure to the public internet.
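One practical consequence of a private endpoint is that, from inside the workspace network, the storage hostname should resolve to a private IP address instead of a public one. The sketch below shows one way such a check might look using only the Python standard library. It is not part of the BHDP tooling; the function names are hypothetical and the check assumes DNS inside the workspace is configured for the private endpoint.

```python
import ipaddress
import socket


def is_private_address(ip: str) -> bool:
    """Return True if the IP address lies in a private (non-internet-routable) range."""
    return ipaddress.ip_address(ip).is_private


def resolves_privately(hostname: str, port: int = 443) -> bool:
    """Resolve a hostname and check that every returned address is private.

    Raises socket.gaierror if the hostname cannot be resolved.
    """
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    addresses = {info[4][0] for info in infos}
    return bool(addresses) and all(is_private_address(a) for a in addresses)
```

Calling `resolves_privately` with the workspace storage account's hostname from a VM inside the workspace would be expected to return `True` when the private endpoint and its DNS records are in place; the hostname itself is deployment-specific and is not given in this deck.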
Default Azure Cost Management Portal | Enhanced Cost Management Tooling
Cost data has a limited retention period | Cost data remains available beyond the default retention period, even for decommissioned projects
Metadata of deleted resources has a limited retention period | Metadata of deleted resources remains readily available
Costs are difficult to map to their respective projects | Costs are mapped to their respective projects, with granular visibility of how much each resource costs in each project
Cost Management Tooling
✓ The cost management tooling is an Azure runbook-based solution.
✓ It runs weekly to aggregate cost data into Azure SQL.
✓ It helps researchers and TRE administrators track resource costs, enabling smarter financial decisions.
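The weekly aggregation described above can be sketched roughly as follows. This is a minimal illustration, not the team's runbook: the record layout, table name, and function names are assumptions, and Python's built-in `sqlite3` stands in for Azure SQL so the example is self-contained.

```python
import sqlite3
from collections import defaultdict

# Hypothetical cost records as a weekly export might provide them:
# (project, resource, usage_date, cost)
SAMPLE_RECORDS = [
    ("project-a", "vm-01", "2025-01-06", 12.50),
    ("project-a", "vm-01", "2025-01-07", 11.00),
    ("project-a", "storage-01", "2025-01-06", 1.25),
    ("project-b", "vm-02", "2025-01-06", 20.00),
]


def aggregate_costs(records):
    """Sum cost per (project, resource) pair."""
    totals = defaultdict(float)
    for project, resource, _usage_date, cost in records:
        totals[(project, resource)] += cost
    return dict(totals)


def store_totals(conn, week_start, totals):
    """Persist one week's totals; re-running the same week overwrites its rows."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS weekly_cost (
               week_start TEXT, project TEXT, resource TEXT, cost REAL,
               PRIMARY KEY (week_start, project, resource))"""
    )
    conn.executemany(
        "INSERT OR REPLACE INTO weekly_cost VALUES (?, ?, ?, ?)",
        [(week_start, p, r, c) for (p, r), c in totals.items()],
    )
    conn.commit()


def project_total(conn, project):
    """Total stored cost for a project across all weeks and resources."""
    row = conn.execute(
        "SELECT COALESCE(SUM(cost), 0) FROM weekly_cost WHERE project = ?",
        (project,),
    ).fetchone()
    return row[0]
```

Because the aggregated rows live in the database and not in the cloud provider's cost portal, they stay queryable after the underlying resources or projects have been deleted, which is the persistence property the tooling is aiming for.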
Why is the cost management tool important?
Impact
✓ Provides high visibility of expenses.
✓ Keeps cost data available even for completed or decommissioned projects.
✓ Helps researchers make smarter cost decisions.
✓ Provides granular and transparent billing.
Ongoing Work: RBAC integration
RBAC (role-based access control) integration embeds a permissions framework into the SDE that controls which tools are available to researchers.
Why are we doing this?
✓ To streamline governance across a growing number of concurrent research workspaces.
✓ To enable precise control over tool access, so that only approved and secure software is available to each role.
✓ To reduce manual overhead in provisioning, updating, or removing tools across environments.
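The idea of role-based control over tools can be illustrated with a small sketch. The role names, tool names, and functions below are hypothetical and chosen only for illustration; the deck does not describe how the RBAC integration is actually implemented.

```python
# Hypothetical roles and tool names, for illustration only.
ROLE_TOOLS = {
    "researcher": {"python", "rstudio", "vscode"},
    "imaging-researcher": {"python", "dicom-viewer"},
    "workspace-admin": {"python", "rstudio", "vscode", "storage-explorer"},
}


def allowed_tools(roles):
    """Return the union of tools granted by the given roles.

    Unknown roles grant nothing, so access is denied by default.
    """
    tools = set()
    for role in roles:
        tools |= ROLE_TOOLS.get(role, set())
    return tools


def can_use(roles, tool):
    """Check whether any of the user's roles grants access to the tool."""
    return tool in allowed_tools(roles)
```

Keeping the mapping in one place means that adding, upgrading, or retiring a tool is a single change to the role definition, with no per-workspace edits needed.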
How will it impact users?
• Researchers will get simplified, relevant tool access, improving usability and focus.
• Admins will be able to manage access to tools centrally and enforce compliance standards.
• IT teams will be able to roll out new tools, upgrades, or deprecations in a controlled, role-based way.
How will it impact the SDE?
• SDE and workspace administrators will be able to control the set of tools available to researchers.
• Upgrades and deprecation of tools will be easier to manage.
Summary:
• Our solution addresses Barts Health's growing need to support a large-scale, secure, and compliant research environment.
• We delivered a ready-to-use data platform with extended capabilities and features.
• The platform progressed during 2024 from Alpha in March, through Beta in June, to full production release in December.
Our deliveries to date:
• Custom VM images tailored for machine learning, medical imaging, and complex health data workloads, ensuring optimized, secure, and consistent compute environments.
• Secure data connection to research workspaces, featuring secure Azure Data Factory integration and private endpoints that together ensure safe, compliant, and reliable data transfer pipelines.
• Cost management tooling that provides transparency, cost granularity, and persistent cost data, helping to manage research budgets efficiently.
• Ongoing production-level enhancements, including RBAC integration to provide fine-grained, role-based control over tool access. This aims to improve security, simplify administration and governance, and enable scalable operations.
bartshealth.researchdatarequest@nhs.net
https://data.bartshealth.nhs.uk/
