OPEN DATA SCIENCE CONFERENCE
London | Nov. 19 - Nov. 22 2019
Towards Trustable AI for Complex Systems: a Life Science and AIOps Perspective
Xian Yang
Research Fellow
Data Science Institute
Imperial College London
Overview
Towards Trustable AI for Complex Systems
• Background
• Ways to achieve trustable AI
  1. Make data trustable
  2. Make a good understanding of systems
  3. Make AI algorithm trustable
• Conclusion
Background
Complex system: a system of systems
Complex life systems: complex networks of biologically relevant entities.
Example: the signal transduction pathway in a cell
Complex IT systems: all related computer hardware, software, firmware, and data for the communication, transmission, processing, manipulation, storage, or protection of information.
Example: the deployment diagram of a large-scale IT system
Connected, Communicated, Complicated
Source: Wikipedia, GREATOPS
AI for complex life systems and IT systems
Missions of AI
• Understand systems
• Diagnose systems
• Control systems
Medical AI: the use of complex algorithms and software to emulate human cognition in the analysis of complicated medical data.
AIOps: automates and enhances IT operations by
1) analysing big data collected from various tools and devices via analytics and machine learning;
2) automatically spotting and reacting to issues in real time.
Source: Itgsopedia, Riverbed
AI for complex life systems and IT systems
AI based diagnosis
Medical AI: disease diagnosis (Source: ReferralMD)
AIOps: failure diagnosis
[Figure: failure-diagnosis network. Input: failure description (title and engineers’ discussion: Description i-1, Description i, Description i+1), read by a chain of recurrent cells with σ and tanh gates. Input: failure’s characteristics (categorical features), passed through an embedding mapping layer. Output of the classification layer: failure type.]
Elements of AI in Complex systems
AI Applications (grouped by Efficiency, Quality, Cost)
Medical AI
• Efficiency: Surgical robot, AI CT scan reader, AI nurse
• Quality: Pathologic analysis, Efficacy analysis
• Cost: Clinical pathway optimization, Hospital bed management, Disease prevention
AIOps
• Efficiency: AIOps component library, Intelligent prediction, Chatbot
• Quality: Anomaly detection & prediction, Root Cause Analysis
• Cost: Performance Optimization, Defragmentation, Cost analysis, Capacity management
Big Data Platform
• Data Standardization, Data Acquisition, Data Channel
• Data Cleaning, ETL, Meta Data Management
• Offline Computation, Realtime Computation, Feature Engineering
AI algorithms
• Regression analysis, Time series analysis, Causal inference, Dynamical model construction
• Correlation analysis, Differential feature selection, Cluster analysis, Component analysis
Categories of AI algorithms
[Figure: categories of AI algorithms, with axes "Ease of experiments" and "Power of explanation".]
My focus: Trustable AI for complex systems
What's AI’s holdup?
It is not technical. The barrier is trust.
If it is not trustable, then it is not useful.
Working towards trustable AI:
How do we get humans to use our AI technology and rely on it?
Ways to achieve trustable AI
1. Make data trustable
   1.1. Deeper
   1.2. Wider
   1.3. Bigger
2. Make a good understanding of systems
   2.1. A holistic view: Simplification VS. Complication
3. Make AI algorithm trustable
   3.1. Moving beyond correlation
   3.2. Moving beyond AI black-box
1.1. Deeper: Extracting detailed information from the data
Make data trustable
Extract information from raw data
• Use the extracted features as the inputs of the AI model
• Cohort selection for precision medicine
EXAMPLE: Phenotype annotation from electronic health record
Example: Phenotype annotation from electronic health record
Problem:
Disease codes cannot fully represent the medical information in electronic health records (EHRs).
Case study: two patients with the same ICD code have different levels of severity.
Case 1 admission_id=198908, subject_id=28912
brief hospital course the patient was seen in the emergency room at the
request of neurology and the emergency room staff at doctor last name
family she received a dilantin load on arrival to hospital a ct was obtained
which showed subarachnoid blood the patient was neurologically intact
except for some mild confusion about her location stating she was still at
doctor last name family hospital her films were reviewed by the neurosurgery
staff and the decision was made to take her for cerebral angiogram the
angiogram showed an aneurysm at the vertebral basilar junction mm to mm
it was unable to be treated endovascularly an open repair was considered to
be complicated given the location and the patients age dr first name stitle
elected to transfer the patient to hospital to dr last name stitle for further
evaluation and possible intervention
Case 2 admission_id=188170, subject_id=56707
brief hospital course patient was admitted to neurosurgery on for further
management she underwent the above stated procedure please review
dictated operative report for details she had a negative angiogram and was
to intensive care unit in stable condition she had an uncomplicated intensive
care unit course and was transferred to floor in stable condition throughout
her hospital course she remained neurologically stable and intact she
complained of a mild headache that worsens when she sits up and walks
around now dod patient is vss and neurologically stable patient s pain is well
controlled and the patient is tolerating a good oral diet pt s incision is clean
dry and inctact without evidence of infection patient is ambulating without
issues she is set for discharge home in stable condition and will follow up in
month for mri a brain with dr first name stitle
ICD 430: Subarachnoid hemorrhage
Both EHRs were diagnosed as ICD 430 only; phenotypes (including severity) are marked in red on the slide.
Phenotype (HPO) terms can better characterize patients by providing deeper information.
Example: Phenotype annotation from electronic health record
ICD 430: Subarachnoid hemorrhage
Case 2 (admission_id=188170, subject_id=56707): the brief hospital course quoted above.
• Problem:
HPO terms cannot be fully found by keyword search, because of synonyms and implicit information.
Synonyms: subarachnoid blood == subarachnoid hemorrhage
Synonyms: vertebral basilar == vertebrobasilar
Implicit information: terms marked in blue on the slide
• Solution:
Apply AI to annotate phenotypes automatically, using unsupervised learning with no labelled data.
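The synonym problem can be made concrete with a minimal sketch. It assumes a tiny hand-written synonym table (`SYNONYMS` is hypothetical, not taken from HPO or the cited work) and two fragments of the Case 1 note. It only illustrates why exact keyword search misses a phenotype that a synonym-aware search finds.

```python
import re

# Hypothetical, hand-written synonym table for illustration only;
# it is not taken from HPO or from the cited paper.
SYNONYMS = {
    "subarachnoid hemorrhage": ["subarachnoid hemorrhage", "subarachnoid blood"],
    "vertebrobasilar": ["vertebrobasilar", "vertebral basilar"],
}


def keyword_search(note: str, term: str) -> bool:
    """Exact keyword search: True only if the literal term occurs in the note."""
    pattern = r"\b" + re.escape(term.lower()) + r"\b"
    return re.search(pattern, note.lower()) is not None


def synonym_search(note: str, term: str) -> bool:
    """Search for the term or any synonym listed for it in SYNONYMS."""
    return any(keyword_search(note, s) for s in SYNONYMS.get(term, [term]))


# Two fragments of the Case 1 note.
note = (
    "a ct was obtained which showed subarachnoid blood. "
    "the angiogram showed an aneurysm at the vertebral basilar junction"
)

for term in SYNONYMS:
    print(term, keyword_search(note, term), synonym_search(note, term))
```

A lookup table like this still cannot capture implicit information, which is the gap that learned annotation aims to close.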
Example: Phenotype annotation from electronic health record
J. Zhang, X. Zhang, K. Sun, X. Yang, C. Dai, and Y. Guo, “Unsupervised Annotation of Phenotypic Abnormalities via Semantic Latent Representations on Electronic Health Records”, 2019 IEEE
International Conference on Bioinformatics and Biomedicine (IEEE BIBM), 2019
There are two types of data sources:
• EHRs: a collection of EHRs, each consisting of textual notes written by clinicians.
• Phenotypes: a standardized general category of human phenotypic abnormalities provided by HPO. The HPO also provides additional subclasses.
Example: Phenotype annotation from electronic health record
Assumptions
• The semantics of a general phenotype is represented by a prior distribution. The prior distributions of different phenotypes should be ‘distinct’ enough from each other.
• The semantics of an EHR is a composition of the semantics of phenotypes.
[Figure: generative diagram with edges labelled "Represented by", "Generated by" and "Sampled from", linking each phenotype vector to its prior, which is ‘distinct’ enough from the other priors.]
Example: Phenotype annotation from electronic health record
Loss 1: text reconstruction of EHRs.
Loss 2: text reconstruction of the general phenotypic abnormalities.
Loss 3: text reconstruction of the phenotype subclasses.
Loss 4: classification of latent vectors. If latent vectors sampled from different priors can be classified into different classes, the priors are considered ‘distinct’ enough.
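The idea behind Loss 4 can be pictured with a toy experiment. The sketch below is an illustration under stated assumptions, not the cited method: the priors are assumed to be one-dimensional unit-variance Gaussians, and a nearest-mean rule (hypothetical stand-in) replaces the learned classifier. It shows that latents drawn from well-separated priors can be assigned back to their source, while latents from overlapping priors cannot.

```python
import random


def sample_prior(mean, n, rng):
    """Draw n one-dimensional latent values from a unit-variance Gaussian prior."""
    return [rng.gauss(mean, 1.0) for _ in range(n)]


def nearest_mean_accuracy(means, n_per_prior=1000, seed=0):
    """Fraction of sampled latents assigned back to the prior they came from.

    A nearest-mean rule is an assumed stand-in for a trained classifier.
    """
    rng = random.Random(seed)
    correct = 0
    for label, mean in enumerate(means):
        for z in sample_prior(mean, n_per_prior, rng):
            predicted = min(range(len(means)), key=lambda k: abs(z - means[k]))
            correct += predicted == label
    return correct / (n_per_prior * len(means))


# Well-separated priors are easy to tell apart; overlapping ones are not.
separated = nearest_mean_accuracy([0.0, 6.0])
overlapping = nearest_mean_accuracy([0.0, 0.2])
print(f"separated priors:   {separated:.3f}")
print(f"overlapping priors: {overlapping:.3f}")
```

In this picture, a low classification error plays the role of evidence that the priors are ‘distinct’ enough.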
1.2. Wider: Integrating multi-modal data
Make data trustable
Combine data from different modalities
• Provide a comprehensive view of patients
• Make more accurate clinical decisions
EXAMPLE: Pan-cancer classification based on multi-omics analysis
Example: Pan-cancer Classification based on Multi-Omics analysis
Our method:
We combine the variational autoencoder with a classification network to achieve task-oriented feature extraction and multi-class classification.
[Figure: network diagram from Inputs to Outputs.]
X. Zhang, J. Zhang, K. Sun, X. Yang, C. Dai, and Y. Guo, “Integrated Multi-omics Analysis Using Variational Autoencoders: Application to Pan-cancer Classification”, (short paper) 2019 IEEE International
Conference on Bioinformatics and Biomedicine (IEEE BIBM), 2019
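One common way to couple a variational autoencoder with a classifier is to add the classification loss to the usual reconstruction and KL terms. The sketch below shows that arithmetic for a single sample. It is an assumption about the general technique, not the objective of the cited paper: the function names and the weights `beta` and `alpha` are hypothetical, and the mean squared error is an assumed choice of reconstruction loss.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL divergence from a diagonal Gaussian N(mu, exp(log_var)) to N(0, I)."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


def mse(x, x_hat):
    """Mean squared reconstruction error between the input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def cross_entropy(probs, label):
    """Negative log-probability that the classifier assigns to the true class."""
    return -math.log(probs[label])


def joint_loss(x, x_hat, mu, log_var, probs, label, beta=1.0, alpha=1.0):
    """Reconstruction + beta * KL + alpha * classification loss (weights assumed)."""
    return (
        mse(x, x_hat)
        + beta * kl_to_standard_normal(mu, log_var)
        + alpha * cross_entropy(probs, label)
    )


# Toy values standing in for one sample's omics features and network outputs.
loss = joint_loss(
    x=[0.2, 0.8, 0.5],
    x_hat=[0.25, 0.7, 0.5],
    mu=[0.1, -0.2],
    log_var=[0.0, -0.1],
    probs=[0.7, 0.2, 0.1],
    label=0,
)
print(loss)
```

Training on a sum like this pushes the latent features to be useful for the classification task as well as for reconstruction, which is one reading of "task-oriented feature extraction".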
1.3. Bigger: Augmenting data
Make data trustable
Augment data for limited samples
Imbalanced Data -> Augmented Data
Increase the volume of training samples
• Rare disease study
• System failure study
EXAMPLE: Synthetic medical image augmentation
Example: Synthetic medical image augmentation
Frid-Adar, Maayan, et al. "GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification." Neurocomputing 321 (2018): 321-331.
Traditional image augmentation methods:
• Flip: flip images horizontally and vertically
• Rotation: rotate images by angles
• Scale: scale images outward or inward
• Crop: randomly sample a section from the original image
• Translation: move the image along the X or Y direction
• Gaussian Noise: add noise to enhance the learning capability
Advanced image augmentation method:
• GAN
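The traditional methods listed above can be sketched on a tiny grayscale image stored as a list of rows. This is a minimal standard-library illustration, not code from the cited paper; the function names are hypothetical, rotation is restricted to 90 degrees and scaling to integer factors to keep the example short.

```python
import random


def flip_horizontal(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img):
    """Reverse the order of the rows."""
    return img[::-1]


def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def scale_up(img, factor):
    """Scale outward by an integer factor using nearest-neighbour repetition."""
    return [[p for p in row for _ in range(factor)] for row in img for _ in range(factor)]


def random_crop(img, height, width, rng):
    """Randomly sample a height x width section of the image."""
    top = rng.randint(0, len(img) - height)
    left = rng.randint(0, len(img[0]) - width)
    return [row[left:left + width] for row in img[top:top + height]]


def translate(img, dx, dy, fill=0):
    """Move the image dx pixels right and dy pixels down, padding with fill."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = img[y][x]
    return out


def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma to each pixel."""
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in img]


image = [[1, 2], [3, 4]]
rng = random.Random(0)
augmented = [
    flip_horizontal(image),
    flip_vertical(image),
    rotate90(image),
    scale_up(image, 2),
    translate(image, dx=1, dy=0),
    add_gaussian_noise(image, 0.1, rng),
]
```

Each transform yields a new training sample that keeps the label of the original image, which is how augmentation increases the volume of training data.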
2. A holistic view: Investigate a complex system as a whole
Make a good understanding of systems
System of systems: biological system, cloud system
Investigate the system in a holistic view
• Study all anomaly/failure signals across the whole system
2. Simplification VS. Complication: Keep a balance between the two
Make a good understanding of systems
Model networks of complex systems for better understanding
• Effective modeling of entities and inter-connections in large-scale systems
• Balance between simplification and complication
EXAMPLE: Inferring a model for a large-scale biological network
Example: Inferring large-scale biological network
Simplification:
• Easy to model
• Hard to mimic the real behavior of the system
Complication:
• Hard to model
• Good at mimicking the real behavior of the system
Problem: How to strike a balance between simplification and complication?
A. Holehouse, X. Yang, I. Adcock, and Y. Guo, “Developing a novel integrated model of p38
MAPK and glucocorticoid signalling pathways”, Computational Intelligence and Bioinformatics
and Computational Biology (CIBCB), 2012.
Example: Inferring large-scale biological network
Key observations of large-scale networks
• Separation of timescales: sparsity of variations
• Cross-reactivity: combined measurement
Prerequisites of sparse signal recovery
• The signal is sparse in some domain
• A measurement is a weighted linear combination of several points of the signal
My suggestion:
• We can study the complex network under different timescales.
• Within each timescale, only some entities have dynamic changes.
• Thus, we can apply sparse learning to infer a model under each timescale.
• Then, we combine the models obtained from all timescales.
L. Nie, X. Yang, I. Adcock, Z. Xu, and Y. Guo, “Inferring cell-scale signalling
networks via compressive sensing”, PLoS One, vol. 9, no. 4, 2014.
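The two prerequisites of sparse signal recovery can be illustrated with a small sketch for a single timescale. It assumes the measurements are `y = A x` with a sparse `x`, and it uses iterative soft thresholding (ISTA) as the sparse learner. ISTA is a stand-in chosen for brevity, not necessarily the solver of the cited paper; the matrix, the sparsity pattern and the parameters `lam`, `step` and `iters` are hypothetical.

```python
import random


def soft_threshold(v, t):
    """Shrink every entry of v towards zero by t (proximal step of the L1 norm)."""
    return [max(abs(a) - t, 0.0) * (1 if a > 0 else -1) for a in v]


def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def ista(A, y, lam, step, iters=500):
    """Minimise 0.5 * ||y - A x||^2 + lam * ||x||_1 by iterative soft thresholding."""
    x = [0.0] * len(A[0])
    for _ in range(iters):
        residual = [yi - pi for yi, pi in zip(y, matvec(A, x))]
        moved = [xj + step * gj for xj, gj in zip(x, rmatvec(A, residual))]
        x = soft_threshold(moved, step * lam)
    return x


# Toy problem: 8 combined measurements of a 20-entry signal with 2 active entries.
rng = random.Random(0)
m, n = 8, 20
A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
x_true = [0.0] * n
x_true[3], x_true[11] = 2.0, -1.5
y = matvec(A, x_true)

# 1 / ||A||_F^2 is a conservative step size that keeps the iteration stable.
step = 1.0 / sum(a * a for row in A for a in row)
x_hat = ista(A, y, lam=0.05, step=step, iters=2000)
```

How closely `x_hat` matches `x_true` depends on the measurement matrix, `lam` and the number of iterations. Following the suggestion above, one such model would be inferred per timescale and the results combined afterwards.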
3.1. Moving beyond correlation: Explore towards causation
Make AI algorithm trustable
A signal’s predictive power does not necessarily imply that the signal is actually related to, or explains, the phenomenon being predicted.
For an AI model, moving from correlation to causation is especially important for understanding:
• under what conditions it may fail;
• how long we can expect it to be predictive;
• how widely applicable it may be.
APPLICATIONS
• Deriving functional connectivity from brain fMRI data
• Root cause diagnosis for system failures
From correlation to causation
Pearson correlation: the covariance of the two variables divided by the product of their standard deviations.
Partial correlation: measures the degree of association between two random variables, with the effect of a set of controlling random variables removed.
Minimum partial correlation: the minimum of all absolute values of partial correlations obtained by controlling on all possible subsets of other nodes. It removes indirect relationships.
PC algorithm: fast computation of minimum partial correlation. The choice of a hyperparameter, the significance threshold, greatly influences the results.
Elastic PC algorithm:
• Automatically increases the significance threshold within a given time limit to approach the minimum partial correlation as closely as possible.
• Avoids repeating partial correlations already computed with a lower significance threshold.
L. Nie, X. Yang, P. M. Matthews, Z. Xu, and Y. Guo, “Inferring functional connectivity in fMRI using minimum partial correlation”, International Journal of Automation and Computing, 2017.
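The three correlation measures can be sketched directly from their definitions. The code below is a brute-force illustration, not the PC or elastic PC algorithm: it enumerates every conditioning subset, which is exponential in the number of nodes. The function names and the toy chain X <- Z -> Y are hypothetical, and the recursion assumes no pair of variables is perfectly correlated.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def partial_corr(data, i, j, cond):
    """Correlation of variables i and j with the variables in cond controlled for.

    Uses the standard recursive formula, removing one controlling variable at a time.
    """
    if not cond:
        return pearson(data[i], data[j])
    k, rest = cond[0], cond[1:]
    r_ij = partial_corr(data, i, j, rest)
    r_ik = partial_corr(data, i, k, rest)
    r_jk = partial_corr(data, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


def min_partial_corr(data, i, j):
    """Minimum absolute partial correlation over all subsets of the other nodes."""
    others = [k for k in range(len(data)) if k not in (i, j)]
    return min(
        abs(partial_corr(data, i, j, subset))
        for size in range(len(others) + 1)
        for subset in combinations(others, size)
    )


# Toy data: x and y are related only through the common driver z.
z = [1.0, 1.0, -1.0, -1.0]
x = [2.0, 0.0, 0.0, -2.0]   # z plus a component orthogonal to z
y = [2.0, 0.0, -2.0, 0.0]   # z plus a different orthogonal component
data = [x, y, z]

print(pearson(x, y))                 # plain correlation between x and y
print(min_partial_corr(data, 0, 1))  # association left after controlling for z
```

In this toy chain the plain correlation between x and y is clearly non-zero, while the minimum partial correlation vanishes once z is controlled for, which is the sense in which the measure removes indirect relationships.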
3.2. Moving beyond AI black-box: Explore towards explainable AI
Make AI algorithm trustable
Explainability is the process of giving explanations to humans.
Why do we need explainable AI?
• Demands from industry and society
• Desires of the human brain
My future research direction: towards AI algorithm explainability
Black-box -> Explainable
AUTO FEATURE ENGINEERING
COMBINATION OF PHYSICS/MATH/TRADITIONAL ML MODELS WITH ADVANCED DL
LEARNING LAWS FROM DATA RATHER THAN CURVE FITTING
Auto feature engineering: automatically generate explainable features in the model construction process.
Features: height, weight -> Label: health
Features: height, weight, BMI = w/h^2 -> Label: health
Conclusion
There is still a long way to go before fully trustable AI. We must work on it now!
The future of AI for complex systems depends on trust. To achieve this, we need to work beyond the algorithm aspect and work on
1) trustable data,
2) a good understanding of systems,
3) trustable AI algorithms.
Moving towards causation is crucial for making AI algorithms trustable.
Thanks
Xian Yang
Towards Trustable AI for Complex Systems: a Life Science and AIOps Perspective
Q&A

Towards Trustable AI for Complex Systems

  • 2. a Life Science and AIOps Perspective Towards Trustable AI for Complex Systems Research Fellow Data Science Institute Imperial College London Xian Yang
  • 3. Conclusion Towards Trustable AI for Complex Systems Make data trustable Make a good understanding of systems Make AI algorithm trustable Background Overview Ways to achieve trustable AI
  • 5. Complex system: a system of systems Complex life systems Source: Wikipedia, GREATOPS The signal transduction pathway in a cell The deployment diagram of a large-scale IT systemcomplex networks of biologically relevant entities all related computer hardware, software, firmware, and data for the communication, transmission, processing, manipulation, storage, or protection of information. Complex IT systems Connected Communicated Complicated
  • 6. Medical AI AIOps Source: Itgsopedia, Riverbed Missions of AI Understand systems Diagnose systems Control systems use of complex algorithms and software to emulate human cognition in the analysis of complicated medical data. automate and enhance IT operations by 1) analyse big data collected from various tools and devices via analytics and machine learning; 2) automatically spot and react to issues in real time. AI for complex life systems and IT systems
  • 7. AI for complex life systems and IT systems AI based diagnosis Embedding mapping layer Categorical feature Classification Layer ......Title σ σ tanh x σ x x + tanh σ σ tanh x σ x x + tanh σ σ tanh x σ x x + tanh Description i-1 Description i Description i+1 Output: Failure type Input: Failure description Input: Engineers’ discussion Input: Failure’s characteristics Disease diagnosis Failure diagnosis Source: ReferralMD Medical AI AIOps
  • 8. Elements of AI in Complex systems Surgical robot, AI CT scan reader, AI nurse Pathologic analysis, Efficacy analysis Clinical pathway optimization, Hospital bed management, Disease prevention AIOps component library, Intelligent prediction, Chatbot Anomaly detection & prediction, Root Cause Analysis Performance Optimization, Defragmentati on, Cost analysis, Capacity management Efficiency Quality Cost Data Standardizatio n Data Acquisition Data Channel Data Cleaning, ETL, Meta Data Management, Offline Computation Realtime Computation Feature Engineering Efficiency Quality Cost Medical AI AIOps AI Applications Big Data Platform Regression analysis Time series analysis Causal inference Dynamical model construction Correlation analysis Differential feature selection Cluster analysis Component analysis AI algorithms
  • 9. Categories of AI algorithms Ease of experiments Power of explanation
  • 10. My focus: Trustable AI for complex systems What's AI’s holdup? It is not technical The barrier is the trust aspects. If it is not trustable, then it is not useful. Working towards trustable AI. How to get humans to use our AI technology and rely on it? My focus
  • 11. Ways to achieve trustable AI 1.1. Deeper 1.2. Wider 1.3. Bigger 2.1. A holistic view: Simplification VS. Complication 3.1. Moving beyond correlation 3.2. Moving beyond AI black-box Make data trustable Make a good understanding of systems Make AI algorithm trustable 1. 2. 3.
  • 12. 1. 2. 3.Make data trustable Make a good understanding of systems Make AI algorithm trustable Deeper: Extracting detailed information from the data1.1 Make data trustable Phenotype annotation from electronic health record EXAMPLE Cohort selection for precision medicine Use as the inputs of the AI model extracted features Extract information from raw data
  • 13. Problem: Disease code cannot fully represent medical information in electronic health records (EHRs). Case study: two patients with the same ICD code have different level of severity. Example: Phenotype annotation from electronic health record Case 1 admission_id=198908, subject_id=28912 brief hospital course the patient was seen in the emergency room at the request of neurology and the emergency room staff at doctor last name family she received a dilantin load on arrival to hospital a ct was obtained which showed subarachnoid blood the patient was neurologically intact except for some mild confusion about her location stating she was still at doctor last name family hospital her films were reviewed by the neurosurgery staff and the decision was made to take her for cerebral angiogram the angiogram showed an aneurysm at the vertebral basilar junction mm to mm it was unable to be treated endovascularly an open repair was considered to be complicated given the location and the patients age dr first name stitle elected to transfer the patient to hospital to dr last name stitle for further evaluation and possible intervention Case 2 admission_id=188170, subject_id=56707 brief hospital course patient was admitted to neurosurgery on for further management she underwent the above stated procedure please review dictated operative report for details she had a negative angiogram and was to intensive care unit in stable condition she had an uncomplicated intensive care unit course and was transferred to floor in stable condition throughout her hospital course she remained neurologically stable and intact she complained of a mild headache that worsens when she sits up and walks around now dod patient is vss and neurologically stable patient s pain is well controlled and the patient is tolerating a good oral diet pt s incision is clean dry and inctact without evidence of infection patient is ambulating without issues she is set for discharge home in stable condition 
and will follow up in month for mri a brain with dr first name stitle ICD 430: Subarachnoid hemorrhage Find EHRs that diagnosed as 430 ONLY, Phenotypes (including severity) in red Phenotype (HPO) terms can better characterize patients by providing deeper information .
  • 14. admission_id=188170, subject_id=56707 "brief hospital course patient was admitted to neurosurgery on for further management she underwent the above stated procedure please review dictated operative report for details she had a negative angiogram and was to intensive care unit in stable condition she had an uncomplicated intensive care unit course and was transferred to floor in stable condition throughout her hospital course she remained neurologically stable and intact she complained of a mild headache that worsens when she sits up and walks around now dod patient is vss and neurologically stable patient s pain is well controlled and the patient is tolerating a good oral diet pt s incision is clean dry and inctact without evidence of infection patient is ambulating without issues she is set for discharge home in stable condition and will follow up in month for mri a brain with dr first name stitle" Example: Phenotype annotation from electronic health record • Problem: HPO terms cannot be fully found using the keyword search method: synonyms and implicit information • Solution: apply AI to do automatic phenotype annotation unsupervised learning with no labelled data ICD 430: Subarachnoid hemorrhage Synonyms: subarachnoid blood == subarachnoid hemorrhage Synonyms: vertebral basilar == vertebrobasilar Implicit information: terms in blue
  • 15. Example: Phenotype annotation from electronic health records. There are two types of data sources: (1) a collection of EHRs, where each EHR consists of textual notes written by clinicians; (2) a standardized set of general categories of human phenotypic abnormalities provided by HPO. The HPO also provides additional subclasses. Source: J. Zhang, X. Zhang, K. Sun, X. Yang, C. Dai, and Y. Guo, “Unsupervised Annotation of Phenotypic Abnormalities via Semantic Latent Representations on Electronic Health Records”, 2019 IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM), 2019.
  • 16. Example: Phenotype annotation from electronic health records. Assumptions: (1) The semantics of a general phenotype is represented by a prior distribution. (2) The prior distributions of the phenotypes should be ‘distinct’ enough from each other. (3) The semantics of an EHR is a composition of the semantics of phenotypes. [Diagram labels: phenotype, vector, prior, other priors; relations: represented by, generated by, sampled from, ‘distinct’ enough.] Source: J. Zhang, X. Zhang, K. Sun, X. Yang, C. Dai, and Y. Guo, “Unsupervised Annotation of Phenotypic Abnormalities via Semantic Latent Representations on Electronic Health Records”, 2019 IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM), 2019.
  • 17. Example: Phenotype annotation from electronic health records. Loss 1: text reconstruction of EHRs. Loss 2: text reconstruction of the general phenotypic abnormalities. Loss 3: text reconstruction of the phenotype subclasses. Loss 4: if the latent vectors sampled from different priors can be classified into different classes, the priors are considered ‘distinct’ enough. Source: J. Zhang, X. Zhang, K. Sun, X. Yang, C. Dai, and Y. Guo, “Unsupervised Annotation of Phenotypic Abnormalities via Semantic Latent Representations on Electronic Health Records”, 2019 IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM), 2019.
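The slide lists four losses but does not say how they are combined. One common choice, given here only as an assumption, is a weighted sum. The weights \lambda_1, ..., \lambda_4 and the subscript names are hypothetical and do not come from the slides.

```latex
\mathcal{L}
  = \lambda_1 \,\mathcal{L}_{\text{EHR}}
  + \lambda_2 \,\mathcal{L}_{\text{general}}
  + \lambda_3 \,\mathcal{L}_{\text{subclass}}
  + \lambda_4 \,\mathcal{L}_{\text{distinct}},
\qquad \lambda_1, \lambda_2, \lambda_3, \lambda_4 \ge 0
```

Here \mathcal{L}_{\text{EHR}}, \mathcal{L}_{\text{general}} and \mathcal{L}_{\text{subclass}} stand for the three reconstruction terms (Loss 1 to 3), and \mathcal{L}_{\text{distinct}} stands for the classification term that keeps the priors apart (Loss 4).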
  • 18. 1.2 Make data trustable. Wider: integrating multi-modal data. Combine data from different modalities to provide a comprehensive view of patients and make more accurate clinical decisions. EXAMPLE: Pan-cancer classification based on multi-omics analysis.
  • 19. Example: Pan-cancer classification based on multi-omics analysis. Our method: we combine the variational autoencoder with a classification network to achieve task-oriented feature extraction and multi-class classification. [Diagram labels: inputs, outputs.] Source: X. Zhang, J. Zhang, K. Sun, X. Yang, C. Dai, and Y. Guo, “Integrated Multi-omics Analysis Using Variational Autoencoders: Application to Pan-cancer Classification”, (short paper) 2019 IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM), 2019.
  • 20. 1.3 Make data trustable. Bigger: augmenting data. Augment data for limited samples; increase the volume of training samples: • Rare disease study • System failure study. [Diagram labels: imbalanced data, augmented data.] EXAMPLE: Synthetic medical image augmentation.
  • 21. Example: Synthetic medical image augmentation. Traditional image augmentation methods: • Flip: flip images horizontally and vertically • Rotation: rotate images by angles • Scale: scale images outward or inward • Crop: randomly sample a section from the original image • Translation: move the image along the X or Y direction • Gaussian noise: add noise to enhance the learning capability. Advanced image augmentation method: • GAN. Source: Frid-Adar, Maayan, et al. "GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification." Neurocomputing 321 (2018): 321-331.
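The traditional augmentation methods listed on this slide can be sketched with NumPy alone. This is a minimal illustration on a random 2-D array standing in for a grayscale image; the function name `augment`, the crop fraction, the shift of 2 pixels and the noise level are all assumptions. Scaling is left out because it needs an interpolation routine, and the GAN approach is beyond a short sketch.

```python
import numpy as np

def augment(image, rng):
    """Return a dict of simple augmented variants of a 2-D grayscale image."""
    h, w = image.shape
    crop_h, crop_w = 3 * h // 4, 3 * w // 4          # keep 3/4 of each side
    top = rng.integers(0, h - crop_h + 1)            # random crop origin
    left = rng.integers(0, w - crop_w + 1)
    return {
        "flip_horizontal": np.fliplr(image),
        "flip_vertical": np.flipud(image),
        "rotate_90": np.rot90(image),
        "crop": image[top:top + crop_h, left:left + crop_w],
        # np.roll wraps pixels around the border; a real pipeline would pad.
        "translate_x": np.roll(image, shift=2, axis=1),
        "gaussian_noise": image + rng.normal(0.0, 0.05, size=image.shape),
    }

rng = np.random.default_rng(0)
image = rng.random((32, 32))       # stand-in for a real medical image
variants = augment(image, rng)
```

Each variant can be added to the training set with the label of the original image, which is how these transforms increase the volume of training samples.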
  • 22. 2 Make a good understanding of systems. A holistic view: investigate a complex system as a whole. Biological systems and cloud systems are systems of systems. Investigate the system in a holistic view; study all anomaly/failure signals across the whole system.
  • 23. 2 Make a good understanding of systems. Simplification vs. complication: keep a balance between the two. Model networks of complex systems for better understanding; effective modeling of entities and interconnections in large-scale systems. EXAMPLE: Inferring a model for a large-scale biological network.
  • 24. Example: Inferring large-scale biological network. Simplification: easy to model, but hard to mimic the real behavior of the system. Complication: hard to model, but good at mimicking the real behavior of the system. Problem: how to strike a balance between simplification and complication? Source: A. Holehouse, X. Yang, I. Adcock, and Y. Guo, “Developing a novel integrated model of p38 MAPK and glucocorticoid signalling pathways”, Computational Intelligence and Bioinformatics and Computational Biology (CIBCB), 2012.
  • 25. Example: Inferring large-scale biological network. Key observations of large-scale networks: • Separation of timescales → sparsity of variations • Cross-reactivity → combined measurement. Prerequisites of sparse signal recovery: • The signal is sparse in some domain • A measurement is a weighted linear combination of several points of the signal. My suggestion: • We can study the complex network under different timescales. • Within each timescale, only some entities have dynamic changes. • Thus, we can apply sparse learning to infer a model under each timescale. • Then, we combine the models obtained from all timescales. Source: L. Nie, X. Yang, I. Adcock, Z. Xu, and Y. Guo, “Inferring cell-scale signalling networks via compressive sensing”, PLoS One, vol. 9, no. 4, 2014.
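The two prerequisites of sparse signal recovery on this slide (a sparse signal, and measurements that are weighted linear combinations of it) can be illustrated with a small synthetic sketch. It uses iterative soft-thresholding (ISTA) for an L1-regularized least-squares problem, which is a generic sparse-recovery technique chosen for illustration and not necessarily the algorithm of the cited paper. All sizes, the random measurement matrix `A` and the parameters `lam` and `n_iter` are assumptions.

```python
import numpy as np

def ista(A, y, lam=0.05, n_iter=5000):
    """Iterative soft-thresholding for min_x 0.5*||Ax - y||^2 + lam*||x||_1."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2    # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - step * (A.T @ (A @ x - y))    # gradient step on the least-squares term
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return x

rng = np.random.default_rng(0)
n_entities, n_measurements, n_active = 50, 30, 3

# Sparse signal: only a few entities change within one timescale.
x_true = np.zeros(n_entities)
active = rng.choice(n_entities, size=n_active, replace=False)
x_true[active] = rng.choice([-1.0, 1.0], size=n_active) * rng.uniform(1.0, 2.0, size=n_active)

# Each measurement is a weighted linear combination of the entities.
A = rng.normal(size=(n_measurements, n_entities)) / np.sqrt(n_measurements)
y = A @ x_true

x_hat = ista(A, y)
recovered = np.flatnonzero(np.abs(x_hat) > 0.5)   # indices judged to be active
```

In this toy setting there are fewer measurements (30) than unknowns (50), yet the active entities can still be identified because the signal is sparse.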
  • 26. 3.1 Make AI algorithm trustable. Moving beyond correlation: explore towards causation. A signal’s predictive power does not necessarily imply that the signal is actually related to or explains the phenomena being predicted. For an AI model, moving from correlation to causation is especially important for understanding: under what conditions it may fail, how long we can expect it to be predictive, and how widely applicable it may be. APPLICATIONS: • Deriving functional connectivity from brain fMRI data • Root cause diagnosis for system failures.
  • 27. Fast computation of minimum partial correlation (from correlation towards causation). Pearson correlation: the covariance of the two variables divided by the product of their standard deviations. Partial correlation: measures the degree of association between two random variables, with the effect of a set of controlling random variables removed; this removes indirect relationships. PC algorithm: the choice of a hyperparameter, the significance threshold, greatly influences the results. Minimum partial correlation: the minimum of all absolute values of partial correlations obtained by controlling on all possible subsets of other nodes. Elastic PC algorithm: automatically increases the significance threshold within a given time limit to maximally approach the minimum partial correlation, and avoids repeating partial correlations already computed with a lower significance threshold. Source: L. Nie, X. Yang, P. M. Matthews, Z. Xu, and Y. Guo, “Inferring functional connectivity in fMRI using minimum partial correlation”, International Journal of Automation and Computing, 2017.
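The definitions on this slide can be made concrete with a small sketch on synthetic data. It computes Pearson correlation, partial correlation (via regression residuals) and a brute-force minimum partial correlation for a three-variable chain. The brute-force search over all subsets is exponential and is shown only to illustrate the definition; it is not the elastic PC algorithm from the cited paper. Variable names and the synthetic chain are assumptions.

```python
from itertools import combinations
import numpy as np

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def partial_correlation(x, y, controls):
    """Pearson correlation of x and y after regressing out the control variables."""
    Z = np.column_stack([np.ones(len(x)), controls])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return pearson(rx, ry)

def minimum_partial_correlation(data, i, j):
    """Smallest |partial correlation| of columns i, j over all subsets of the other columns."""
    others = [k for k in range(data.shape[1]) if k not in (i, j)]
    best = abs(pearson(data[:, i], data[:, j]))       # empty conditioning set
    for size in range(1, len(others) + 1):
        for subset in combinations(others, size):
            r = partial_correlation(data[:, i], data[:, j], data[:, list(subset)])
            best = min(best, abs(r))
    return best

# Synthetic chain a -> b -> c: a and c are related only indirectly, through b.
rng = np.random.default_rng(0)
a = rng.normal(size=5000)
b = a + 0.5 * rng.normal(size=5000)
c = b + 0.5 * rng.normal(size=5000)
data = np.column_stack([a, b, c])

r_ac = pearson(a, c)                                  # large: indirect link shows up
r_ac_given_b = partial_correlation(a, c, b)           # near zero once b is controlled
mpc_ac = minimum_partial_correlation(data, 0, 2)      # indirect edge is suppressed
mpc_ab = minimum_partial_correlation(data, 0, 1)      # direct edge survives
```

The plain correlation between `a` and `c` is strong even though there is no direct link, while the minimum partial correlation keeps the direct pair `a`, `b` and suppresses the indirect pair `a`, `c`.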
  • 28. 3.2 Make AI algorithm trustable. Moving beyond the AI black box: explore towards explainable AI. Explainability is the process of giving explanations to humans. Why do we need explainable AI? Demands from industry and society; desires of the human brain. My future research direction: towards AI algorithm explainability.
  • 29. Towards AI algorithm explainability: • Auto feature engineering • Combination of physics/math/traditional ML models with advanced DL • Learning laws from data rather than curve fitting. Auto feature engineering: automatically generate explainable features in the model construction process. Features: height, weight -> Label: health. Features: height, weight, BMI = w/h² -> Label: health.
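The BMI example on this slide amounts to adding one derived, human-readable feature before model fitting. A minimal sketch, assuming records with hypothetical field names `height_m` and `weight_kg` and made-up values:

```python
def add_bmi(records):
    """Return copies of the records with an extra 'bmi' feature (kg / m^2)."""
    return [{**r, "bmi": r["weight_kg"] / r["height_m"] ** 2} for r in records]

# Hypothetical records; units are metres and kilograms.
records = [
    {"height_m": 1.80, "weight_kg": 81.0},
    {"height_m": 1.60, "weight_kg": 51.2},
]
augmented = add_bmi(records)
```

A model trained on `bmi` can be explained in terms a clinician already uses, which is harder when it only sees raw height and weight. Automatic feature engineering would aim to discover such combinations rather than have them written by hand as here.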
  • 30. Conclusion. There is still a long way to go before fully trustable AI. We must work on it now! The future of AI for complex systems depends on trust. To achieve this, we need to go beyond the algorithm aspect and work on 1) trustable data, 2) a good understanding of systems, and 3) trustable AI algorithms. Moving towards causation is crucial for making AI algorithms trustable.
  • 31. Thanks. Xian Yang. Towards Trustable AI for Complex Systems: a Life Science and AIOps Perspective
  • 32. Q&A

Editor's Notes

  • #6: Source: Wikipedia https://en.wikipedia.org/wiki/File:Signal_transduction_pathways.png#filelinks ; GREATOPS http://www.gaowei.vip/m/Library/detail?no=94991143
  • #7: Source: Riverbed https://www.riverbed.com/faq/what-is-aiops.html
  • #8: Source: ReferralMD https://getreferralmd.com/2018/12/big-data-and-ai-in-healthcare-marketing/