Data Quality and
Uncertainty Visualization
UC San Diego
COGS 220
Winter Quarter 2006
Barry Demchak
Immediate Motivation: Wiisard
 A joint project of Veterans Administration and
UC San Diego, funded by the National Library
of Medicine
 Mass casualty triage and treatment
 Enter patient information via PDAs
 Patient information summarized on tablet PCs
 Command/control for supervisors and incident
command personnel
 Tied together using 802.11b and store-and-forward
database access
Wiisard – Explosion with Pesticides
Wiisard – Network Deployment
Wiisard – Tablet Display
Wiisard – Command/Control
Wiisard – The Problem
 What if the network becomes partitioned?
 Tablet display shows out-of-date patient
information
 Summary displays are out of date, too
 How does this lead to bad decisions?
 Supervisors may mis-deploy doctors
 Incident command may mis-deploy resources
People may die
DOD Example
 Sensor-to-shooter (STS) Networks – Patrick
Driscoll (USMA), June 2002
DOD Example
DOD Example
 “… our first attempt to get the military
community to realize that there is a degree of
uncertainty involved in (digital) information
systems that cannot be engineered out of the
system.”
 “Ultimately, our concern was an awareness
issue (for the decision maker) …”
 “… woman at MITRE had proposed a system
of tagging intelligence starting at the source
in a way that would reflect the uncertainty of
the data being put into the intel database.”
The Problem
 How to visualize the uncertainty in data so
that humans can exercise judgment in
making the best decision
 Accounting for uncertainty is not the same
thing as visualizing uncertainty
What Labs are Involved
 MIT Sloan School of Management
 Richard Wang (Data Quality)
 Penn State University
 Alan MacEachren (GIS)
 University of Maine
 Kate Beard-Tisdale (GIS)
 University of California, Santa Cruz
 Alex Pang (Scientific Visualization)
 University of Arkansas, Little Rock
 Master of Science in Information Quality
What Conferences are There?
 MIT Information Quality (IQatMIT)
 ACM SIGMOD Workshop on Information Quality in
 ACM SIGKDD (Knowledge Discovery and Data Mining)
 MIT International Conference on Information Quality
Semiotic Interpretation
Definition of Data Quality
 From Wand & Wang:
Metrics
 Timeliness
 How up to date relative to intended purpose
 Ballou et al:
 Timeliness = Max(0, 1 - (currency/volatility))
 Currency = delivery_time – input_time
 Volatility = length of time data remains valid
 Apply sensitivity factor “s”: Timeliness ^ s
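The timeliness metric above can be sketched in a few lines of Python. This is a minimal illustration, not code from Wiisard: the function name, the argument names, and the choice of seconds as the time unit are assumptions made for the example.

```python
def timeliness(input_time, delivery_time, volatility, s=1.0):
    """Timeliness in [0, 1] following the slide's formula.

    input_time    -- when the value was recorded (seconds; assumed unit)
    delivery_time -- when the value reaches the user (same unit)
    volatility    -- how long the value remains valid (same unit, > 0)
    s             -- sensitivity exponent; s > 1 penalizes age more sharply
    """
    if volatility <= 0:
        raise ValueError("volatility must be positive")
    currency = delivery_time - input_time
    # Clamp at 0 so data older than its validity window scores 0, not negative.
    base = max(0.0, 1.0 - currency / volatility)
    return base ** s


# A reading entered 120 s ago that stays valid for 600 s scores about 0.8;
# with s = 2 the same reading scores about 0.64.
fresh = timeliness(input_time=0, delivery_time=120, volatility=600)
strict = timeliness(input_time=0, delivery_time=120, volatility=600, s=2)
# A reading older than its validity window scores 0.
stale = timeliness(input_time=0, delivery_time=900, volatility=600)
```

Clamping at zero keeps the score meaningful when a partitioned network delays delivery past the validity window, and the exponent lets an application decide how quickly aging data should lose value.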
Interplay with Uncertainty
 Metrics are application dependent
 Metrics are data dependent
 Metrics are user dependent
 Question: If a metric describes an individual
data element, what is the effect of
aggregating data elements having
uncertainty??
GIS Examples – NCGIA
 Sample point locations
as overlay
 Sample points and
corresponding contours
using naïve shading
GIS Examples – NCGIA
 Gray shading
uncertainty surface
captures distance
function used by
interpolation method
 Uncertainty encoded in
contour line widths
GIS Techniques
 Fill Clarity
 Resolution
 Contour Crispness
 Fog
Merging Data and Uncertainty
 Risk and
uncertainty
separately
 Risk and
uncertainty
combined
Basic Data Examples
 Errors
Basic Data Examples
 Errors
Basic Data Examples
 Ambiguation
Basic Data Examples
 Ambiguation
Photo Realistic
Uncertainty Vector Glyphs
Uncertainty Vector Glyphs
Hue as Uncertainty
 Without
 With
Texture as Uncertainty
[Figure labels: Raw, Transparent, Points, Certainty, Opaque, Lines]
Back to Wiisard
 Data Confidence
[Equation: C = ∑ over devices x of R(x) • (a term built from 1, α, curtime − posttime(x), and pingtime(x)); the exact layout did not survive extraction]
 x is a device, α is a decay constant, R(x) is a
weighting for device x in the calculation
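A weighted, time-decayed confidence of this kind can be sketched as follows. The exact form of the slide's equation is not certain, so this example assumes a reciprocal decay, 1 / (1 + α · age), where age is curtime − posttime(x). It leaves out pingtime(x) because its role is not established here. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DeviceReport:
    weight: float     # R(x): weighting for device x (assumed to sum to 1 across devices)
    post_time: float  # posttime(x): when device x last posted data, in seconds


def data_confidence(reports, cur_time, alpha):
    """Sum of per-device weights, each discounted by the age of its last post.

    Assumed decay form: 1 / (1 + alpha * age). This is an illustration only,
    not a reconstruction of the slide's equation.
    """
    total = 0.0
    for r in reports:
        age = max(0.0, cur_time - r.post_time)  # guard against clock skew
        total += r.weight * (1.0 / (1.0 + alpha * age))
    return total


# Two equally weighted devices: one posted just now, one posted 10 s ago.
reports = [DeviceReport(weight=0.5, post_time=100.0),
           DeviceReport(weight=0.5, post_time=90.0)]
confidence = data_confidence(reports, cur_time=100.0, alpha=0.1)
```

Under these assumptions a device that has just posted contributes its full weight, and one cut off by a network partition contributes less the longer it stays silent.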
Back to Wiisard
 Individual data (annotation)
 Aggregate data (annotated/integrated)
Back to Wiisard
 Annotated
Back to Wiisard
 Integrated
Research Questions
 What are the dimensions of metrics relevant
for determining data quality for medical
providers in a mass casualty context?
 What kind of visualization best conveys the
use suitability for various kinds of data?
 Single data points
 Streaming bioinformation
 Aggregated information
Research Questions
 What kinds of visualizations are best suited to
field personnel?
 Non-IS frenzied technicians
 High glare, small footprint screens
 Low processing power
 What kinds of visualizations are best suited to
incident command?
 Seasoned experts
 Large, high density displays
 Highly connected with high data processing
Conclusion
 Data Quality and Uncertainty Visualization
are like the weather …
… everyone talks about it, but no one
does anything about it

Editor's Notes

  • #11: Comp Sci folks worry about bits being lost and transmission checking, principally in the data assurance domain. We were thinking more abstractly about the information products floating around such systems. What happens to data uncertainty as data gets integrated/aggregated over and over again? Is there a point at which it becomes useless??
  • #19: National Center for Geographic Information and Analysis: Visualization of the Quality of Spatial Information, May 1994. Visualize interpolation uncertainty in GIS measurements of some phenomenon.
  • #21: Visualizing Uncertain Information – Alan MacEachren (Penn State), 1992
  • #22: Map of health risks due to air pollutants.
  • #23: From Visualizing Data with Bounded Uncertainty … Jock Mackinlay
  • #24: From Visualizing Data with Bounded Uncertainty … Jock Mackinlay
  • #25: From Visualizing Data with Bounded Uncertainty … Jock Mackinlay
  • #26: From Visualizing Data with Bounded Uncertainty … Jock Mackinlay
  • #27: From Visualizing Uncertainty for Improved Decision Making by Griethe and Schumann, University of Rostock. Uncertainty about the true architecture of the medieval Kaiserpfalz is encoded as transparency.
  • #28: Craig M. Wittenbrink, Alex T. Pang, and Suresh K. Lodha. "Glyphs for Visualizing Uncertainty in Vector Fields". In IEEE Transactions on Visualization and Computer Graphics, vol. 2, no. 3, pp. 266-279, September 1996. Shows suitability of various types of glyphs for representing angularity and magnitude.
  • #29: From Visualizing Uncertainty for Improved Decision Making by Griethe and Schumann, University of Rostock. Monterey Bay … magnitude and directional uncertainty in a flow vector field
  • #30: Photos are 3 different resolutions of a CAT scan of a cadaver (highest to lowest). From Uncertainty Visualization Methods in Isosurface Rendering by Rhodes, Laramee, et al. (University of New Hampshire and VRVis in Austria) … published in Eurographics 2003