1
Information Visualization:
Principles, Promise, and Pragmatics
Marti Hearst
CHI 2003 Tutorial
2
Agenda
• Introduction
• Visual Principles
• What Works?
• Visualization in Analysis & Problem Solving
• Visualizing Documents & Search
• Comparing Visualization Techniques
• Design Exercise
• Wrap-Up
3
Introduction
• Goals of Information Visualization
• Case Study: The Journey of the TreeMap
• Key Questions
4
What is Information Visualization?
Visualize: to form a mental image or vision of …
Visualize: to imagine or remember as if actually
seeing.
American Heritage dictionary, Concise Oxford dictionary
5
What is Information Visualization?
“Transformation of the symbolic into the geometric”
(McCormick et al., 1987)
“... finding the artificial memory that best
supports our natural means of perception.''
(Bertin, 1983)
The depiction of information using spatial or graphical
representations, to facilitate comparison, pattern
recognition, change detection, and other cognitive skills
by making use of the visual system.
6
Information Visualization
• Problem:
– HUGE Datasets: How to understand them?
• Solution
– Take better advantage of human perceptual system
– Convert information into a graphical representation.
• Issues
– How to convert abstract information into graphical form?
– Do visualizations do a better job than other methods?
7
Visualization Success Stories
Images from yahoo.com
8
The Power of Visualization
Image from mapquest.com
1. Start out going Southwest on ELLSWORTH AVE
Towards BROADWAY by turning right.
2. Turn RIGHT onto BROADWAY.
3. Turn RIGHT onto QUINCY ST.
4. Turn LEFT onto CAMBRIDGE ST.
5. Turn SLIGHT RIGHT onto MASSACHUSETTS AVE.
6. Turn RIGHT onto RUSSELL ST.
9
The Power of Visualization
Line drawing tool by Maneesh Agrawala http://graphics.stanford.edu/~maneesh/
10
Visualization Success Story
Mystery: what is causing a cholera
epidemic in London in 1854?
11
Visualization Success Story
From Visual Explanations by Edward Tufte, Graphics Press, 1997
Illustration of John Snow’s deduction that a cholera epidemic was caused by a bad water pump, circa 1854. Horizontal lines indicate locations of deaths.
12
Visualization Success Story
From Visual Explanations by Edward Tufte, Graphics Press, 1997
Illustration of John Snow’s deduction that a cholera epidemic was caused by a bad water pump, circa 1854. Horizontal lines indicate locations of deaths.
13
Purposes of Information Visualization
To help:
Explore
Calculate
Communicate
Decorate
14
Two Different Primary Goals:
Two Different Types of Viz
• Explore/Calculate
– Analyze
– Reason about information
• Communicate
– Explain
– Make decisions
– Reason about information
15
Goals of Information Visualization
More specifically, visualization should:
– Make large datasets coherent
(Present huge amounts of information compactly)
– Present information from various viewpoints
– Present information at several levels of detail
(from overviews to fine structure)
– Support visual comparisons
– Tell stories about the data
16
Why Visualization?
• Use the eye for pattern recognition; people are good at
– scanning
– recognizing
– remembering images
• Graphical elements facilitate comparisons via
– length
– shape
– orientation
– texture
• Animation shows changes across time
• Color helps make distinctions
• Aesthetics make the process appealing
18
The Need for Critical Analysis
• We see many creative ideas, but they often fail in
practice
• The hard part: how to apply it judiciously
– Inventors usually do not accurately predict how their
invention will be used
• This tutorial will emphasize
– Getting past the coolness factor
– Examining usability studies
19
Case Study:
The Journey of the TreeMap
• The TreeMap (Johnson & Shneiderman ‘91)
• Idea:
– Show a hierarchy as a 2D layout
– Fill up the space with rectangles representing objects
– Size on screen indicates relative size of underlying
objects.
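A minimal Python sketch of this slice-and-dice idea (alternate the split direction by depth, with each rectangle's area proportional to the size of the objects below it). The hierarchy here is made up, and this illustrates the general technique rather than the exact algorithm from the paper.

```python
def node_size(node):
    """Size of a node = its own size, or the sum of its children's sizes."""
    children = node.get("children", [])
    return sum(node_size(c) for c in children) if children else node["size"]

def treemap(node, x, y, w, h, depth=0, out=None):
    """Slice-and-dice treemap: recursively split a rectangle among children,
    alternating the split direction by depth."""
    if out is None:
        out = []
    out.append((node["name"], x, y, w, h, depth))
    children = node.get("children", [])
    if children:
        total = sum(node_size(c) for c in children)
        offset = 0.0
        for c in children:
            frac = node_size(c) / total
            if depth % 2 == 0:   # split horizontally at even depths
                treemap(c, x + offset * w, y, w * frac, h, depth + 1, out)
            else:                # split vertically at odd depths
                treemap(c, x, y + offset * h, w, h * frac, depth + 1, out)
            offset += frac
    return out

# A made-up, file-system-like hierarchy.
root = {"name": "/", "children": [
    {"name": "docs", "children": [{"name": "a.txt", "size": 30},
                                  {"name": "b.txt", "size": 10}]},
    {"name": "big.iso", "size": 60},
]}

for name, x, y, w, h, depth in treemap(root, 0, 0, 100, 100):
    print(f"{'  ' * depth}{name}: rect at ({x:.0f},{y:.0f}) size {w:.0f}x{h:.0f}")
```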
20
Early Treemap Applied to File System
21
Treemap Problems
• Too disorderly
– What does adjacency mean?
– Uncontrolled aspect ratios lead to lots of skinny boxes that clutter the display
• Color not used appropriately
– In fact, is meaningless here
• Wrong application
– Don’t need all this to just see the largest files in the OS
22
Successful Application of Treemaps
• Think more about the use
– Break into meaningful groups
– Fix these into a useful aspect ratio
• Use visual properties properly
– Use color to distinguish meaningfully
• Use only two colors:
– Can then distinguish one thing from another
• When exact numbers aren’t very important
• Provide excellent interactivity
– Access to the real data
– Makes it into a useful tool
23
TreeMaps in Action
http://www.smartmoney.com/maps
http://www.peets.com/tast/11/coffee_selector.asp
24
A Good Use of TreeMaps and Interactivity
www.smartmoney.com/marketmap
25
Treemaps in Peets site
26
Analysis vs. Communication
• MarketMap’s use of TreeMaps allows for
sophisticated analysis
• Peets’ use of TreeMaps is more for
presentation and communication
• This is a key contrast
27
Open Issues
• Does visualization help?
– The jury is still out
– Still supplemental at best for text collections
• A correlation with spatial ability
• Learning effects: with practice, performance with visual displays begins to equal that with text
• Does visualization sell?
– Jury is still out on this one too!
• This is a hot area! More ideas will appear!
28
Key Questions to Ask about a Viz
1. What does it teach/show/elucidate?
2. What is the key contribution?
3. What are some compelling, useful examples?
4. Could it have been done more simply?
5. Have there been usability studies done?
What do they show?
29
What we are not covering
• Scientific visualization
• Statistics
• Cartography (maps)
• Education
• Games
• Computer graphics in general
• Computational geometry
30
Agenda
• Introduction
• Visual Principles
• What Works?
• Visualization in Analysis & Problem Solving
• Visualizing Documents & Search
• Comparing Visualization Techniques
• Design Exercise
• Wrap-Up
31
Visual Principles
32
Visual Principles
– Types of Graphs
– Pre-attentive Properties
– Relative Expressiveness of Visual Cues
– Visual Illusions
– Tufte’s notions
• Graphical Excellence
• Data-Ink Ratio Maximization
• How to Lie with Visualization
33
References for Visual Principles
• Kosslyn: Types of Visual Representations
• Lohse et al: How do people perceive common
graphic displays
• Bertin, MacKinlay: Perceptual properties and
visual features
• Tufte/Wainer: How to mislead with graphs
34
A Graph is: (Kosslyn)
• A visual display that illustrates one or more
relationships among entities
• A shorthand way to present information
• Allows a trend, pattern, or comparison to be
easily apprehended
35
Types of Symbolic Displays
(Kosslyn 89)
• Graphs
• Charts
• Maps
• Diagrams
Types of Symbolic Displays
• Graphs
– at least two scales required
– values associated by a symmetric “paired with”
relation
• Examples: scatter-plot, bar-chart, layer-graph
Types of Symbolic Displays
Charts
– discrete relations among discrete entities
– structure relates entities to one another
– lines and relative position serve as links
Examples:
family tree
flow chart
network diagram
Types of Symbolic Displays
• Maps
– internal relations determined (in part) by the spatial
relations of what is pictured
– labels paired with locations
Examples:
map of census data
topographic maps
From www.thehighsierra.com
Types of Symbolic Displays
Diagrams
– schematic pictures of objects or entities
– parts are symbolic (unlike photographs)
• how-to illustrations
• figures in a manual
From Gleitman, Henry. Psychology.
W.W. Norton and Company, Inc.
New York, 1995
Anatomy of a Graph (Kosslyn 89)
• Framework
– sets the stage
– kinds of measurements, scale, ...
• Content
– marks
– point symbols, lines, areas, bars, …
• Labels
– title, axes, tic marks, ...
Basic Types of Data
• Nominal (qualitative)
– (no inherent order)
– city names, types of diseases, ...
• Ordinal (qualitative)
– (ordered, but not at measurable intervals)
– first, second, third, …
– cold, warm, hot
• Interval (quantitative)
– list of integers or reals
Common Graph Types
(Example charts omitted: scatterplots pairing length of page, length of access, URL, and # of accesses; plus a bar chart of # of accesses per day for urls 1–7, with accesses binned into short / medium / long / very long.)
Combining Data Types in Graphs
– Nominal × Nominal
– Nominal × Ordinal
– Nominal × Interval
– Ordinal × Ordinal
– Ordinal × Interval
– Interval × Interval
Examples?
Scatter Plots
• Qualitatively determine if variables
– are highly correlated
• linear mapping between horizontal & vertical axes
– have low correlation
• spherical, rectangular, or irregular distributions
– have a nonlinear relationship
• a curvature in the pattern of plotted points
• Place points of interest in context
– color representing special entities
When to use which type?
• Line graph
– x-axis requires quantitative variable
– Variables have contiguous values
– familiar/conventional ordering among ordinals
• Bar graph
– comparison of relative point values
• Scatter plot
– convey overall impression of relationship between two
variables
• Pie Chart?
– Emphasizing differences in proportion among a few
numbers
Classifying Visual
Representations
Lohse, G L; Biolsi, K; Walker, N and H H Rueter,
A Classification of Visual Representations
CACM, Vol. 37, No. 12, pp 36-49, 1994
Participants sorted 60 items into categories
Other participants assigned labels from Likert scales
Experimenters clustered the results various ways.
Subset of Example Visual Representations
From Lohse et al. 94
Subset of Example Visual Representations
From Lohse et al. 94
Likert Scales
(and percentage of variance explained)
16.0 emphasizes whole – parts
11.3 spatial – nonspatial
10.6 static structure – dynamic structure
10.5 continuous – discrete
10.3 attractive – unattractive
10.1 nontemporal – temporal
9.9 concrete – abstract
9.6 hard to understand – easy
9.5 nonnumeric – numeric
2.2 conveys a lot of info – conveys little
Experimentally Motivated
Classification (Lohse et al. 94)
• Graphs
• Tables (numerical)
• Tables (graphical)
• Charts (time)
• Charts (network)
• Diagrams (structure)
• Diagrams (network)
• Maps
• Cartograms
• Icons
• Pictures
Interesting Findings
Lohse et al. 94
• Photorealistic images were least informative
– Echoes results in icon studies – better to use less complex,
more schematic images
• Graphs and tables are the most self-similar categories
– Results in the literature comparing these are inconclusive
• Cartograms were hard to understand
– Echoes other results – better to put points into a framed
rectangle to aid spatial perception
• Temporal data more difficult to show than cyclic data
– Recommend using animation for temporal data
Visual Properties
• Preattentive Processing
• Accuracy of Interpretation of Visual Properties
• Illusions and the Relation to Graphical
Integrity
All Preattentive Processing figures from Healey 97
http://www.csc.ncsu.edu/faculty/healey/PP/PP.html
Preattentive Processing
• A limited set of visual properties are processed
preattentively
– (without need for focusing attention).
• This is important for design of visualizations
– what can be perceived immediately
– what properties are good discriminators
– what can mislead viewers
Example: Color Selection
Viewer can rapidly and accurately determine
whether the target (red circle) is present or absent.
Difference detected in color.
Example: Shape Selection
Viewer can rapidly and accurately determine
whether the target (red circle) is present or absent.
Difference detected in form (curvature)
Pre-attentive Processing
• < 200 - 250ms qualifies as pre-attentive
– eye movements take at least 200ms
– yet certain processing can be done very quickly,
implying low-level processing in parallel
• If a decision takes a fixed amount of time
regardless of the number of distractors, it is
considered to be preattentive.
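One way to operationalize that definition: fit response time against the number of distractors and check whether the slope is near zero. A small sketch with fabricated response times; the 5 ms-per-item cutoff is an arbitrary illustration, not a standard from the literature.

```python
import numpy as np

# Hypothetical mean response times (ms) for displays with more and more distractors.
set_sizes = np.array([1, 5, 10, 20, 40])
rt_color_target = np.array([420, 422, 419, 425, 423])   # flat: target "pops out"
rt_conjunction  = np.array([450, 520, 610, 800, 1180])  # grows: serial search

for label, rt in [("color target", rt_color_target),
                  ("conjunction target", rt_conjunction)]:
    slope, intercept = np.polyfit(set_sizes, rt, 1)      # ms per added distractor
    verdict = "preattentive" if abs(slope) < 5 else "serial search"
    print(f"{label}: {slope:.1f} ms/distractor -> {verdict}")
```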
Example: Conjunction of
Features
Viewer cannot rapidly and accurately determine
whether the target (red circle) is present or absent when
target has two or more features, each of which are
present in the distractors. Viewer must search sequentially.
All Preattentive Processing figures from Healey 97
http://www.csc.ncsu.edu/faculty/healey/PP/PP.html
Example: Emergent Features
Target has a unique feature with respect to
distractors (open sides) and so the group
can be detected preattentively.
Example: Emergent Features
Target does not have a unique feature with respect to
distractors and so the group cannot be detected
preattentively.
Asymmetric and Graded
Preattentive Properties
• Some properties are asymmetric
– a sloped line among vertical lines is preattentive
– a vertical line among sloped ones is not
• Some properties have a gradation
– some more easily discriminated among than others
Use Grouping of Well-Chosen
Shapes for Displaying Multivariate
Data
SUBJECT PUNCHED QUICKLY OXIDIZED TCEJBUS DEHCNUP YLKCIUQ DEZIDIXO
CERTAIN QUICKLY PUNCHED METHODS NIATREC YLKCIUQ DEHCNUP SDOHTEM
SCIENCE ENGLISH RECORDS COLUMNS ECNEICS HSILGNE SDROCER SNMULOC
GOVERNS PRECISE EXAMPLE MERCURY SNREVOG ESICERP ELPMAXE YRUCREM
CERTAIN QUICKLY PUNCHED METHODS NIATREC YLKCIUQ DEHCNUP SDOHTEM
GOVERNS PRECISE EXAMPLE MERCURY SNREVOG ESICERP ELPMAXE YRUCREM
SCIENCE ENGLISH RECORDS COLUMNS ECNEICS HSILGNE SDROCER SNMULOC
SUBJECT PUNCHED QUICKLY OXIDIZED TCEJBUS DEHCNUP YLKCIUQ DEZIDIXO
CERTAIN QUICKLY PUNCHED METHODS NIATREC YLKCIUQ DEHCNUP SDOHTEM
SCIENCE ENGLISH RECORDS COLUMNS ECNEICS HSILGNE SDROCER SNMULOC
SUBJECT PUNCHED QUICKLY OXIDIZED TCEJBUS DEHCNUP YLKCIUQ DEZIDIXO
CERTAIN QUICKLY PUNCHED METHODS NIATREC YLKCIUQ DEHCNUP SDOHTEM
SCIENCE ENGLISH RECORDS COLUMNS ECNEICS HSILGNE SDROCER SNMULOC
GOVERNS PRECISE EXAMPLE MERCURY SNREVOG ESICERP ELPMAXE YRUCREM
CERTAIN QUICKLY PUNCHED METHODS NIATREC YLKCIUQ DEHCNUP SDOHTEM
GOVERNS PRECISE EXAMPLE MERCURY SNREVOG ESICERP ELPMAXE YRUCREM
SCIENCE ENGLISH RECORDS COLUMNS ECNEICS HSILGNE SDROCER SNMULOC
SUBJECT PUNCHED QUICKLY OXIDIZED TCEJBUS DEHCNUP YLKCIUQ DEZIDIXO
CERTAIN QUICKLY PUNCHED METHODS NIATREC YLKCIUQ DEHCNUP SDOHTEM
SCIENCE ENGLISH RECORDS COLUMNS ECNEICS HSILGNE SDROCER SNMULOC
Text NOT Preattentive
Preattentive Visual Properties
(Healey 97)
length Treisman & Gormican [1988]
width Julesz [1985]
size Treisman & Gelade [1980]
curvature Treisman & Gormican [1988]
number Julesz [1985]; Trick & Pylyshyn [1994]
terminators Julesz & Bergen [1983]
intersection Julesz & Bergen [1983]
closure Enns [1986]; Treisman & Souther [1985]
colour (hue) Nagy & Sanchez [1990, 1992]; D'Zmura [1991]
Kawai et al. [1995]; Bauer et al. [1996]
intensity Beck et al. [1983]; Treisman & Gormican [1988]
flicker Julesz [1971]
direction of motion Nakayama & Silverman [1986]; Driver & McLeod [1992]
binocular lustre Wolfe & Franzel [1988]
stereoscopic depth Nakayama & Silverman [1986]
3-D depth cues Enns [1990]
lighting direction Enns [1990]
Gestalt Properties
• Gestalt: form or configuration
• Idea: forms or patterns transcend the stimuli
used to create them.
– Why do patterns emerge?
– Under what circumstances?
Why perceive pairs vs. triplets?
Gestalt Laws of Perceptual
Organization (Kaufman 74)
• Figure and Ground
– Escher illustrations are good examples
– Vase/Face contrast
• Subjective Contour
More Gestalt Laws
• Law of Proximity
– Stimulus elements that are close together will be
perceived as a group
• Law of Similarity
– like the preattentive processing examples
• Law of Common Fate
– like preattentive motion property
• move a subset of objects among similar ones and they
will be perceived as a group
Which Properties are
Appropriate for Which
Information Types?
Accuracy Ranking of Quantitative Perceptual Tasks
Estimated; only pairwise comparisons have been validated
(Mackinlay 88 from Cleveland & McGill)
Interpretations of Visual Properties
Some properties can be discriminated more accurately
but don’t have intrinsic meaning
(Senay & Ignatius 97, Kosslyn, others)
– Density (Greyscale)
Darker -> More
– Size / Length / Area
Larger -> More
– Position
Leftmost -> first, Topmost -> first
– Hue
??? no intrinsic meaning
– Slope
??? no intrinsic meaning
Ranking of Applicability of Properties for Different Data Types
(Mackinlay 88, Not Empirically Verified)

Rank  Quantitative      Ordinal           Nominal
1     Position          Position          Position
2     Length            Density           Color Hue
3     Angle             Color Saturation  Texture
4     Slope             Color Hue         Connection
5     Area              Texture           Containment
6     Volume            Connection        Density
7     Density           Containment       Color Saturation
8     Color Saturation  Length            Shape
9     Color Hue         Angle             Length
Color Purposes
• Call attention to specific items
• Distinguish between classes of items
– Increases the number of dimensions for encoding
• Increase the appeal of the visualization
Using Color
• Proceed with caution
– Less is more
– Representing magnitude is tricky
• Examples
– Red-orange-yellow-white
• Works for costs
• Maybe because people are very experienced at
reasoning shrewdly according to cost
– Green-light green-light brown-dark brown-grey-white
works for atlases
– Grayscale is unambiguous but has limited range
Visual Illusions
• People don’t perceive length, area, angle, or brightness the way they “should”.
• Some illusions have been reclassified as
systematic perceptual errors
– e.g., brightness contrasts (grey square on white
background vs. on black background)
– partly due to increase in our understanding of the
relevant parts of the visual system
• Nevertheless, the visual system does some
really unexpected things.
Illusions of Linear Extent
• Müller-Lyer (off by 25-30%)
• Horizontal-Vertical
Illusions of Area
• Delboeuf Illusion
• Height of 4-story building overestimated by
approximately 25%
What are good guidelines for Infoviz?
• Use graphics appropriately
– Don’t use images gratuitously
– Don’t lie with graphics!
• Link to original data
– Don’t conflate area with other information
• E.g., using area in a map to imply amount
• Make it interactive (feedback)
– Brushing and linking
– Multiple views
– Overview + details
• Match mental models
80
Tufte
• Principles of Graphical Excellence
– Graphical excellence is
• the well-designed presentation of interesting data – a
matter of substance, of statistics, and of design
• consists of complex ideas communicated with clarity,
precision and efficiency
• is that which gives to the viewer the greatest number of
ideas in the shortest time with the least ink in the
smallest space
• requires telling the truth about the data.
81
Tufte’s Notion of Data Ink
Maximization
• What is the main idea?
– draw viewers’ attention to the substance of the
graphic
– the role of redundancy
– principles of editing and redesign
• What’s wrong with this? What is he really
getting at?
82
Tufte Principle
Maximize the data-ink ratio:
Data-ink ratio = data ink / total ink used in the graphic
Avoid “chart junk”
83
Tufte Principles
• Use multifunctioning graphical elements
• Use small multiples
• Show mechanism, process, dynamics, and
causality
• High data density
– Number of items/area of graphic
– This is controversial
• White space thought to contribute to good visual
design
• Tufte’s book itself has lots of white space
84
Tufte’s Graphical Integrity
• Some lapses intentional, some not
• Lie Factor = (size of effect shown in graph) / (size of effect in data)
• Misleading uses of area
• Misleading uses of perspective
• Leaving out important context
• Lack of taste and aesthetics
85
From Tim Craven’s LIS 504 course
http://instruct.uwo.ca/fim-lis/504/504gra.htm#data-ink_ratio
86
How to Exaggerate with Graphs
from Tufte ’83
“Lie factor” = 2.8
87
How to Exaggerate with Graphs
from Tufte ’83
Error:
Shrinking
along both
dimensions
88
Howard Wainer
How to Display Data Badly
(Video)
http://www.dartmouth.edu/~chance/ChanceLecture/AudioVideo.html
89
Agenda
• Introduction
• Visual Principles
• What Works?
• Visualization in Analysis & Problem Solving
• Visualizing Documents & Search
• Comparing Visualization Techniques
• Design Exercise
• Wrap-Up
90
Promising Techniques
91
Promising Techniques & Approaches
• Perceptual Techniques
– Animation
– Grouping / Gestalt principles
– Using size to indicate quantity
– Color for Accent, Distinction, Selection
• NOT FOR QUANTITY!!!!
• General Approaches
– Standard Techniques
• Graphs, bar charts, tables
– Brushing and Linking
– Providing Multiple Views and Models
– Aesthetics!
92
Standard Techniques
• It’s often hard to beat:
– Line graphs, bar charts
– Scatterplots (or Scatterplot Matrix)
– Tables
• A Darwinian view of visualizations:
– Only the fittest survive
– We are in a period of great experimentation; eventually it
will be clear what works and what dies out.
• A bright spot:
– Enhancing the old techniques with interactivity
– Example: Spotfire
• Adds interactivity, color highlighting, zooming to scatterplots
– Example: TableLens / Eureka
• Adds interactivity and length cues to tables
93
Spotfire: Integrating Interaction
with Scatterplots
94
Spotfire/IVEE: Integrating
Interaction with Scatterplots
Brushing and Linking
• Interactive technique
– Highlighting
– Brushing and Linking
• At least two things must be linked together to
allow for brushing
– select a subset of points
– see the role played by this subset of points in one or
more other views
• Example systems
– Graham Will’s EDV system
– Ahlberg & Shneiderman’s IVEE (Spotfire)
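A minimal sketch of brushing and linking using matplotlib (not the EDV or Spotfire implementation): dragging a rectangle in the left scatterplot highlights the same records in the right one. The data and variable names are invented.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.widgets import RectangleSelector

rng = np.random.default_rng(0)
x, y, z = rng.normal(size=(3, 200))           # three variables of the same 200 records

fig, (ax1, ax2) = plt.subplots(1, 2)
left = ax1.scatter(x, y, color="lightgray")   # brush in this view ...
right = ax2.scatter(x, z, color="lightgray")  # ... and see the selection echoed here

def on_brush(eclick, erelease):
    x0, x1 = sorted([eclick.xdata, erelease.xdata])
    y0, y1 = sorted([eclick.ydata, erelease.ydata])
    selected = (x >= x0) & (x <= x1) & (y >= y0) & (y <= y1)
    colors = ["crimson" if s else "lightgray" for s in selected]
    left.set_color(colors)    # highlight the brushed points
    right.set_color(colors)   # linked view shows the same subset
    fig.canvas.draw_idle()

brush = RectangleSelector(ax1, on_brush, useblit=True)
plt.show()
```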
Linking types of assist behavior
to position played (from Eick & Wills 95)
Baseball data: scatterplots, histograms, and bars (from Eick & Wills 95).
Linked views: select high salaries; avg career HRs vs. avg career hits (batting ability); avg assists vs. avg putouts (fielding ability); how long in majors; distribution of positions played.
What was learned from interaction
with this baseball data?
– Seems impossible to earn a high salary in the first
three years
– High salaried players have a bimodal distribution
(peaking around 7 & 13 yrs)
– Hits/Year a better indicator of salary than HR/Year
– High paid outlier with low HR and medium hits/year.
Reason: person is player-coach
– There seem to be two differentiated groups in the
put-outs/assists category (but not correlated with
salary) Why?
99
Slide by Saifon Obromsook & Linda Harjono
Animation
• “The quality or condition of being alive, active, spirited,
or vigorous” (dictionary.com)
• “A dynamic visual statement that evolves through
movement or change in the display”
• “… creating the illusion of change by rapidly displaying a
series of single frames” (Roncarelli 1988).
100
Slide by Saifon Obromsook & Linda Harjono
We Use Animation to…
• Tell stories / scenarios: cartoons
• Illustrate dynamic process / simulation
• Create a character / an agent
• Navigate through virtual spaces
• Draw attention
• Delight
101
Slide by Saifon Obromsook & Linda Harjono
Cartoon Animation Principles
• Chang & Ungar ‘93
• Solidity (squash and stretch)
– Solid drawing
– Motion blur
– Dissolves
• Exaggeration
– Anticipation
– Follow through
• Reinforcement
– Slow in and slow out
– Arcs
– Follow through
102
Slide by Saifon Obromsook & Linda Harjono
Why Cartoon-Style Animation?
• Cartoons’ theatricality is powerful in
communicating to the user.
• Cartoons can make UI engage the user into its
world.
• The medium of cartoon animation is like that
of graphic computers.
103
Application using Animation:
Gnutellavision
• Visualization of Peer-to-Peer Network
– Hosts (with color for status and size for number of files)
– Nodes with closer network distance from focus on inner
rings
– Queries shown; can trace queries
• Gnutellavision as exploratory tool
– Very few hosts share many files
– Uneven propagation of queries
– Qualitative assessment of queries (simple)
104
Layout - Illustration
105
Animation in Gnutellavision
Goal of animation is to help maintain context of nodes
and general orientation of user during refocus
• Transition Paths
– Linear interpolation of polar coordinates
– Node moves in arc not straight line
– Moves along circle if not changing levels (like great
circles on earth)
– Spirals in or out to next ring
106
Animation (continued)
• Transition constraints
– Orientation of transition to minimize rotational
travel
– (Move former parent away from new focus in same
orientation)
– Avoid cross-over of edges
– (to allow users to keep track of which is which)
• Animation timing
– Slow in Slow out timing (allows users to better
track movement)
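A sketch of these two ideas (interpolating in polar coordinates so a node sweeps along an arc or spiral, with slow-in/slow-out easing). The easing function and frame count are illustrative, not Gnutellavision's actual code.

```python
import math

def slow_in_slow_out(t):
    """Ease a linear parameter t in [0, 1]: slow at the ends, fast in the middle."""
    return (1 - math.cos(math.pi * t)) / 2

def interpolate_polar(start, end, t):
    """Interpolate (radius, angle) pairs; the node sweeps along an arc,
    spiraling in or out when the radii differ. Returns screen coordinates."""
    (r0, a0), (r1, a1) = start, end
    te = slow_in_slow_out(t)
    r = r0 + (r1 - r0) * te
    a = a0 + (a1 - a0) * te          # (a real system would also minimize rotational travel)
    return r * math.cos(a), r * math.sin(a)

# Animate a node from the outer ring (r=3) to the inner ring (r=1),
# rotating a quarter turn, over 10 frames.
frames = 10
for i in range(frames + 1):
    px, py = interpolate_polar((3.0, 0.0), (1.0, math.pi / 2), i / frames)
    print(f"frame {i:2d}: ({px:+.2f}, {py:+.2f})")
```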
107
Transition Constraint - Orientation
108
Transition Constraint - Order
109
Usability Testing
• In general, users appreciated the subtleties added to the general
method when the number of nodes increased.
• Perhaps the most interesting result is that most people preferred
rectangular movement for the small graph and polar coordinate
movement for the large one.
Overall Preference of Users
              No Features   All Features
Small Graph        5              5
Large Graph        1              9
110
Hyperbolic Tree
• John Lamping, Ramana Rao, and Peter Pirolli, A Focus+Context Technique Based on Hyperbolic Geometry for Visualizing Large Hierarchies, Proc. ACM Conf. Human Factors in Computing Systems (CHI), 1995
• Also uses animation
• Tree-based layout; leaves stretch to infinity
• Only a few labels can be seen at a time
111
112
113
114
115
Issues
• Displaying text
– The size of the text
• Works well for small things like directories
• Not so good for URLs
• Only a portion of the data can be seen in the
focus at one time
• Only works for certain types of data -
Hierarchical
• Not clear if it is actually useful for anything.
116
Slide by Saifon Obromsook & Linda Harjono
Animating Algorithms
• Kehoe, Stasko, and Taylor, “Rethinking Evaluation of
Algorithm Animations as Learning Aids”
• Why previous studies showed no benefits:
– No or limited benefits from particular animations
– Benefits are not captured in measurements
– Design of experiments hides the benefits
• Methods for this study:
– Combination of qualitative & quantitative
– More flexible setting
– Metrics: score for each type of questions, time used,
usage of materials, qualitative data from observations &
interviews
117
118
Slide by Saifon Obromsook & Linda Harjono
Findings
• Value of animation is more apparent in
interactive situations
• Most useful to learn procedural operations
• Makes the subject more accessible & less intimidating, which increases motivation
119
What Isn’t Working?
The existing studies indicate that we don’t yet
know how to make the following work well for
every-day tasks:
– Pan-and-Zoom
– 3D Navigation
– Node-and-link representations of concept spaces
120
Zoom, Overview + Detail
• An exception, possibly:
– Benjamin B. Bederson: PhotoMesa: a zoomable image browser using
quantum treemaps and bubblemaps. UIST 2001: 71-80
121
Overview + Detail
• K. Hornbaek et al., Navigation patterns and Usability of
Zoomable User Interfaces with and without an Overview, ACM
TOCHI, 9(4), December 2002.
122
Overview + Detail
• K. Hornbaek et al., Navigation patterns and Usability of Zoomable
User Interfaces with and without an Overview, ACM TOCHI, 9(4),
December 2002.
• A study on integrating Overview + Detail on a Map
search task
– Incorporating panning & zooming as well.
– They note that panning & zooming does not do well in
most studies.
• Results seem to be
– Subjectively, users prefer to have a linked overview
– But they aren’t necessarily faster or more effective using it
– Well-constructed representation of the underlying data
may be more important.
• More research needed as each study seems to turn up
different results, sensitive to underlying test set.
123
Agenda
• Introduction
• Visual Principles
• What Works?
• Visualization in Analysis & Problem Solving
• Visualizing Documents & Search
• Comparing Visualization Techniques
• Design Exercise
• Wrap-Up
124
Problem Solving
125
Problem Solving
• A Detective Tool for Multidimensional Data
– Inselberg on using Parallel Coordinates
• Analyzing Web Clickstream Data
– Brainerd & Becker, Waterson et al.
• Information Visualization for Pattern Detection
– Carlis & Konstan on Periodic Data
• Visualization vs. Analysis
– Comments by Wesley Johnson of Chevron
126
Multidimensional Detective
A. Inselberg, Multidimensional Detective, Proceedings of IEEE
Symposium on Information Visualization (InfoVis '97), 1997.
127
A Detective Story
A. Inselberg, Multidimensional Detective, Proceedings of IEEE Symposium on
Information Visualization (InfoVis '97), 1997
Inselberg’s Principles for analysis using visualizations:
1. Do not let the picture scare you
2. Understand your objectives
– Use them to obtain visual cues
3. Carefully scrutinize the picture
4. Test your assumptions, especially the “I am really sure of’s”
5. You can’t be unlucky all the time!
128
A Detective Story
A. Inselberg, Multidimensional Detective, Proceedings of IEEE Symposium on Information
Visualization (InfoVis '97), 1997
• The Dataset:
– Production data for 473 batches of a VLSI chip
– 16 process parameters
– The yield: % of produced chips that are useful
• X1
– The quality of the produced chips (speed)
• X2
– 10 types of defects (zero defects shown at top)
• X3 … X12
– 4 physical parameters
• X13 … X16
• The Objective:
– Raise the yield (X1) and maintain high quality (X2)
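As a rough sketch of this kind of parallel-coordinates display (using pandas' built-in renderer rather than Inselberg's own software): one axis per parameter, one line per batch, with a subset of batches highlighted. The file name, the exact column names, and the "high yield" threshold are hypothetical.

```python
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import parallel_coordinates

# Hypothetical file; columns assumed to be X1 (yield), X2 (quality), X3..X16.
df = pd.read_csv("vlsi_batches.csv")

# Label each batch so an interesting subset stands out (threshold is made up).
df["group"] = (df["X1"] > df["X1"].quantile(0.9)).map(
    {True: "high yield", False: "other"})

# One polyline per batch across all 16 axes; colors follow the order in
# which the two groups first appear in the data.
parallel_coordinates(df, "group",
                     cols=[f"X{i}" for i in range(1, 17)],
                     color=["crimson", "lightgray"])
plt.title("One line per batch across 16 process parameters (sketch)")
plt.show()
```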
129
Multidimensional Detective
A. Inselberg, Multidimensional Detective, Proceedings of IEEE Symposium on
Information Visualization (InfoVis '97), 1997.
Do Not Let the Picture Scare You!!
130
Multidimensional Detective
• Each line represents the values for one batch of chips
• This figure shows what happens when only those
batches with both high X1 and high X2 are chosen
• Notice the separation in values at X15
• Also, some batches with few X3 defects are not in this
high-yield/high-quality group.
131
Multidimensional Detective
• Now look for batches which have nearly zero defects.
– For 9 out of 10 defect categories
• Most of these have low yields
• Surprising because we know from first diagram that some
defects are ok.
• Go back to first diagram, looking at defect categories
• Notice that X6 behaves differently than the rest
• Allow two defects, where one defect is in X6
• This results in the very best batch appearing
132
Multidimensional Detective
• Figs. 5 and 6 show that the high-yield batches are not the ones with zero values for defects of type X3 and X6
– Don’t believe your assumptions …
• Looking now at X15 we see the separation is important
– Lower values of this property end up in the better yield batches
133
Automated Analysis
A. Inselberg, Automated Knowledge Discovery using Parallel
Coordinates, INFOVIS ‘99
134
Slide by Wayne Kao
Case Study: E-Commerce
Clickstream Visualization
• Brainerd & Becker, IEEE
Infovis 2001
• Aggregate nodes
using an icon (e.g. all
the checkout pages)
• Edges represent
transitions
– Wider means more
transitions
135
Slide by Wayne Kao
Customer Segments
• Collect
– Clickstream
– Purchase history
– Demographic data
• Associates customer data with their
clickstream
• Different color for each customer segment
136
Slide by Wayne Kao
Layout
• Aggregation based on file system path
137
Slide by Wayne Kao
Initial Findings
• Gender shopping
differences
138
Slide by Wayne Kao
Initial Findings (cont)
• Checkout process analysis
• Newsletter hurting sales
139
Slide by Wayne Kao
WebQuilt
Interactive, zoomable directed graph
• Nodes = web pages
• Edges = aggregate traffic between
pages
Waterson et al., “What Did They Do? Understanding Clickstreams with the WebQuilt Visualization System,” in AVI 2002.
140
Slide by Wayne Kao
Directed graph
• Nodes: visited pages
– Color marks entry and exit nodes
• Arrows: traversed links
– Thicker: more heavily traversed
– Color
• Red/yellow: Time spent before clicking
• Blue: optimal path chosen by
designer
141
Slide by Wayne Kao
142
Slide by Wayne Kao
Pilot Usability Study
• Edmunds.com PDA web site
• Handspring Visor equipped with an OmniSky wireless modem
• 10 users asked to find…
– Anti-lock brake information on the latest Nissan
Sentra model
– The Nissan dealer closest to them.
143
Slide by Wayne Kao
In the Lab vs. Out in the Wild
Comparing in-lab usability testing with WebQuilt remote
usability testing
• 5 users were tested in the lab
• 5 were given the device and asked to perform the task
at their convenience
• All task directions, demographic data, and follow-up questionnaire data were presented and collected in web forms as part of the WebQuilt testing framework.
144
Slide by Wayne Kao
145
Slide by Wayne Kao
146
Slide by Wayne Kao
147
Slide by Wayne Kao
Findings (number of participants affected in parentheses)
Browser:
– Interact before load (3)
– No forward button (2)
Device:
– Difficulty with input in questionnaire (3)
– Difficulty scrolling (2)
– Device errors unrelated to testing (1)
– Tried writing on screen (0)
Site Design:
– Falsely completed task (4)
– Long download times (4)
– Ping-pong behavior (3)
– Interact before load (3)
– Too much scrolling (2)
– Save address functionality not clear (1)
– Back button navigation (0)
– Would like more features (0)
– Finds site useful (0)
Test Design:
– Falsely completed task (4)
– Difficulty remembering task description (3)
– Difficulty with input in questionnaire (3)
– Questionnaire wording problems (3)
– Forgot how to end task (1)
– Confusing task description (1)
148
Slide by Wayne Kao
Findings
• WebQuilt methodology is promising for uncovering site
design related issues.
• 1/3 of the issues were device or browser related.
• Browser and device issues cannot be captured automatically with WebQuilt unless they cause an interaction with the server
• They can be revealed via the questionnaire data.
149
Visualization for Analysis
• Carlis & Konstan, UIST 1998
• Problem: data that is both periodic and serial
– Time students spend on different activities
– Tree growth patterns
• Time: which year
• Period: yearly
– Multi-day races such as the Tour de France
– Calendars arbitrarily wrap around at end of month
– Octaves in music
• How to find patterns along both dimensions?
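One of the core ideas can be sketched as a spiral plot: the period (here, month of year) maps to angle and elapsed time moves outward, so patterns can be read both around the circle (within a period) and along a radius (across periods). The data below is fabricated.

```python
import math
import random
import matplotlib.pyplot as plt

# Fabricated monthly measurements over 5 years, with a seasonal bump plus noise.
random.seed(0)
values = [10 + 6 * math.sin(2 * math.pi * m / 12) + random.random()
          for m in range(60)]

theta, r = [], []
for m, v in enumerate(values):
    theta.append(2 * math.pi * (m % 12) / 12)   # month of year -> angle
    r.append(1 + m / 12)                        # elapsed time -> radius (one ring per year)

ax = plt.subplot(projection="polar")
sc = ax.scatter(theta, r, c=values, s=40)       # color encodes the measured value
ax.set_xticks([2 * math.pi * i / 12 for i in range(12)])
ax.set_xticklabels(["Jan", "Feb", "Mar", "Apr", "May", "Jun",
                    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"])
plt.colorbar(sc, label="value")
plt.show()
```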
150
Analyzing Complex Periodic Data
Carlis & Konstan, UIST 1998.
151
Analyzing Complex Periodic Data
Carlis & Konstan, UIST 1998.
• Consumption values for each month appear as spikes
• Each food has its own color
• Boundary line (in black) shows when the season begins/ends
152
Carlis & Konstan, UIST 1998.
153
Visualization vs. Analysis?
• Applications to data mining and data discovery.
• Wesley Johnson ’02:
– Visualization tools are helpful for exploring hunches and
presenting results
• Examples: scatterplots
– They are the WRONG primary tool when the goal is to find
a good classifier model in a complex situation.
– Need:
• Solid insight into the domain and problem
• Tools that visualize several alternative models.
• Emphasize “model visualization” rather than “data
visualization”
154
Agenda
• Introduction
• Visual Principles
• What Works?
• Visualization in Analysis & Problem Solving
• Visualizing Documents & Search
• Comparing Visualization Techniques
• Design Exercise
• Wrap-Up
155
Visualizing Documents and
Search
156
Documents and Search
• Why Visualize Text?
• Why Text is Tough
• Visualizing Concept Spaces
– Clusters
– Category Hierarchies
• Visualizing Retrieval Results
• Usability Study Meta-Analysis
157
Why Visualize Text?
• To help with Information Retrieval
– give an overview of a collection
– show user what aspects of their interests are
present in a collection
– help user understand why documents retrieved as a
result of a query
• Text Data Mining
– Mainly clustering & nodes-and-links
• Software Engineering
– not really text, but has some similar properties
158
Why Text is Tough
• Text is not pre-attentive
• Text consists of abstract concepts
– which are difficult to visualize
• Text represents similar concepts in many
different ways
– space ship, flying saucer, UFO, figment of imagination
• Text has very high dimensionality
– Tens or hundreds of thousands of features
– Many subsets can be combined together
159
Why Text is Tough
As the man walks the cavorting dog, thoughts arrive unbidden of the previous spring, so unlike this one, in which walking was marching and dogs were baleful sentinels outside unjust halls.
How do we visualize this?
160
Why Text is Tough
• Abstract concepts are difficult to visualize
• Combinations of abstract concepts are even
more difficult to visualize
– time
– shades of meaning
– social and psychological concepts
– causal relationships
161
Why Text is Tough
• Language only hints at meaning
• Most meaning of text lies within our minds and
common understanding
– “How much is that doggy in the window?”
• how much: social system of barter and trade (not the
size of the dog)
• “doggy” implies childlike, plaintive, probably cannot do
the purchasing on their own
• “in the window” implies behind a store window, not
really inside a window, requires notion of window
shopping
162
Why Text is Easy
• Text is highly redundant
– When you have lots of it
– Pretty much any simple technique can pull out phrases that
seem to characterize a document
• Instant summary:
– Extract the most frequent words from a text
– Remove the most common English words
• People are very good at attributing meaning to lists
of otherwise unrelated words
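A minimal sketch of that "instant summary" recipe using only the Python standard library; the stopword list is deliberately tiny and the input file name is hypothetical. The frequency list on the "Guess the Text" slide that follows looks like the output of exactly this kind of procedure.

```python
import re
from collections import Counter

# A deliberately small stopword list; a real one would be much longer.
STOPWORDS = {"the", "of", "and", "to", "a", "in", "that", "is", "for",
             "it", "on", "as", "with", "be", "by", "are", "this", "or"}

def instant_summary(text, k=10):
    """Return the k most frequent non-stopword words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(k)

sample = open("document.txt").read()      # hypothetical input file
for word, count in instant_summary(sample):
    print(f"{count:4d}  {word.upper()}")
```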
163
Guess the Text:
10 PEOPLE
10 ALL
9 STATES
9 LAWS
8 NEW
7 RIGHT
7 GEORGE
6 WILLIAM
6 THOMAS
6 JOHN
6 GOVERNMENT
5 TIME
5 POWERS
5 COLONIES
4 LARGE
4 INDEPENDENT
4 FREE
4 DECLARATION
4 ASSENT
3 WORLD
3 WAR
3 USURPATIONS
3 UNITED
3 SEAS
3 RIGHTS
166
Visualization of Text Collections
• How to summarize the contents of hundreds,
thousands, tens of thousands of texts?
• Many have proposed clustering the words and
showing points of light in a 2D or 3D space.
• Examples
– Showing docs/collections as a word space
– Showing retrieval results as points in word space
167
TextArc.org (Bradford Paley)
168
TextArc.org (Bradford Paley)
169
Galaxy of News
Rennison 95
170
Galaxy of News
Rennison 95
171
Example: Themescapes
(Wise et al. 95)
Themescapes (Wise et al. 95)
172
ScatterPlot of Clusters (Chen et al. 97)
173
Kohonen Feature Maps (Lin 92, Chen et al. 97) (594 docs)
175
Clustering for Collection
Overviews
• Two main steps
– cluster the documents according to the words they
have in common
– map the cluster representation onto a (interactive)
2D or 3D representation
• Since text has tens of thousands of features
– the mapping to 2D loses a tremendous amount of
information
– only very coarse themes are detected
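A rough sketch of this two-step recipe using scikit-learn: cluster documents on the words they share, then project the high-dimensional word space down to 2D for display (losing most of the information, as noted above). The toy corpus and parameter choices are arbitrary.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
import matplotlib.pyplot as plt

# Toy corpus; in practice this would be thousands of documents.
corpus = [
    "cholera epidemic water pump london map",
    "water pump deaths mapped by john snow",
    "treemap rectangles show file sizes in a hierarchy",
    "hierarchy visualized as nested rectangles",
    "stock market map colored by daily price change",
    "market sectors laid out as a treemap of stocks",
]

# Step 1: cluster documents by the words they have in common.
tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(corpus)                       # docs x words (high-dimensional)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Step 2: squash the word space down to 2D for display (much information is lost).
xy = TruncatedSVD(n_components=2, random_state=0).fit_transform(X)

plt.scatter(xy[:, 0], xy[:, 1], c=labels)
for i, (px, py) in enumerate(xy):
    plt.annotate(f"doc {i}", (px, py))
plt.title("Documents clustered by shared words, projected to 2D (sketch)")
plt.show()
```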
177
Scatter/Gather
Cutting, Pedersen, Tukey & Karger 92, 93, Hearst & Pedersen 95
181
How Useful is Collection Cluster
Visualization for Search?
Three studies find negative results
182
Study 1
Kleiboemer, Lazear, and Pedersen. Tailoring a retrieval system
for naive users. In Proc. of the 5th Annual Symposium on
Document Analysis and Information Retrieval, 1996
• This study compared
– a system with 2D graphical clusters
– a system with 3D graphical clusters
– a system that shows textual clusters
• Novice users
• Only textual clusters were helpful (and they
were difficult to use well)
183
Study 2: Kohonen Feature Maps
H. Chen, A. Houston, R. Sewell, and B. Schatz, JASIS 49(7)
• Comparison: Kohonen Map and Yahoo
• Task:
– “Window shop” for interesting home page
– Repeat with other interface
• Results:
– Starting with map could repeat in Yahoo (8/11)
– Starting with Yahoo unable to repeat in map (2/14)
184
Study 2 (cont.)
• Participants liked:
– Correspondence of region size to # documents
– Overview (but also wanted zoom)
– Ease of jumping from one topic to another
– Multiple routes to topics
– Use of category and subcategory labels
185
Study 2 (cont.)
• Participants wanted:
– hierarchical organization
– other ordering of concepts (alphabetical)
– integration of browsing and search
– correspondence of color to meaning
– more meaningful labels
– labels at same level of abstraction
– fit more labels in the given space
– combined keyword and category search
– multiple category assignment (sports+entertain)
186
Study 3: NIRVE
NIRVE Interface by Cugini et al. 96. Each rectangle is a cluster. Larger clusters closer to the
“pole”. Similar clusters near one another. Opening a cluster causes a projection that shows
the titles.
187
Study 3
Visualization of search results: a comparative evaluation of text, 2D,
and 3D interfaces Sebrechts, Cugini, Laskowski, Vasilakis and Miller,
Proceedings of SIGIR 99, Berkeley, CA, 1999.
• This study compared:
– 3D graphical clusters
– 2D graphical clusters
– textual clusters
• 15 participants, between-subject design
• Tasks
– Locate a particular document
– Locate and mark a particular document
– Locate a previously marked document
– Locate all clusters that discuss some topic
– List more frequently represented topics
188
Study 3
• Results (time to locate targets)
– Text clusters fastest
– 2D next
– 3D last
– With practice (6 sessions) 2D neared text results; 3D still
slower
– Computer experts were just as fast with 3D
• Certain tasks equally fast with 2D & text
– Find particular cluster
– Find an already-marked document
• But anything involving text (e.g., find title) much faster
with text.
– Spatial location rotated, so users lost context
• Helpful viz features
– Color coding (helped text too)
– Relative vertical locations
189
Summary: Visualizing Clusters
• Huge 2D maps may be inappropriate focus for
information retrieval
– cannot see what the documents are about
– space is difficult to browse for IR purposes
– (tough to visualize abstract concepts)
• Perhaps more suited for pattern discovery and
gist-like overviews
190
IR Infovis Meta-Analysis
(Empirical studies of information visualization:
a meta-analysis, Chen & Yu IJHCS 53(5),2000)
• Goal
– Find invariant underlying relations suggested
collectively by empirical findings from many different
studies
• Procedure
– Examine the literature of empirical infoviz studies
• 35 studies between 1991 and 2000
• 27 focused on information retrieval tasks
• But due to wide differences in the conduct of the
studies and the reporting of statistics, could use only 6
studies
191
IR Infovis Meta-Analysis
(Empirical studies of information visualization:
a meta-analysis, Chen & Yu IJHCS 53(5),2000)
• Conclusions:
– IR Infoviz studies not reported in a standard format
– Individual cognitive differences had the largest effect
• Especially on accuracy
• Somewhat on efficiency
– Holding cognitive abilities constant, users did better
with simpler visual-spatial interfaces
– The combined effect of visualization is not
statistically significant
192
So What Works?
• Yee, K-P et al., Faceted Metadata for Image Search and Browsing, to appear
in CHI 2003. Hearst, M., et al., Chapter 10 of Modern Information Retrieval, Baeza-Yates & Ribeiro-Neto (Eds).
• Color highlighting of query terms in results listings
• Sorting of search results according to important criteria
(date, author)
• Grouping of results according to well-organized category
labels.
– Cha-cha
– Flamenco
• Only if highly accurate:
– Spelling correction/suggestions
– Simple relevance feedback (more-like-this)
– Certain types of term expansion
• Note: most don’t benefit from visualization!
193
Cha-Cha
• Chen, M., Hearst, M., Hong, J.,
and Lin, J. Cha-Cha: A System
for Organizing Intranet Search
Results in the Proceedings of
the 2nd USENIX Symposium on
Internet Technologies and
Systems (USITS), Boulder,
CO, October 11-14, 1999
194
Teoma: appears to combine
categories and clusters
(this version is from before Teoma was bought by Ask Jeeves)
195
Teoma: Now in prime time
196
Cat-a-Cone
Marti Hearst and Chandu Karadi, Cat-a-Cone: An Interactive Interface for Specifying Searches and Viewing Retrieval Results using a Large Category Hierarchy, Proceedings of the 20th Annual International ACM/SIGIR Conference, Philadelphia, PA, July 1997
197
Better to reduce the viz
• Flamenco – allows users to steer through the
category space
• Uses
– Dynamically-generated hypertext
– Color for distinguishing and grouping
– Careful layout and font choices
• Focused first on the users’ needs
198
Flamenco
199
Flamenco
200
Slide by Woodruff & Rosenholtz
Using Thumbnails to Search the Web
A. Woodruff, R. Rosenholtz, J. Morrison, A. Faulring, & P. Pirolli, A comparison of the use of text summaries, plain thumbnails, and enhanced thumbnails for web search tasks. JASIST, 53(2), 172-185, 2002; A. Woodruff, A. Faulring, R. Rosenholtz, J. Morrison, & P. Pirolli, Using thumbnails to search the web. SIGCHI 2001
Design Goals
– Enhance features that help the user decide whether
document is relevant to their query
• Emphasize text that is relevant to query
– Text callouts
• Enlarge (make readable) text that might be
helpful in assessing page
– Enlarge headers
201
Slide by Woodruff & Rosenholtz
Text and Image Summaries
• Text summaries
– Lots of abstract, semantic information
• Image summaries (plain thumbnails)
– Layout, genre information
– Gist extraction faster than with text
• Benefits are complementary
• Create textually-enhanced thumbnails that
leverage the advantages of both text
summaries and plain thumbnails
202
Slide by Woodruff & Rosenholtz
Putting Callouts in a Separate
Visual Layer
• Transparency
• Occlusion
Junctions indicate the
occurrence of these
events.
203
Slide by Woodruff & Rosenholtz
Design Issues:
• Color Management
– Problems: Callouts need to be both readable and
draw attention
– Solution: Desaturate the background image, and use
a visual search model to choose appropriate colors
– Colors look like those in highlighter pens
• Resizing of Text
– Problem: We want to make certain text elements
readable, but not necessarily draw attention to them
– Solution: Modify the HTML before rendering the
thumbnail
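The desaturation step can be sketched with Pillow: reduce the color saturation of the page thumbnail so that the saturated, highlighter-like callouts drawn on top stand out. The file name and the 0.3 factor are hypothetical, and this covers only the background-processing step, not the full callout-placement algorithm.

```python
from PIL import Image, ImageEnhance

# Hypothetical rendered page thumbnail.
thumb = Image.open("page_thumbnail.png").convert("RGB")

# Reduce color saturation so the thumbnail recedes into the background;
# 0.0 would be grayscale, 1.0 the original colors.
muted = ImageEnhance.Color(thumb).enhance(0.3)

# (Callouts in saturated, highlighter-like colors would then be drawn on top.)
muted.save("page_thumbnail_muted.png")
```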
204
Slide by Woodruff & Rosenholtz
Examples
205
Slide by Woodruff & Rosenholtz
Tasks
• Criteria: tasks that…
– Are representative of common queries
– Have result sets with different characteristics
– Vary in the number of correct answers
• 4 types of tasks
Picture: “Find a picture of a giraffe in the wild.”
Homepage: “Find Kern Holoman’s homepage.”
Side-effects: “Find at least three side effects of halcion.”
E-commerce: “Find an e-commerce site where you can buy
a DVD player. Identify the price in dollars.”
206
Slide by Woodruff & Rosenholtz
Conditions
• Text summary
– Page title
– Extracted text with
query terms in bold
– URL
• Plain thumbnail
• Enhanced thumbnail
– Readable H1, H2 tags
– Highlighted callouts of
query terms
– Reduced contrast level
in thumbnail
207
Slide by Woodruff & Rosenholtz
Collections of Summaries
• 100 results in random order
Approximately same number of each
summary type on a page
208
Slide by Woodruff & Rosenholtz
Method
• Procedure
– 6 practice tasks
– 3 questions for each of the 4 task types
• e.g., each participant would do one E-commerce
question using text, one E-commerce question using
plain thumbnails, and one E-commerce question using
enhanced thumbnails
– Questions blocked by type of summary
– WebLogger recorded user actions during browsing
– Semi-structured interview
• Participants
– 12 members of the PARC community
Entire process took about 75 minutes
18 questions, with 100 query results each
209
Slide by Woodruff & Rosenholtz
Results
• Average total search times, by task:
– Picture: 61 secs
– Homepage: 80 secs
– E-commerce: 64 secs
– Side effects: 128 secs
• Results pooled across all tasks:
– Subjects searched 20 seconds faster with enhanced
thumbnails than with plain
– Subjects searched 30 seconds faster with enhanced
thumbnails than with text summaries
– Mean search time overall was 83 seconds
210
Slide by Woodruff & Rosenholtz
Results
(Chart omitted: normalized total search time in seconds, by summary type)
211
Slide by Woodruff & Rosenholtz
Results: User Responses
• Participants preferred enhanced thumbnails
– 7/12 preferred overall
– 5/12 preferred for certain task types
• Enhanced thumbnails are intuitive and less
work than text or plain thumbnails
– One subject said searching for information with text
summaries did not seem hard until he used the
enhanced thumbnails.
• Many participants reported using genre
information, cues from the callouts, the
relationship between search terms, etc.
214
Agenda
• Introduction
• Visual Principles
• What Works?
• Visualization in Analysis & Problem Solving
• Visualizing Documents & Search
• Comparing Visualization Techniques
• Design Exercise
• Wrap-Up
215
Comparing Approaches
216
Comparing 3 Commercial Systems
Alfred Kobsa, An Empirical Comparison of Three Commercial
Information Visualization Systems, INFOVIS'01.
217
Comparing 3 Commercial Systems
Eureka (InXight)
218
Comparing 3 Commercial Systems
InfoZoom (HumanIT)
219
Comparing 3 Commercial Systems
SpotFire
220
Slide by Alfred Kobsa
Infozoom Overview
• Presents data in three different views.
• Wide view shows the data set in a table format.
• Compressed view packs the data set horizontally to fit the window width.
• Overview mode has all attributes in ascending or descending order, independent of each other.
221
InfoZoom Overview View
222
Slide by Alfred Kobsa
InfoZoom Overview View
223
InfoZoom Compressed Table View
224
InfoZoom Wide Table View
225
Slide by Kunal Garach
Datasets
• Multidimensional data: three databases were used
• Anonymized data from a web-based dating service (60 records, 27 variables)
• Technical data of cars sold in 1970–82 (406 records, 10 variables)
• Data on the concentration of heavy metals in Sweden (2298 records, 14 variables)
226
Sample Questions
• Do more women than men want their partners
to have a higher education?
• What proportion of the men live in California?
• Do all people who think the bar is a good place to meet a mate also believe in love at first sight?
• Do heavier cars have more horsepower?
• Which manufacturer produced the most cars in
1980?
• Is there a relationship between the
displacement and acceleration of a vehicle?
227
Slide by Kunal Garach
Experiment Design
• The experimenters generated 26 tasks from all three data sets.
• 83 participants. Between-subjects design.
• Each was given one visualization system and all three data sets.
• Type of visualization system was the between-subjects independent variable.
• 30 minutes were given to solve the tasks of each data set, i.e., 26 tasks in 90 minutes.
228
Slide by Kunal Garach
Overall Results
• Mean task completion times:
– Infozoom users: 80 secs
– Spotfire users: 107 secs
– Eureka users: 110 secs
• Answer correctness:
– Infozoom users: 68%
– Spotfire users: 75%
– Eureka users: 71%
• Not a time-error tradeoff
• Spotfire was more accurate on only 6 questions
229
Slide by Kunal Garach
Eureka - problems
• Hidden labels: Labels are vertically aligned,
max 20 dimensions
• 3+ Attributes: Problems with queries
involving three or more attributes
• Correlation problems: Some participants had
trouble answering questions correctly that
involved correlations between two attributes.
230
Slide by Kunal Garach
Spotfire - problems
• Cognitive setup costs: Takes participants
considerable time to decide on the right
representation and to correctly set the coordinates
and parameters.
• Biased by scatterplot default: Though powerful,
many problems cannot be solved (well) with it.
231
Slide by Kunal Garach
Infozoom - problems
• Erroneous Correlations
• Overview mode has all attributes sorted
independent of each other
• Narrow row height in compressed view
• Participants did not use row expansion and
scatterplot charting function which shows
correlations more accurately
232
Geographic Questions
• Spotfire should have done better on these
• Which part of the country has the most copper?
• Is there a relationship between the concentration of vanadium and that of zinc?
• Is there a low-level chromium area that is high in vanadium?
• Spotfire was better only for the last question (out of 6 geographic ones)
233
Discussion
• Many studies of this kind use relatively simple tasks that mirror the strengths of the system
– Find the one object with the maximum value for a property
– Count how many of certain attributes there are
• This study looked at more complex, realistic, and varied questions.
234
Discussion
• Success of a visualization system depends on many factors:
• Properties supplied
– Spotfire doesn’t visualize as many dimensions simultaneously
• Operations
– Zooming is easy in InfoZoom; allows for drill-down as well
– Zooming in Eureka causes context to be lost
– Column view in Eureka makes labels hard to see
Information Exploration
“Shootout”
• http://ivpr.cs.uml.edu/shootout/about.html
• Data Mining Applications
• One component focuses on visualization
236
Slide by Craig Rixford
Comparing Tree Views
• T. Barlow and P. Neville, Comparison of 2D Visualizations of
Hierarchies, INFOVIS’01.
• Problem
– Organization Chart is de facto standard for
visualizing decision trees. Is there a better compact
view of the tree for the overview window?
• Solution
– Two usability studies to determine which tree works
best.
237
Goal: Compact View of Trees
T. Barlow and P. Neville, Comparison of 2D Visualizations of Hierarchies, INFOVIS’01.
238
Slide by Craig Rixford
Decision Trees
• Each split constitutes a rule
or variable in predictive
model
• Begin Splitting into nodes
• Often hundreds of leaves
239
Slide by Craig Rixford
Decision Trees – What makes a
good visualization
• Uses
– For novices: helps them understand models
– For experts: initial evaluation of decision trees without looking at the models
• Criteria for usability in study
– Ease of Interpretation of Topology (Parent Child
Sibling relations)
– Comparison of Node Size
– User preference
240
Slide by Craig Rixford
Different views examined in study
Org Chart Tree Ring Icicle Plot TreeMap
241
Slide by Craig Rixford
Usability Test 1:
• Users:
– 15 colleagues familiar with org chart but not others
• Tasks
– Is the tree binary or n-ary?
– Is the tree balanced or unbalanced?
– Find deepest common ancestor of two nodes
– Number of levels?
– Find three largest leaves (excluding org chart)
• Data: Created 8 trees for analysis
• Study Design
– Randomized order of tasks
– 4X5 design (almost)
– Timed task from appearance on screen until spacebar tap
242
Slide by Craig Rixford
Results
• Response Time
– TreeMap slowest; no statistical difference between
others
• Response Accuracy
– No significant difference
• User Preference
– Prefer icicle map and org chart (faster)
– Dislike tree map
243
Slide by Craig Rixford
Discussion
• Org chart served as benchmark
• Icicle plot favored amongst others
– Hypothesis: Same left to right / top to bottom
structure
• TreeRing did well
• TreeMap suffered from poor accuracy
– Offset between nested rectangles (which is needed for selection) distorted size comparisons
244
Slide by Craig Rixford
Usability Test II: Tree implementation
• Three views:
– TreeMap eliminated from this round
• Tasks
– Node Description
• Four versions – select those nodes or leaves that meet
certain criteria
– Node Analysis:
• Memorize a highlighted node – find again after tree
redrawn in different position
245
Slide by Craig Rixford
Results
• Tree rings slower for description but fast and
accurate for memory tasks
• Perhaps due to unique geometric forms /
spatial clues
246
Slide by Craig Rixford
Conclusions
• TreeMap not useful for this type of task
• Org Chart/Icicle seem to be best overall
• TreeRing has merits for certain tasks
• Icicle chosen for implementation
– Best design considering Org Chart could not be used
for node size tasks
• However:
– Didn’t seem to actually do tests on trees as large as
the ones they describe as typical of datamining
247
Visualizing Conversations
248
Slide by Maggie Law & Vivien Petras
Text-Based Chat
249
Slide by Maggie Law & Vivien Petras
Chat Circles
Fernanda Viegas and Judith Donath, Chat Circles,
Proceedings of CHI'99.
250
Slide by Maggie Law & Vivien Petras
Chat Circles
• “Chat Circles is a graphical interface for synchronous
communication that uses abstract shapes to convey identity and
activity.”
• Each participant appears as a colored circle, which is
accompanied by the user name
• Location of circles also helps identify participants (important when many users have similar colors)
• Participants’ circles become larger when posting occurs (circle
adapts to text length)
• Circle appears bright when posting occurs
• Circles of inactive users fade in the background
251
Slide by Maggie Law & Vivien Petras
Chat Circles –
Conversational Groupings
• There is only ONE room in Chat Circles
• Groupings are achieved by moving closer to other
participants
• At any time, a participant can view all other
participants
• A participant can also detect interesting
conversations in different areas of the room by
looking at how many circles are gathered and how
often circles become larger
• The overview panel in Chat Circles II is a nice example of focus + context
252
Slide by Maggie Law & Vivien Petras
Chat Circles History
253
Slide by Maggie Law & Vivien Petras
+ Easy to see “lurkers”
+ Sequence and size of
messages quickly visible
- Not very scalable
History Log Patterns
254
Slide by Maggie Law & Vivien Petras
History Log Patterns
+/- User-centric: only 1 point
of view represented
- Impossible to see all the
text at once – requires
individual mouse rollovers
- Easy to see “out of range”
conversations – but why
would you want to?
255
Agenda
• Introduction
• Visual Principles
• What Works?
• Visualization in Analysis & Problem Solving
• Visualizing Documents & Search
• Comparing Visualization Techniques
• Design Exercise
• Wrap-Up
256
Design Exercise
257
Design Exercise
• BreakingStory
(Reffel, Fitzpatrick, Ayedelott SIMS final project, at CHI 2003)
– Create an application that provides a visualization of trends over time in web-based news. The primary
purpose is to provide an overview, but it should also
be possible to view text from individual news sources
on specific days. Its goal is to inform, inspire, and
enlighten, and also to make people want to look
more deeply at the news.
258
Sample Solution
259
260
261
262
Another Approach: ThemeRiver
• S. Havre, B. Hetzler, L. Nowell, "ThemeRiver: Visualizing Theme Changes over
Time," Proc. IEEE Symposium on Information Visualization, 2000
263
Wrap-up: Guidelines for Success
264
Key Questions to Ask about a Viz
1. Is it for analysis or presentation?
2. What does it teach/show/elucidate?
3. What is the key contribution?
4. What are some compelling, useful examples?
5. Could it have been done more simply?
6. Have there been usability studies done?
What do they show?
265
Holistic Design Goals for
Information Visualization
– Tailor to the application and the domain
– Create highly interactive and integrated
systems
– Embed the visualization within a larger
application
– Provide alternative views
266
Visualization with a Light Touch: Orbitz.com
267
Visualization with a Light Touch:
Orbitz.com
268
Visualization with a Light Touch:
Orbitz.com
269
Visualization with a Light Touch: Orbitz.com
270
Visualization with a Light Touch: Orbitz.com
271
For more information
• My course:
• http://www.sims.berkeley.edu/courses/is247/s02/Lectures.html
• Atlas of Cyberspaces:
• http://www.geog.ucl.ac.uk/casa/martin/atlas/atlas.html
• Gallery of Data Visualization: The Best and Worst of Statistical Graphics
• http://www.math.yorku.ca/SCS/Gallery/
• Tamara Munzner’s collection:
• http://graphics.stanford.edu/courses/cs348c-96-fall/resources.html
272
Thank you!

chi03-tutorial.ppt

  • 1.
    1 Information Visualization: Principles, Promise,and Pragmatics Marti Hearst CHI 2003 Tutorial
  • 2.
    2 Agenda • Introduction • VisualPrinciples • What Works? • Visualization in Analysis & Problem Solving • Visualizing Documents & Search • Comparing Visualization Techniques • Design Exercise • Wrap-Up
  • 3.
    3 Introduction • Goals ofInformation Visualization • Case Study: The Journey of the TreeMap • Key Questions
  • 4.
    4 What is InformationVisualization? Visualize: to form a mental image or vision of … Visualize: to imagine or remember as if actually seeing. American Heritage dictionary, Concise Oxford dictionary
  • 5.
    5 What is InformationVisualization? “Transformation of the symbolic into the geometric” (McCormick et al., 1987) “... finding the artificial memory that best supports our natural means of perception.'' (Bertin, 1983) The depiction of information using spatial or graphical representations, to facilitate comparison, pattern recognition, change detection, and other cognitive skills by making use of the visual system.
  • 6.
    6 Information Visualization • Problem: –HUGE Datasets: How to understand them? • Solution – Take better advantage of human perceptual system – Convert information into a graphical representation. • Issues – How to convert abstract information into graphical form? – Do visualizations do a better job than other methods?
  • 7.
  • 8.
    8 Image from mapquest.com ThePower of Visualization 1. Start out going Southwest on ELLSWORTH AVE Towards BROADWAY by turning right. 2: Turn RIGHT onto BROADWAY. 3. Turn RIGHT onto QUINCY ST. 4. Turn LEFT onto CAMBRIDGE ST. 5. Turn SLIGHT RIGHT onto MASSACHUSETTS AVE. 6. Turn RIGHT onto RUSSELL ST.
  • 9.
    9 The Power ofVisualization Line drawing tool by Maneesh Agrawala http://graphics.stanford.edu/~maneesh/
  • 10.
    10 Visualization Success Story Mystery:what is causing a cholera epidemic in London in 1854?
  • 11.
    11 Visualization Success Story FromVisual Explanations by Edward Tufte, Graphics Press, 1997 Illustration of John Snow’s deduction that a cholera epidemic was caused by a bad water pump, circa 1854. Horizontal lines indicate location of deaths.
  • 12.
    12 Visualization Success Story FromVisual Explanations by Edward Tufte, Graphics Press, 1997 Illustration of John Snow’s deduction that a cholera epidemic was caused by a bad water pump, circa 1854. Horizontal lines indicate location of deaths.
  • 13.
    13 Purposes of InformationVisualization To help: Explore Calculate Communicate Decorate
  • 14.
    14 Two Different PrimaryGoals: Two Different Types of Viz Explore/Calculate Analyze Reason about Information Communicate Explain Make Decisions Reason about Information
  • 15.
    15 Goals of InformationVisualization More specifically, visualization should: – Make large datasets coherent (Present huge amounts of information compactly) – Present information from various viewpoints – Present information at several levels of detail (from overviews to fine structure) – Support visual comparisons – Tell stories about the data
  • 16.
    16 Why Visualization? Use theeye for pattern recognition; people are good at scanning recognizing remembering images Graphical elements facilitate comparisons via length shape orientation texture Animation shows changes across time Color helps make distinctions Aesthetics make the process appealing
  • 17.
    18 The Need forCritical Analysis • We see many creative ideas, but they often fail in practice • The hard part: how to apply it judiciously – Inventors usually do not accurately predict how their invention will be used • This tutorial will emphasize – Getting past the coolness factor – Examining usability studies
  • 18.
    19 Case Study: The Journeyof the TreeMap • The TreeMap (Johnson & Shneiderman ‘91) • Idea: – Show a hierarchy as a 2D layout – Fill up the space with rectangles representing objects – Size on screen indicates relative size of underlying objects.
  • 19.
  • 20.
    21 Treemap Problems • Toodisorderly – What does adjacency mean? – Aspect ratios uncontrolled leads to lots of skinny boxes that clutter • Color not used appropriately – In fact, is meaningless here • Wrong application – Don’t need all this to just see the largest files in the OS
  • 21.
    22 Successful Application ofTreemaps • Think more about the use – Break into meaningful groups – Fix these into a useful aspect ratio • Use visual properties properly – Use color to distinguish meaningfully • Use only two colors: – Can then distinguish one thing from another • When exact numbers aren’t very important • Provide excellent interactivity – Access to the real data – Makes it into a useful tool
  • 22.
  • 23.
    24 A Good Useof TreeMaps and Interactivity www.smartmoney.com/marketmap
  • 24.
  • 25.
    26 Analysis vs. Communication •MarketMap’s use of TreeMaps allows for sophisticated analysis • Peets’ use of TreeMaps is more for presentation and communication • This is a key contrast
  • 26.
    27 Open Issues • Doesvisualization help? – The jury is still out – Still supplemental at best for text collections • A correlation with spatial ability • Learning effects: with practice ability on visual display begins to equal that of text • Does visualization sell? – Jury is still out on this one too! • This is a hot area! More ideas will appear!
  • 27.
    28 Key Questions toAsk about a Viz 1. What does it teach/show/elucidate? 2. What is the key contribution? 3. What are some compelling, useful examples? 4. Could it have been done more simply? 5. Have there been usability studies done? What do they show?
  • 28.
    29 What we arenot covering • Scientific visualization • Statistics • Cartography (maps) • Education • Games • Computer graphics in general • Computational geometry
  • 29.
    30 Agenda • Introduction • VisualPrinciples • What Works? • Visualization in Analysis & Problem Solving • Visualizing Documents & Search • Comparing Visualization Techniques • Design Exercise • Wrap-Up
  • 30.
  • 31.
    32 Visual Principles – Typesof Graphs – Pre-attentive Properties – Relative Expressiveness of Visual Cues – Visual Illusions – Tufte’s notions • Graphical Excellence • Data-Ink Ratio Maximization • How to Lie with Visualization
  • 32.
    33 References for VisualPrinciples • Kosslyn: Types of Visual Representations • Lohse et al: How do people perceive common graphic displays • Bertin, MacKinlay: Perceptual properties and visual features • Tufte/Wainer: How to mislead with graphs
  • 33.
    34 A Graph is:(Kosslyn) • A visual display that illustrates one or more relationships among entities • A shorthand way to present information • Allows a trend, pattern, or comparison to be easily apprehended
  • 34.
    35 Types of SymbolicDisplays (Kosslyn 89) • Graphs • Charts • Maps • Diagrams Type name here Type title here Type name here Type title here Type name here Type title here Type name here Type title here
  • 35.
    Types of SymbolicDisplays • Graphs – at least two scales required – values associated by a symmetric “paired with” relation • Examples: scatter-plot, bar-chart, layer-graph
  • 36.
    Types of SymbolicDisplays Charts – discrete relations among discrete entities – structure relates entities to one another – lines and relative position serve as links Examples: family tree flow chart network diagram
  • 37.
    Types of SymbolicDisplays • Maps – internal relations determined (in part) by the spatial relations of what is pictured – labels paired with locations Examples: map of census data topographic maps From www.thehighsierra.com
  • 38.
    Types of SymbolicDisplays Diagrams – schematic pictures of objects or entities – parts are symbolic (unlike photographs) • how-to illustrations • figures in a manual From Glietman, Henry. Psychology. W.W. Norton and Company, Inc. New York, 1995
  • 39.
    Anatomy of aGraph (Kosslyn 89) • Framework – sets the stage – kinds of measurements, scale, ... • Content – marks – point symbols, lines, areas, bars, … • Labels – title, axes, tic marks, ...
  • 40.
    Basic Types ofData • Nominal (qualitative) – (no inherent order) – city names, types of diseases, ... • Ordinal (qualitative) – (ordered, but not at measurable intervals) – first, second, third, … – cold, warm, hot • Interval (quantitative) – list of integers or reals
  • 41.
    Common Graph Types lengthof page length of access URL # of accesses length of access # of accesses length of access length of page 0 5 10 15 20 25 30 35 40 45 short medium long very long days # of accesses url 1 url 2 url 3 url 4 url 5 url 6 url 7 # of accesses
  • 42.
    Combining Data Typesin Graphs Nominal Nominal Nominal Ordinal Nominal Interval Ordinal Ordinal Ordinal Interval Interval Interval Examples?
  • 43.
    Scatter Plots • Qualitativelydetermine if variables – are highly correlated • linear mapping between horizontal & vertical axes – have low correlation • spherical, rectangular, or irregular distributions – have a nonlinear relationship • a curvature in the pattern of plotted points • Place points of interest in context – color representing special entities
  • 44.
    When to usewhich type? • Line graph – x-axis requires quantitative variable – Variables have contiguous values – familiar/conventional ordering among ordinals • Bar graph – comparison of relative point values • Scatter plot – convey overall impression of relationship between two variables • Pie Chart? – Emphasizing differences in proportion among a few numbers
  • 45.
    Classifying Visual Representations Lohse, GL; Biolsi, K; Walker, N and H H Rueter, A Classification of Visual Representations CACM, Vol. 37, No. 12, pp 36-49, 1994 Participants sorted 60 items into categories Other participants assigned labels from Likert scales Experimenters clustered the results various ways.
  • 46.
    Subset of ExampleVisual Representations From Lohse et al. 94
  • 47.
    Subset of ExampleVisual Representations From Lohse et al. 94
  • 48.
    Likert Scales (and percentageof variance explained) 16.0 emphasizes whole – parts 11.3 spatial – nonspatial 10.6 static structure – dynamic structure 10.5 continuous – discrete 10.3 attractive – unattractive 10.1 nontemporal – temporal 9.9 concrete – abstract 9.6 hard to understand – easy 9.5 nonnumeric – numeric 2.2 conveys a lot of info – conveys little
  • 49.
    Experimentally Motivated Classification (Lohseet al. 94) • Graphs • Tables (numerical) • Tables (graphical) • Charts (time) • Charts (network) • Diagrams (structure) • Diagrams (network) • Maps • Cartograms • Icons • Pictures
  • 50.
    Interesting Findings Lohse etal. 94 • Photorealistic images were least informative – Echos results in icon studies – better to use less complex, more schematic images • Graphs and tables are the most self-similar categories – Results in the literature comparing these are inconclusive • Cartograms were hard to understand – Echos other results – better to put points into a framed rectangle to aid spatial perception • Temporal data more difficult to show than cyclic data – Recommend using animation for temporal data
  • 51.
    Visual Properties • PreattentiveProcessing • Accuracy of Interpretation of Visual Properties • Illusions and the Relation to Graphical Integrity All Preattentive Processing figures from Healey 97 http://www.csc.ncsu.edu/faculty/healey/PP/PP.html
  • 52.
    Preattentive Processing • Alimited set of visual properties are processed preattentively – (without need for focusing attention). • This is important for design of visualizations – what can be perceived immediately – what properties are good discriminators – what can mislead viewers
  • 53.
    Example: Color Selection Viewercan rapidly and accurately determine whether the target (red circle) is present or absent. Difference detected in color.
  • 54.
    Example: Shape Selection Viewercan rapidly and accurately determine whether the target (red circle) is present or absent. Difference detected in form (curvature)
  • 55.
    Pre-attentive Processing • <200 - 250ms qualifies as pre-attentive – eye movements take at least 200ms – yet certain processing can be done very quickly, implying low-level processing in parallel • If a decision takes a fixed amount of time regardless of the number of distractors, it is considered to be preattentive.
  • 56.
    Example: Conjunction of Features Viewercannot rapidly and accurately determine whether the target (red circle) is present or absent when target has two or more features, each of which are present in the distractors. Viewer must search sequentially. All Preattentive Processing figures from Healey 97 http://www.csc.ncsu.edu/faculty/healey/PP/PP.html
  • 57.
    Example: Emergent Features Targethas a unique feature with respect to distractors (open sides) and so the group can be detected preattentively.
  • 58.
    Example: Emergent Features Targetdoes not have a unique feature with respect to distractors and so the group cannot be detected preattentively.
  • 59.
    Asymmetric and Graded PreattentiveProperties • Some properties are asymmetric – a sloped line among vertical lines is preattentive – a vertical line among sloped ones is not • Some properties have a gradation – some more easily discriminated among than others
  • 60.
    Use Grouping ofWell-Chosen Shapes for Displaying Multivariate Data
  • 61.
    SUBJECT PUNCHED QUICKLYOXIDIZED TCEJBUS DEHCNUP YLKCIUQ DEZIDIXO CERTAIN QUICKLY PUNCHED METHODS NIATREC YLKCIUQ DEHCNUP SDOHTEM SCIENCE ENGLISH RECORDS COLUMNS ECNEICS HSILGNE SDROCER SNMULOC GOVERNS PRECISE EXAMPLE MERCURY SNREVOG ESICERP ELPMAXE YRUCREM CERTAIN QUICKLY PUNCHED METHODS NIATREC YLKCIUQ DEHCNUP SDOHTEM GOVERNS PRECISE EXAMPLE MERCURY SNREVOG ESICERP ELPMAXE YRUCREM SCIENCE ENGLISH RECORDS COLUMNS ECNEICS HSILGNE SDROCER SNMULOC SUBJECT PUNCHED QUICKLY OXIDIZED TCEJBUS DEHCNUP YLKCIUQ DEZIDIXO CERTAIN QUICKLY PUNCHED METHODS NIATREC YLKCIUQ DEHCNUP SDOHTEM SCIENCE ENGLISH RECORDS COLUMNS ECNEICS HSILGNE SDROCER SNMULOC
  • 62.
    SUBJECT PUNCHED QUICKLYOXIDIZED TCEJBUS DEHCNUP YLKCIUQ DEZIDIXO CERTAIN QUICKLY PUNCHED METHODS NIATREC YLKCIUQ DEHCNUP SDOHTEM SCIENCE ENGLISH RECORDS COLUMNS ECNEICS HSILGNE SDROCER SNMULOC GOVERNS PRECISE EXAMPLE MERCURY SNREVOG ESICERP ELPMAXE YRUCREM CERTAIN QUICKLY PUNCHED METHODS NIATREC YLKCIUQ DEHCNUP SDOHTEM GOVERNS PRECISE EXAMPLE MERCURY SNREVOG ESICERP ELPMAXE YRUCREM SCIENCE ENGLISH RECORDS COLUMNS ECNEICS HSILGNE SDROCER SNMULOC SUBJECT PUNCHED QUICKLY OXIDIZED TCEJBUS DEHCNUP YLKCIUQ DEZIDIXO CERTAIN QUICKLY PUNCHED METHODS NIATREC YLKCIUQ DEHCNUP SDOHTEM SCIENCE ENGLISH RECORDS COLUMNS ECNEICS HSILGNE SDROCER SNMULOC Text NOT Preattentive
  • 63.
    Preattentive Visual Properties (Healey97) length Triesman & Gormican [1988] width Julesz [1985] size Triesman & Gelade [1980] curvature Triesman & Gormican [1988] number Julesz [1985]; Trick & Pylyshyn [1994] terminators Julesz & Bergen [1983] intersection Julesz & Bergen [1983] closure Enns [1986]; Triesman & Souther [1985] colour (hue) Nagy & Sanchez [1990, 1992]; D'Zmura [1991] Kawai et al. [1995]; Bauer et al. [1996] intensity Beck et al. [1983]; Triesman & Gormican [1988] flicker Julesz [1971] direction of motion Nakayama & Silverman [1986]; Driver & McLeod [1992] binocular lustre Wolfe & Franzel [1988] stereoscopic depth Nakayama & Silverman [1986] 3-D depth cues Enns [1990] lighting direction Enns [1990]
  • 64.
    Gestalt Properties • Gestalt:form or configuration • Idea: forms or patterns transcend the stimuli used to create them. – Why do patterns emerge? – Under what circumstances? Why perceive pairs vs. triplets?
  • 65.
    Gestalt Laws ofPerceptual Organization (Kaufman 74) • Figure and Ground – Escher illustrations are good examples – Vase/Face contrast • Subjective Contour
  • 66.
    More Gestalt Laws •Law of Proximity – Stimulus elements that are close together will be perceived as a group • Law of Similarity – like the preattentive processing examples • Law of Common Fate – like preattentive motion property • move a subset of objects among similar ones and they will be perceived as a group
  • 67.
    Which Properties are Appropriatefor Which Information Types?
  • 68.
    Accuracy Ranking ofQuantitative Perceptual Tasks Estimated; only pairwise comparisons have been validated (Mackinlay 88 from Cleveland & McGill)
  • 69.
    Interpretations of VisualProperties Some properties can be discriminated more accurately but don’t have intrinsic meaning (Senay & Ingatious 97, Kosslyn, others) – Density (Greyscale) Darker -> More – Size / Length / Area Larger -> More – Position Leftmost -> first, Topmost -> first – Hue ??? no intrinsic meaning – Slope ??? no intrinsic meaning
  • 70.
    QUANTITATIVE ORDINAL NOMINAL PositionPosition Position Length Density Color Hue Angle Color Saturation Texture Slope Color Hue Connection Area Texture Containment Volume Connection Density Density Containment Color Saturation Color Saturation Length Shape Color Hue Angle Length Ranking of Applicability of Properties for Different Data Types (Mackinlay 88, Not Empirically Verified)
  • 73.
    Color Purposes • Callattention to specific items • Distinguish between classes of items – Increases the number of dimensions for encoding • Increase the appeal of the visualization
  • 74.
    Using Color • Proceedwith caution – Less is more – Representing magnitude is tricky • Examples – Red-orange-yellow-white • Works for costs • Maybe because people are very experienced at reasoning shrewdly according to cost – Green-light green-light brown-dark brown-grey-white works for atlases – Grayscale is unambiguous but has limited range
  • 75.
    Visual Illusions • Peopledon’t perceive length, area, angle, brightness they way they “should”. • Some illusions have been reclassified as systematic perceptual errors – e.g., brightness contrasts (grey square on white background vs. on black background) – partly due to increase in our understanding of the relevant parts of the visual system • Nevertheless, the visual system does some really unexpected things.
  • 76.
    Illusions of LinearExtent • Mueller-Lyon (off by 25-30%) • Horizontal-Vertical
  • 77.
    Illusions of Area •Delboeuf Illusion • Height of 4-story building overestimated by approximately 25%
  • 78.
    What are goodguidelines for Infoviz? • Use graphics appropriately – Don’t use images gratuitously – Don’t lie with graphics! • Link to original data – Don’t conflate area with other information • E.g., use area in map to imply amount • Make it interactive (feedback) – Brushing and linking – Multiple views – Overview + details • Match mental models
  • 79.
    80 Tufte • Principles ofGraphical Excellence – Graphical excellence is • the well-designed presentation of interesting data – a matter of substance, of statistics, and of design • consists of complex ideas communicated with clarity, precision and efficiency • is that which gives to the viewer the greates number of ideas in the shortest time with the least ink in the smallest space • requires telling the truth about the data.
  • 80.
    81 Tufte’s Notion ofData Ink Maximization • What is the main idea? – draw viewers attention to the substance of the graphic – the role of redundancy – principles of editing and redesign • What’s wrong with this? What is he really getting at?
  • 81.
    82 Tufte Principle Maximize thedata-ink ratio: data ink Data-ink ratio = -------------------------- total ink used in graphic Avoid “chart junk”
  • 82.
    83 Tufte Principles • Usemultifunctioning graphical elements • Use small multiples • Show mechanism, process, dynamics, and causality • High data density – Number of items/area of graphic – This is controversial • White space thought to contribute to good visual design • Tufte’s book itself has lots of white space
  • 83.
    84 Tufte’s Graphical Integrity •Some lapses intentional, some not • Lie Factor = size of effect in graph size of effect in data • Misleading uses of area • Misleading uses of perspective • Leaving out important context • Lack of taste and aesthetics
  • 84.
    85 From Tim Craven’sLIS 504 course http://instruct.uwo.ca/fim-lis/504/504gra.htm#data-ink_ratio
  • 85.
    86 How to Exaggeratewith Graphs from Tufte ’83 “Lie factor” = 2.8
  • 86.
    87 How to Exaggeratewith Graphs from Tufte ’83 Error: Shrinking along both dimensions
  • 87.
    88 Howard Wainer How toDisplay Data Badly (Video) http://www.dartmouth.edu/~chance/ChanceLecture/AudioVideo.html
  • 88.
    89 Agenda • Introduction • VisualPrinciples • What Works? • Visualization in Analysis & Problem Solving • Visualizing Documents & Search • Comparing Visualization Techniques • Design Exercise • Wrap-Up
  • 89.
  • 90.
    91 Promising Techniques &Approaches • Perceptual Techniques – Animation – Grouping / Gestalt principles – Using size to indicate quantity – Color for Accent, Distinction, Selection • NOT FOR QUANTITY!!!! • General Approaches – Standard Techniques • Graphs, bar charts, tables – Brushing and Linking – Providing Multiple Views and Models – Aesthetics!
  • 91.
    92 Standard Techniques • It’soften hard to beat: – Line graphs, bar charts – Scatterplots (or Scatterplot Matrix) – Tables • A Darwinian view of visualizations: – Only the fittest survive – We are in a period of great experimentation; eventually it will be clear what works and what dies out. • A bright spot: – Enhancing the old techniques with interactivity – Example: Spotfire • Adds interactivity, color highlighting, zooming to scatterplots – Example: TableLens / Eureka • Adds interactivity and length cues to tables
  • 92.
  • 93.
  • 94.
    Brushing and Linking •Interactive technique – Highlighting – Brushing and Linking • At least two things must be linked together to allow for brushing – select a subset of points – see the role played by this subset of points in one or more other views • Example systems – Graham Will’s EDV system – Ahlberg & Sheiderman’s IVEE (Spotfire)
  • 95.
    Linking types ofassist behavior to position played (from Eick & Wills 95)
  • 96.
    Baseball data: Scatterplots andhistograms and bars (from Eick & Wills 95) select high salaries avg career HRs vs avg career hits (batting ability) avg assists vs avg putouts (fielding ability) how long in majors distribution of positions played
  • 97.
    What was learnedfrom interaction with this baseball data? – Seems impossible to earn a high salary in the first three years – High salaried players have a bimodal distribution (peaking around 7 & 13 yrs) – Hits/Year a better indicator of salary than HR/Year – High paid outlier with low HR and medium hits/year. Reason: person is player-coach – There seem to be two differentiated groups in the put-outs/assists category (but not correlated with salary) Why?
  • 98.
    99 Slide by SaifonObromsook & Linda Harjono Animation • “The quality or condition of being alive, active, spirited, or vigorous” (dictionary.com) • “A dynamic visual statement that evolves through movement or change in the display” • “… creating the illusion of change by rapidly displaying a series of single frames” (Roncarelli 1988).
  • 99.
    100 Slide by SaifonObromsook & Linda Harjono We Use Animation to… • Tell stories / scenarios: cartoons • Illustrate dynamic process / simulation • Create a character / an agent • Navigate through virtual spaces • Draw attention • Delight
  • 100.
    101 Slide by SaifonObromsook & Linda Harjono Cartoon Animation Principles • Chang & Unger ‘93 • Solidity (squash and stretch) – Solid drawing – Motion blur – Dissolves • Exaggeration – Anticipation – Follow through • Reinforcement – Slow in and slow out – Arcs – Follow through
  • 101.
    102 Slide by SaifonObromsook & Linda Harjono Why Cartoon-Style Animation? • Cartoons’ theatricality is powerful in communicating to the user. • Cartoons can make UI engage the user into its world. • The medium of cartoon animation is like that of graphic computers.
  • 102.
    103 Application using Animation: Gnutellavision •Visualization of Peer-to-Peer Network – Hosts (with color for status and size for number of files) – Nodes with closer network distance from focus on inner rings – Queries shown; can trace queries • Gnutellavision as exploratory tool – Very few hosts share many files – Uneven propagation of queries – Qualitative assessment of queries (simple)
  • 103.
  • 104.
    105 Animation in Gnutellavision Goalof animation is to help maintain context of nodes and general orientation of user during refocus • Transition Paths – Linear interpolation of polar coordinates – Node moves in arc not straight line – Moves along circle if not changing levels (like great circles on earth) – Spirals in or out to next ring
  • 105.
    106 Animation (continued) • Transitionconstraints – Orientation of transition to minimize rotational travel – (Move former parent away from new focus in same orientation) – Avoid cross-over of edges – (to allow users to keep track of which is which) • Animation timing – Slow in Slow out timing (allows users to better track movement)
  • 106.
  • 107.
  • 108.
    109 Usability Testing • Ingeneral, users appreciated the subtleties added to the general method when the number of nodes increased. • Perhaps the most interesting result is that most people preferred rectangular movement for the small graph and polar coordinate movement for the large one. Overall Preference of Users No Features All Features Small Graph 5 5 Large Graph 1 9
  • 109.
    110 Hyperbolic Tree • AFocus+Context Technique Based on Hyperbolic Geometry for Visualizing Large Hierarchies (1995) John Lamping, Ramana Rao, Peter Pirolli Proc. ACM Conf. Human Factors in Computing Systems, CHI • Also uses animation • Tree-based layout; leaves stretch to infinity • Only a few labels can be seen at a time
  • 110.
  • 111.
  • 112.
  • 113.
  • 114.
    115 Issues • Displaying text –The size of the text • Works good for small things like directories • Not so good for URLs • Only a portion of the data can be seen in the focus at one time • Only works for certain types of data - Hierarchical • Not clear if it is actually useful for anything.
  • 115.
    116 Slide by SaifonObromsook & Linda Harjono Animating Algorithms • Kehoe, Stasko, and Taylor, “Rethinking Evaluation of Algorithm Animations as Learning Aids” • Why previous studies present no benefits: – No or limited benefits from particular animations – Benefits are not captured in measurements – Design of experiments hides the benefits • Methods for this study: – Combination of qualitative & quantitative – More flexible setting – Metrics: score for each type of questions, time used, usage of materials, qualitative data from observations & interviews
  • 116.
  • 117.
    118 Slide by SaifonObromsook & Linda Harjono Findings • Value of animation is more apparent in interactive situations • Most useful to learn procedural operations • Makes subject more accessible & less intimidating  increase motivation
  • 118.
    119 What Isn’t Working? Theexisting studies indicate that we don’t yet know how to make the following work well for every-day tasks: – Pan-and-Zoom – 3D Navigation – Node-and-link representations of concept spaces
  • 119.
    120 Zoom, Overview +Detail • An exception, possibly: – Benjamin B. Bederson: PhotoMesa: a zoomable image browser using quantum treemaps and bubblemaps. UIST 2001: 71-80
  • 120.
    121 Overview + Detail •K. Hornbaek et al., Navigation patterns and Usability of Zoomable User Interfaces with and without an Overview, ACM TOCHI, 9(4), December 2002.
  • 121.
    122 Overview + Detail •K. Hornbaek et al., Navigation patterns and Usability of Zoomable User Interfaces with and without an Overview, ACM TOCHI, 9(4), December 2002. • A study on integrating Overview + Detail on a Map search task – Incorporating panning & zooming as well. – They note that panning & zooming does not do well in most studies. • Results seem to be – Subjectively, users prefer to have a linked overview – But they aren’t necessarily faster or more effective using it – Well-constructed representation of the underlying data may be more important. • More research needed as each study seems to turn up different results, sensitive to underlying test set.
  • 122.
    123 Agenda • Introduction • VisualPrinciples • What Works? • Visualization in Analysis & Problem Solving • Visualizing Documents & Search • Comparing Visualization Techniques • Design Exercise • Wrap-Up
  • 123.
  • 124.
    125 Problem Solving • ADetective Tool for Multidimensional Data – Inselberg on using Parallel Coordinates • Analyzing Web Clickstream Data – Brainerd & Becker, Waterson et al. • Information Visualization for Pattern Detection – Carlis & Konstan on Periodic Data • Visualization vs. Analysis – Comments by Wesley Johnson of Chevron
  • 125.
    126 Multidimensional Detective A. Inselberg,Multidimensional Detective, Proceedings of IEEE Symposium on Information Visualization (InfoVis '97), 1997.
  • 126.
    127 A Detective Story A.Inselberg, Multidimensional Detective, Proceedings of IEEE Symposium on Information Visualization (InfoVis '97), 1997 Inselberg’s Principles for analysis using visualizations: 1. Do not let the picture scare you 2. Understand your objectives – Use them to obtain visual cues 3. Carefully scrutinize the picture 4. Test your assumptions, especially the “I am really sure of’s” 5. You can’t be unlucky all the time!
  • 127.
    128 A Detective Story A.Inselberg, Multidimensional Detective, Proceedings of IEEE Symposium on Information Visualization (InfoVis '97), 1997 • The Dataset: – Production data for 473 batches of a VLSI chip – 16 process parameters – The yield: % of produced chips that are useful • X1 – The quality of the produced chips (speed) • X2 – 10 types of defects (zero defects shown at top) • X3 … X12 – 4 physical parameters • X13 … X16 • The Objective: – Raise the yield (X1) and maintain high quality (X2)
  • 128.
    129 Multidimensional Detective A. Inselberg,Multidimensional Detective, Proceedings of IEEE Symposium on Information Visualization (InfoVis '97), 1997. Do Not Let the Picture Scare You!!
  • 129.
    130 Multidimensional Detective • Eachline represents the values for one batch of chips • This figure shows what happens when only those batches with both high X1 and high X2 are chosen • Notice the separation in values at X15 • Also, some batches with few X3 defects are not in this high-yield/high-quality group.
  • 130.
    131 Multidimensional Detective • Nowlook for batches which have nearly zero defects. – For 9 out of 10 defect categories • Most of these have low yields • Surprising because we know from first diagram that some defects are ok. • Go back to first diagram, looking at defect categories • Notice that X6 behaves differently than the rest • Allow two defects, where one defect in X6 • This results in the very best batch appearing
  • 131.
    132 Multidimensional Detective • Fig5 and 6 show that high yield batches don’t have non-zero values for defects of type X3 and X6 – Don’t believe your assumptions … • Looking now at X15 we see the separation is important – Lower values of this property end up in the better yield batches
  • 132.
    133 Automated Analysis A. Inselberg,Automated Knowledge Discovery using Parallel Coordinates, INFOVIS ‘99
  • 133.
    134 Slide by WayneKao Case Study: E-Commerce Clickstream Visualization • Brainerd & Becker, IEEE Infovis 2001 • Aggregate nodes using an icon (e.g. all the checkout pages) • Edges represent transitions – Wider means more transitions
  • 134.
    135 Slide by WayneKao Customer Segments • Collect – Clickstream – Purchase history – Demographic data • Associates customer data with their clickstream • Different color for each customer segment
  • 135.
    136 Slide by WayneKao Layout • Aggregation based on file system path
  • 136.
    137 Slide by WayneKao Initial Findings • Gender shopping differences
  • 137.
    138 Slide by WayneKao Initial Findings (cont) • Checkout process analysis • Newsletter hurting sales
  • 138.
    139 Slide by WayneKao WebQuilt Interactive, zoomable directed graph • Nodes = web pages • Edges = aggregate traffic between pages Waterson et al.,``What Did They Do? Understanding Clickstreams with the WebQuilt Visualization System.'' in AVI 2002.
  • 139.
    140 Slide by WayneKao Directed graph • Nodes: visited pages – Color marks entry and exit nodes • Arrows: traversed links – Thicker: more heavily traversed – Color • Red/yellow: Time spend before clicking • Blue: optimal path chosen by designer
  • 140.
  • 141.
    142 Slide by WayneKao Pilot Usability Study • Edmunds.com PDA web site • Visor Handspring equipped with a OmniSky wireless modem • 10 users asked to find… – Anti-lock brake information on the latest Nissan Sentra model – The Nissan dealer closest to them.
  • 142.
    143 Slide by WayneKao In the Lab vs. Out in the Wild Comparing in-lab usability testing with WebQuilt remote usability testing • 5 users were tested in the lab • 5 were given the device and asked to perform the task at their convenience • All task directions, demographic data, and follow up questionnaire data was presented and collected in web forms as part of the WebQuilt testing framework.
  • 143.
  • 144.
  • 145.
  • 146.
    147 Slide by WayneKao Browser Device Interact before load (3) No forward button (2) Difficulty with input in questionnaire (3) Difficulty scrolling (2) Device errors unrelated to testing (1) Tried writing on screen (0) Site Design Test Design  Falsely completed task (4)  Long download times (4)  Ping-pong behavior (3)  Interact before load (3)  Too much scrolling (2)  Save address functionality not clear (1)  Back button navigation (0)  Would like more features (0)  Finds site useful (0)  Falsely completed task (4)  Difficulty remembering task description (3)  Difficulty with input in questionnaire (3)  Questionnaire wording problems (3)  Forgot how to end task (1)  Confusing task description (1) Findings
  • 147.
    148 Slide by WayneKao Findings • WebQuilt methodology is promising for uncovering site design related issues. • 1/3 of the issues were device or browser related. • Browser and device issues can not be captured automatically with WebQuilt unless they cause an interaction with the server • Can be revealed via the questionnaire data.
  • 148.
    149 Visualization for Analysis •Carlis & Konstan, UIST 1998 • Problem: data that is both periodic and serial – Time students spend on different activities – Tree growth patterns • Time: which year • Period: yearly – Multi-day races such as the Tour de France – Calendars arbitrarily wrap around at end of month – Octaves in music • How to find patterns along both dimensions?
  • 149.
    150 Analyzing Complex PeriodicData Carlis & Konstan, UIST 1998.
  • 150.
    151 Analyzing Complex PeriodicData Carlis & Konstan, UIST 1998. •Consumption values for each month appear as spikes •Each food has its own color •Boundary line (in black) shows when season begins/ends
  • 151.
  • 152.
    153 Visualization vs. Analysis? •Applications to data mining and data discovery. • Wesley Johnson ’02: – Visualization tools are helpful for exploring hunches and presenting results • Examples: scatterplots – They are the WRONG primary tool when the goal is to find a good classifier model in a complex situation. – Need: • Solid insight into the domain and problem • Tools that visualize several alternative models. • Emphasize “model visualization” rather than “data visualization”
  • 153.
    154 Agenda • Introduction • VisualPrinciples • What Works? • Visualization in Analysis & Problem Solving • Visualizing Documents & Search • Comparing Visualization Techniques • Design Exercise • Wrap-Up
  • 154.
  • 155.
    156 Documents and Search •Why Visualize Text? • Why Text is Tough • Visualizing Concept Spaces – Clusters – Category Hierarchies • Visualizing Retrieval Results • Usability Study Meta-Analysis
  • 156.
    157 Why Visualize Text? •To help with Information Retrieval – give an overview of a collection – show user what aspects of their interests are present in a collection – help user understand why documents retrieved as a result of a query • Text Data Mining – Mainly clustering & nodes-and-links • Software Engineering – not really text, but has some similar properties
  • 157.
    158 Why Text isTough • Text is not pre-attentive • Text consists of abstract concepts – which are difficult to visualize • Text represents similar concepts in many different ways – space ship, flying saucer, UFO, figment of imagination • Text has very high dimensionality – Tens or hundreds of thousands of features – Many subsets can be combined together
  • 158.
    159 Why Text isTough As the man walks the cavorting dog, thoughts arrive unbidden of the previous spring, so unlike this one, in which walking was marching and dogs were baleful sentinals outside unjust halls. How do we visualize this?
  • 159.
    160 Why Text isTough • Abstract concepts are difficult to visualize • Combinations of abstract concepts are even more difficult to visualize – time – shades of meaning – social and psychological concepts – causal relationships
  • 160.
    161 Why Text isTough • Language only hints at meaning • Most meaning of text lies within our minds and common understanding – “How much is that doggy in the window?” • how much: social system of barter and trade (not the size of the dog) • “doggy” implies childlike, plaintive, probably cannot do the purchasing on their own • “in the window” implies behind a store window, not really inside a window, requires notion of window shopping
  • 161.
    162 Why Text isEasy • Text is highly redundant – When you have lots of it – Pretty much any simple technique can pull out phrases that seem to characterize a document • Instant summary: – Extract the most frequent words from a text – Remove the most common English words • People are very good at attributing meaning to lists of otherwise unrelated words
  • 162.
    163 Guess the Text: 10PEOPLE 10 ALL 9 STATES 9 LAWS 8 NEW 7 RIGHT 7 GEORGE 6 WILLIAM 6 THOMAS 6 JOHN 6 GOVERNMENT 5 TIME 5 POWERS 5 COLONIES 4 LARGE 4 INDEPENDENT 4 FREE 4 DECLARATION 4 ASSENT 3 WORLD 3 WAR 3 USURPATIONS 3 UNITED 3 SEAS 3 RIGHTS
  • 163.
    166 Visualization of TextCollections • How to summarize the contents of hundreds, thousands, tens of thousands of texts? • Many have proposed clustering the words and showing points of light in a 2D or 3D space. • Examples – Showing docs/collections as a word space – Showing retrieval results as points in word space
  • 164.
  • 165.
  • 166.
  • 167.
  • 168.
    171 Example: Themescapes (Wise etal. 95) Themescapes (Wise et al. 95)
  • 169.
  • 170.
  • 171.
    175 Clustering for Collection Overviews •Two main steps – cluster the documents according to the words they have in common – map the cluster representation onto a (interactive) 2D or 3D representation • Since text has tens of thousands of features – the mapping to 2D loses a tremendous amount of information – only very coarse themes are detected
  • 172.
    177 Scatter/Gather Cutting, Pedersen, Tukey& Karger 92, 93, Hearst & Pedersen 95
  • 173.
    181 How Useful isCollection Cluster Visualization for Search? Three studies find negative results
  • 174.
    182 Study 1 Kleiboemer, Lazear,and Pedersen. Tailoring a retrieval system for naive users. In Proc. of the 5th Annual Symposium on Document Analysis and Information Retrieval, 1996 • This study compared – a system with 2D graphical clusters – a system with 3D graphical clusters – a system that shows textual clusters • Novice users • Only textual clusters were helpful (and they were difficult to use well)
  • 175.
    183 Study 2: KohonenFeature Maps H. Chen, A. Houston, R. Sewell, and B. Schatz, JASIS 49(7) • Comparison: Kohonen Map and Yahoo • Task: – “Window shop” for interesting home page – Repeat with other interface • Results: – Starting with map could repeat in Yahoo (8/11) – Starting with Yahoo unable to repeat in map (2/14)
  • 176.
    184 Study 2 (cont.) •Participants liked: – Correspondence of region size to # documents – Overview (but also wanted zoom) – Ease of jumping from one topic to another – Multiple routes to topics – Use of category and subcategory labels
  • 177.
    185 Study 2 (cont.) •Participants wanted: – hierarchical organization – other ordering of concepts (alphabetical) – integration of browsing and search – correspondence of color to meaning – more meaningful labels – labels at same level of abstraction – fit more labels in the given space – combined keyword and category search – multiple category assignment (sports+entertain)
  • 178.
    186 Study 3: NIRVE NIRVEInterface by Cugini et al. 96. Each rectangle is a cluster. Larger clusters closer to the “pole”. Similar clusters near one another. Opening a cluster causes a projection that shows the titles.
  • 179.
    187 Study 3 Visualization ofsearch results: a comparative evaluation of text, 2D, and 3D interfaces Sebrechts, Cugini, Laskowski, Vasilakis and Miller, Proceedings of SIGIR 99, Berkeley, CA, 1999. • This study compared: – 3D graphical clusters – 2D graphical clusters – textual clusters • 15 participants, between-subject design • Tasks – Locate a particular document – Locate and mark a particular document – Locate a previously marked document – Locate all clusters that discuss some topic – List more frequently represented topics
  • 180.
    188 Study 3 • Results(time to locate targets) – Text clusters fastest – 2D next – 3D last – With practice (6 sessions) 2D neared text results; 3D still slower – Computer experts were just as fast with 3D • Certain tasks equally fast with 2D & text – Find particular cluster – Find an already-marked document • But anything involving text (e.g., find title) much faster with text. – Spatial location rotated, so users lost context • Helpful viz features – Color coding (helped text too) – Relative vertical locations
  • 181.
    189 Summary: Visualizing Clusters •Huge 2D maps may be inappropriate focus for information retrieval – cannot see what the documents are about – space is difficult to browse for IR purposes – (tough to visualize abstract concepts) • Perhaps more suited for pattern discovery and gist-like overviews
  • 182.
    190 IR Infovis Meta-Analysis (Empiricalstudies of information visualization: a meta-analysis, Chen & Yu IJHCS 53(5),2000) • Goal – Find invariant underlying relations suggested collectively by empirical findings from many different studies • Procedure – Examine the literature of empirical infoviz studies • 35 studies between 1991 and 2000 • 27 focused on information retrieval tasks • But due to wide differences in the conduct of the studies and the reporting of statistics, could use only 6 studies
  • 183.
    191 IR Infovis Meta-Analysis (Empiricalstudies of information visualization: a meta-analysis, Chen & Yu IJHCS 53(5),2000) • Conclusions: – IR Infoviz studies not reported in a standard format – Individual cognitive differences had the largest effect • Especially on accuracy • Somewhat on efficiency – Holding cognitive abilities constant, users did better with simpler visual-spatial interfaces – The combined effect of visualization is not statistically significant
  • 184.
    192 So What Works? •Yee, K-P et al., Faceted Metadata for Image Search and Browsing, to appear in CHI 2003. Hearst, M, et al.; Chapter 10 of Modern Information Retrieval, Baeza-Yates & Ribiero-Neto (Eds). • Color highlighting of query terms in results listings • Sorting of search results according to important criteria (date, author) • Grouping of results according to well-organized category labels. – Cha-cha – Flamenco • Only if highly accurate: – Spelling correction/suggestions – Simple relevance feedback (more-like-this) – Certain types of term expansion • Note: most don’t benefit from visualization!
  • 185.
    193 Cha-Cha • Chen, M.,Hearst, M., Hong, J., and Lin, J. Cha-Cha: A System for Organizing Intranet Search Results in the Proceedings of the 2nd USENIX Symposium on Internet Technologies and SYSTEMS (USITS), Boulder, CO, October 11-14, 1999
  • 186.
    194 Teoma: appears tocombine categories and clusters (this version before it was bought by askjeeves)
  • 187.
  • 188.
    196 Cat-a-Cone Marti Hearst andChandu Karadi, Cat-a- Cone: An Interactive Interface for Specifying Searches and Viewing Retrieval Results using a Large Category Hierarchy Proceedings of the 20th Annual International ACM/SIGIR Conference Philadelphia, PA, July 1997
  • 189.
    197 Better to reducethe viz • Flamenco – allows users to steer through the category space • Uses – Dynamically-generated hypertext – Color for distinguishing and grouping – Careful layout and font choices • Focused first on the users’ needs
  • 190.
  • 191.
  • 192.
    200 Slide by Woodruff& Rosenholtz Using Thumbnails to Search the Web A. Woodruff, R. Rosenholtz, J. Morrison, A. Faulring, & P. Pirolli, A comparison on the use of text summaries, plain thumbnails, andenhanced thumbnails for web search tasks. JASIST, 53(2), 172- 185, 2002.; A. Woodruff, A. Faulring, R. Rosenholtz, J. Morrison, & P. Pirolli,Using thumbnails to search the web. SIGCHI 2001 Design Goals – Enhance features that help the user decide whether document is relevant to their query • Emphasize text that is relevant to query – Text callouts • Enlarge (make readable) text that might be helpful in assessing page – Enlarge headers
  • 193.
    201 Slide by Woodruff& Rosenholtz Text and Image Summaries • Text summaries – Lots of abstract, semantic information • Image summaries (plain thumbnails) – Layout, genre information – Gist extraction faster than with text • Benefits are complementary • Create textually-enhanced thumbnails that leverage the advantages of both text summaries and plain thumbnails
  • 194.
    202 Slide by Woodruff& Rosenholtz Putting Callouts in a Separate Visual Layer • Transparency • Occlusion Junctions indicate the occurrence of these events.
  • 195.
    203 Slide by Woodruff& Rosenholtz Design Issues: • Color Management – Problems: Callouts need to be both readable and draw attention – Solution: Desaturate the background image, and use a visual search model to choose appropriate colors – Colors look like those in highlighter pens • Resizing of Text – Problem: We want to make certain text elements readable, but not necessarily draw attention to them – Solution: Modify the HTML before rendering the thumbnail
  • 196.
    204 Slide by Woodruff& Rosenholtz Examples
  • 197.
    205 Slide by Woodruff& Rosenholtz Tasks • Criteria: tasks that… – Are representative of common queries – Have result sets with different characteristics – Vary in the number of correct answers • 4 types of tasks Picture: “Find a picture of a giraffe in the wild.” Homepage: “Find Kern Holoman’s homepage.” Side-effects: “Find at least three side effects of halcion.” E-commerce: “Find an e-commerce site where you can buy a DVD player. Identify the price in dollars.”
  • 198.
    206 Slide by Woodruff& Rosenholtz Conditions • Text summary – Page title – Extracted text with query terms in bold – URL • Plain thumbnail • Enhanced thumbnail – Readable H1, H2 tags – Highlighted callouts of query terms – Reduced contrast level in thumbnail
  • 199.
    207 Slide by Woodruff& Rosenholtz Collections of Summaries • 100 results in random order Approximately same number of each summary type on a page
  • 200.
    208 Slide by Woodruff& Rosenholtz Method • Procedure – 6 practice tasks – 3 questions for each of the 4 task types • e.g., each participant would do one E-commerce question using text, one E-commerce question using plain thumbnails, and one E-commerce question using enhanced thumbnails – Questions blocked by type of summary – WebLogger recorded user actions during browsing – Semi-structured interview • Participants – 12 members of the PARC community Entire process took about 75 minutes 18 questions, with 100 query results each
  • 201.
    209 Slide by Woodruff& Rosenholtz Results • Average total search times, by task: – Picture: 61 secs – Homepage: 80 secs – E-commerce: 64 secs – Side effects: 128 secs • Results pooled across all tasks: – Subjects searched 20 seconds faster with enhanced thumbnails than with plain – Subjects searched 30 seconds faster with enhanced thumbnails than with text summaries – Mean search time overall was 83 seconds
  • 202.
    210 Slide by Woodruff& Rosenholtz Results Normalized total search time (s)
  • 203.
    211 Slide by Woodruff& Rosenholtz Results: User Responses • Participants preferred enhanced thumbnails – 7/12 preferred overall – 5/12 preferred for certain task types • Enhanced thumbnails are intuitive and less work than text or plain thumbnails – One subject said searching for information with text summaries did not seem hard until he used the enhanced thumbnails. • Many participants reported using genre information, cues from the callouts, the relationship between search terms, etc.
  • 204.
    214 Agenda • Introduction • VisualPrinciples • What Works? • Visualization in Analysis & Problem Solving • Visualizing Documents & Search • Comparing Visualization Techniques • Design Exercise • Wrap-Up
  • 205.
  • 206.
    216 Comparing 3 CommercialSystems Alfred Kobsa, An Empirical Comparison of Three Commercial Information Visualization Systems, INFOVIS'01.
  • 207.
    217 Comparing 3 CommercialSystems Eureka (InXight)
  • 208.
    218 Comparing 3 CommercialSystems InfoZoom (HumanIT)
  • 209.
  • 210.
    220 Slide by AlfredKobsa Infozoom Overview •Presents data in three different views. •Wide view shows data set in a table format. •Compressed view packs the data set horizontally to fit the window width. •Overview mode has all attributes in ascending or descending order and independent of each other.
  • 211.
  • 212.
    222 Slide by AlfredKobsa InfoZoom Overview View
  • 213.
  • 214.
  • 215.
    225 Slide by KunalGarach •Multidimensional data: three databases were used •Anonymized data from a web based dating service (60 records, 27 variables) •Technical data of cars sold in 1970 – 82 (406 records, 10 variables) •Data on the concentration of heavy metals in Sweden (2298 records, 14 variables) Datasets
  • 216.
    226 Sample Questions • Domore women than men want their partners to have a higher education? • What proportion of the men live in California? • Do all people who think the bar is a good place to meet a mate also believe in love at first site? • Do heavier cars have more horsepower? • Which manufacturer produced the most cars in 1980? • Is there a relationship between the displacement and acceleration of a vehicle?
  • 217.
    227 Slide by KunalGarach Experiment Design • The experimenters generated 26 tasks from all three data sets. • 83 participants. Between-subjects design. •Each was given one visualization system and all three data sets. • Type of visualization system was the independent variable between them. • 30 mins were given to solve the tasks of each data set i.e 26 tasks in 90 mins.
  • 218.
    228 Slide by KunalGarach Overall Results • Mean task completion times: • Infozoom users: 80 secs • Spotfire users: 107 secs • Eureka users: 110 secs • Answer correctness: • Infozoom users: 68% • Spotfire users: 75% • Eureka users: 71% •Not a time-error tradeoff •Spotfire more accurate only 6 questions
  • 219.
    229 Slide by KunalGarach Eureka - problems • Hidden labels: Labels are vertically aligned, max 20 dimensions • 3+ Attributes: Problems with queries involving three or more attributes • Correlation problems: Some participants had trouble answering questions correctly that involved correlations between two attributes.
  • 220.
    230 Slide by KunalGarach Spotfire - problems • Cognitive setup costs: Takes participants considerable time to decide on the right representation and to correctly set the coordinates and parameters. • Biased by scatterplot default: Though powerful, many problems cannot be solved (well) with it.
  • 221.
    231 Slide by KunalGarach Infozoom - problems • Erroneous Correlations • Overview mode has all attributes sorted independent of each other • Narrow row height in compressed view • Participants did not use row expansion and scatterplot charting function which shows correlations more accurately
  • 222.
    232 Geographic Questions • Spotfireshould have done better on these •Which part of the country has the most copper •Is there a relationship between the concentration of vanadin and that of zinc? •Is there a low-level chrome area that is high in vanadim •Spotfire was only better only for the last question (out of 6 geographic ones)
  • 223.
    233 Discussion •Many studies ofthis kind use relatively simple tasks that mirror the strengths of the system •Find the one object with the maximum value for a property •Count how many of certain attributes there are •This study looked at more complex, realistic, and varied questions.
  • 224.
    234 Discussion •Success of avisualization system depends on many factors: • Properties supplied •Spotfire doesn’t visualize as many dimensions simultaneously •Operations •Zooming easy in InfoZoom; allows for drill-down as well •Zooming in Eureka causes context to be lost •Column view in Eureka makes labels hard to see
  • 225.
  • 226.
    236 Slide by CraigRixford Comparing Tree Views • T. Barlow and P. Neville, Comparison of 2D Visualizations of Hierarchies, INFOVIS’01. • Problem – Organization Chart is de facto standard for visualizing decision trees. Is there a better compact view of the tree for the overview window? • Solution – Two usability studies to determine which tree works best.
237
Goal: Compact View of Tools
T. Barlow and P. Neville, Comparison of 2D Visualizations of Hierarchies, INFOVIS ’01
238
Decision Trees
Slide by Craig Rixford
• Each split constitutes a rule or variable in the predictive model
• Begin splitting into nodes
• Often hundreds of leaves (see the sketch below)
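As a reminder of the structure these views must display, here is a minimal sketch of a decision tree as a data structure: internal nodes carry splitting rules, leaves collect the records that reach them, and real data-mining trees can have hundreds of leaves. The Node type, field names, and example rules are illustrative only, not from the paper.

# Minimal sketch of a decision tree: internal nodes hold splitting rules,
# leaves hold the records that reach them. Names and rules are illustrative.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    rule: Optional[str] = None                 # e.g. "weight < 3000" on a branch
    size: int = 0                              # number of records reaching this node
    children: List["Node"] = field(default_factory=list)

def leaves(node: Node) -> List[Node]:
    """Collect the leaf nodes; real data-mining trees often have hundreds."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]

tree = Node(size=100, children=[
    Node("weight < 3000", 40, [Node(size=25), Node(size=15)]),
    Node("weight >= 3000", 60),
])
print(len(leaves(tree)), "leaves")             # -> 3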
239
Decision Trees – What Makes a Good Visualization?
Slide by Craig Rixford
• Uses
– For novices: helps them understand models
– For experts: initial evaluation of decision trees without looking at models
• Criteria for usability in the study
– Ease of interpretation of topology (parent/child/sibling relations)
– Comparison of node size
– User preference
240
Different Views Examined in the Study
Slide by Craig Rixford
• Org Chart
• Tree Ring
• Icicle Plot
• TreeMap
(A minimal icicle-plot layout sketch follows below.)
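To make the comparison concrete, below is a minimal sketch of one common way to lay out an icicle plot: each node gets a horizontal band proportional to its leaf count, and depth maps to the vertical position. This is a generic layout convention, not necessarily the exact algorithm used in the paper; the Node structure and function names are assumptions.

# Minimal icicle-plot layout sketch: horizontal extent proportional to leaf
# count, vertical position given by depth. A common convention, not the paper's.
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)

def leaf_count(n: Node) -> int:
    return 1 if not n.children else sum(leaf_count(c) for c in n.children)

def icicle(n: Node, x0: float = 0.0, x1: float = 1.0, depth: int = 0,
           out: Optional[Dict[str, Tuple[float, float, int]]] = None):
    """Assign each node an (x0, x1, depth) band; children split the parent's span."""
    if out is None:
        out = {}
    out[n.name] = (x0, x1, depth)
    total = leaf_count(n)
    x = x0
    for c in n.children:
        width = (x1 - x0) * leaf_count(c) / total
        icicle(c, x, x + width, depth + 1, out)
        x += width
    return out

root = Node("root", [Node("a", [Node("a1"), Node("a2")]), Node("b")])
for name, band in icicle(root).items():
    print(name, band)

A tree ring (sunburst) layout uses the same recursion with angular extents instead of x-extents.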
241
Usability Test 1
Slide by Craig Rixford
• Users:
– 15 colleagues familiar with the org chart but not the others
• Tasks
– Is the tree binary or n-ary?
– Is the tree balanced or unbalanced?
– Find the deepest common ancestor of two nodes
– How many levels?
– Find the three largest leaves (excluding org chart)
• Data: created 8 trees for analysis
• Study design
– Randomized order of tasks
– 4x5 design (almost)
– Timed each task from appearance on screen until spacebar tap
242
Results
Slide by Craig Rixford
• Response time
– TreeMap slowest; no statistical difference between the others
• Response accuracy
– No significant difference
• User preference
– Preferred the icicle plot and org chart (faster)
– Disliked the tree map
243
Discussion
Slide by Craig Rixford
• Org chart served as the benchmark
• Icicle plot favored among the others
– Hypothesis: same left-to-right / top-to-bottom structure
• Tree Ring did well
• TreeMap suffered from poor accuracy
– Offsetting of rectangles was required (which is needed for selection)
244
Usability Test II: Tree Implementation
Slide by Craig Rixford
• Three views:
– TreeMap eliminated from this round
• Tasks
– Node description (four versions): select the nodes or leaves that meet certain criteria
– Node analysis: memorize a highlighted node, then find it again after the tree is redrawn in a different position
245
Results
Slide by Craig Rixford
• Tree rings were slower for description tasks but fast and accurate for memory tasks
• Perhaps due to their unique geometric forms / spatial cues
246
Conclusions
Slide by Craig Rixford
• TreeMap not useful for this type of task
• Org Chart / Icicle seem to be best overall
• Tree Ring has merits for certain tasks
• Icicle chosen for implementation
– Best design, considering that the org chart could not be used for node-size tasks
• However:
– The tests did not seem to use trees as large as those described as typical of data mining
248
Text-Based Chat
Slide by Maggie Law & Vivien Petras
249
Chat Circles
Slide by Maggie Law & Vivien Petras
Fernanda Viegas and Judith Donath, Chat Circles, Proceedings of CHI'99.
250
Chat Circles
Slide by Maggie Law & Vivien Petras
• “Chat Circles is a graphical interface for synchronous communication that uses abstract shapes to convey identity and activity.”
• Each participant appears as a colored circle, accompanied by the user name
• The location of a circle also helps identify its participant (important when many users have similar colors)
• A participant’s circle becomes larger when a posting occurs (the circle adapts to the text length)
• The circle appears bright when a posting occurs
• Circles of inactive users fade into the background
(A minimal sketch of these display rules follows below.)
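The display rules above can be summarized in a few lines. Below is a minimal sketch, not the authors' implementation; the constants (base radius, fade time, growth factor) and names are assumptions chosen only to illustrate the grow-on-post / fade-when-idle behavior.

# Minimal sketch of the Chat Circles display rules (assumed constants/names):
# a circle grows with the length of a new posting, is bright while active,
# and shrinks and fades back toward the background as the user goes idle.
from dataclasses import dataclass

BASE_RADIUS = 10.0        # resting size of an idle participant's circle
FADE_SECONDS = 30.0       # time for an idle circle to fade into the background

@dataclass
class Participant:
    name: str
    x: float               # position in the single shared room
    y: float
    post_radius: float = BASE_RADIUS
    last_post: float = -1e9

    def post(self, text: str, now: float) -> None:
        """A new message: the circle expands with the text length."""
        self.post_radius = BASE_RADIUS + 0.3 * min(len(text), 200)
        self.last_post = now

    def appearance(self, now: float):
        """Interpolate from the post-time size/brightness back to the idle state."""
        t = min((now - self.last_post) / FADE_SECONDS, 1.0)
        radius = self.post_radius + (BASE_RADIUS - self.post_radius) * t
        brightness = 1.0 - 0.7 * t      # fades toward the background, never fully away
        return radius, brightness

p = Participant("ana", x=120, y=80)
p.post("hello everyone, long time no see!", now=0.0)
print(p.appearance(now=0.0))       # large and bright right after posting
print(p.appearance(now=45.0))      # back to base size and faded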
251
Chat Circles – Conversational Groupings
Slide by Maggie Law & Vivien Petras
• There is only ONE room in Chat Circles
• Groupings are achieved by moving closer to other participants
• At any time, a participant can view all other participants
• A participant can also detect interesting conversations in different areas of the room by looking at how many circles are gathered and how often circles become larger
• The overview panel in Chat Circles II is a nice example of focus + context
252
Chat Circles History
Slide by Maggie Law & Vivien Petras
253
History Log Patterns
Slide by Maggie Law & Vivien Petras
+ Easy to see “lurkers”
+ Sequence and size of messages quickly visible
- Not very scalable
254
History Log Patterns
Slide by Maggie Law & Vivien Petras
+/- User-centric: only 1 point of view represented
- Impossible to see all the text at once – requires individual mouse rollovers
- Easy to see “out of range” conversations – but why would you want to?
255
Agenda
• Introduction
• Visual Principles
• What Works?
• Visualization in Analysis & Problem Solving
• Visualizing Documents & Search
• Comparing Visualization Techniques
• Design Exercise
• Wrap-Up
257
Design Exercise
• BreakingStory (Reffel, Fitzpatrick, Ayedelott; SIMS final project, at CHI 2003)
– Create an application that supplies a visualization for trends over time in web-based news. The primary purpose is to provide an overview, but it should also be possible to view text from individual news sources on specific days. Its goal is to inform, inspire, and enlighten, and also to make people want to look more deeply at the news. (A sketch of the underlying trend data appears below.)
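One plausible starting point for this exercise is the data layer: a table of how often each tracked topic appears per day, which can then drive an overview view. The sketch below illustrates only that counting step; the topic list and headlines are invented, and this is not part of the BreakingStory project itself.

# Minimal sketch of the data behind a news-trend overview: count how often
# each tracked topic is mentioned per day in a stream of (date, headline) pairs.
from collections import Counter, defaultdict

topics = ["election", "weather", "sports"]              # illustrative topics
articles = [                                            # invented headlines
    ("2003-04-01", "Election results delayed by weather"),
    ("2003-04-01", "Local sports roundup"),
    ("2003-04-02", "Election recount ordered"),
]

daily = defaultdict(Counter)                            # date -> topic mention counts
for date, headline in articles:
    text = headline.lower()
    for topic in topics:
        if topic in text:
            daily[date][topic] += 1

for date in sorted(daily):
    print(date, dict(daily[date]))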
262
Another Approach: ThemeRiver
• S. Havre, B. Hetzler, L. Nowell, "ThemeRiver: Visualizing Theme Changes over Time," Proc. IEEE Symposium on Information Visualization, 2000
(A minimal sketch of the stacking idea follows below.)
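ThemeRiver stacks per-period counts for each theme into horizontal bands around a central axis, so the total width of the "river" reflects overall volume at each point in time. Below is a minimal sketch of that stacking step with invented counts; the smoothing and interpolation of the real system are omitted.

# Minimal ThemeRiver-style stacking sketch: per-day counts for each theme are
# stacked symmetrically around a horizontal midline. Data values are invented.
counts = {                        # theme -> count per day
    "election": [4, 9, 15, 7],
    "weather":  [2, 3, 2, 6],
    "sports":   [5, 4, 6, 5],
}
num_days = len(next(iter(counts.values())))

bands = {}                        # theme -> list of (bottom, top) per day
for day in range(num_days):
    total = sum(series[day] for series in counts.values())
    y = -total / 2.0              # center the river on y = 0
    for theme, series in counts.items():
        bands.setdefault(theme, []).append((y, y + series[day]))
        y += series[day]

for theme, band in bands.items():
    print(theme, band)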
264
Key Questions to Ask about a Viz
1. Is it for analysis or presentation?
2. What does it teach/show/elucidate?
3. What is the key contribution?
4. What are some compelling, useful examples?
5. Could it have been done more simply?
6. Have usability studies been done? What do they show?
265
Holistic Design Goals for Information Visualization
– Tailor to the application and the domain
– Create highly interactive and integrated systems
– Embed the visualization within a larger application
– Provide alternative views
266
Visualization with a Light Touch: Orbitz.com
267
Visualization with a Light Touch: Orbitz.com
268
Visualization with a Light Touch: Orbitz.com
269
Visualization with a Light Touch: Orbitz.com
270
Visualization with a Light Touch: Orbitz.com
271
For More Information
• My course:
– http://www.sims.berkeley.edu/courses/is247/s02/Lectures.html
• Atlas of Cyberspaces:
– http://www.geog.ucl.ac.uk/casa/martin/atlas/atlas.html
• Gallery of Data Visualization: The Best and Worst of Statistical Graphics
– http://www.math.yorku.ca/SCS/Gallery/
• Tamara Munzner’s collection:
– http://graphics.stanford.edu/courses/cs348c-96-fall/resources.html