information visualisation
         Alan Dix
example
                              Map your moves
where New Yorkers move (10 years data)

distorted map

circle = moves for
         one zip code

red – out
blue – in

overlaid

http://moritz.stefaner.eu/projects/map%20your%20moves/
example
                              Map your moves




interactive:
  selecting a zip code
  shows where
  movements to/from
also hiding:
   what you don’t show
   also important
http://moritz.stefaner.eu/projects/map%20your%20moves/
what is visualisation?

   making data easier to understand
   using direct sensory experience

            especially visual!

but can have aural, tactile ‘visualisation’
direct sensory experience

 N.B. sensory rather than linguistic

     sort of right/left brain stuff!

but ... may include text, numbers, etc.
visualising in text
        alignment - numbers

think purpose!        532.56
                       179.3
                     256.317
which is biggest?
                          15
                      73.948
                        1035
                       3.142
                    497.6256
visualising in text
         alignment - numbers

visually:                    627.865
 long number = big number      1.005763
                             382.583
                            2502.56
align decimal points
                             432.935
or right align integers        2.0175
                             652.87
                              56.34
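
The alignment rule on this slide can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `align_decimal` is made up here, and the numbers are passed as strings so that they print exactly as written on the previous slide.

```python
def align_decimal(values):
    """Pad number strings on the left so that their decimal points line up.

    Integers (no decimal point) are right-aligned in the same column as the
    whole-number parts, as the slide suggests.
    """
    parts = [str(v).partition(".") for v in values]
    width_whole = max(len(whole) for whole, _, _ in parts)
    lines = []
    for whole, dot, frac in parts:
        lines.append(whole.rjust(width_whole) + dot + frac)
    return lines


numbers = ["532.56", "179.3", "256.317", "15", "73.948", "1035", "3.142", "497.6256"]
for line in align_decimal(numbers):
    print(line)
```

Once printed this way, the longest whole-number part (1035) sticks out to the left, so "which is biggest?" can be answered at a glance.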
visualising in text
                    TableLens
like a ‘spreadsheet’ ...
... but some rows squashed to one pixel high
numbers become small histogram bars
visualising in text
                        TableLens
N.B. also an example of focus+context


       focus
   some rows
   in full detail




      context
   whole dataset
 can also be seen
    in overview
especially visual

 visual cortex is 50% of the brain!

  ... but disability, context, etc.,
may mean non-visual forms needed
why visualisation?
     for the data analyst
scientist, statistician, possibly you!


   for the data consumer
audience, client, reader, end-user
why visualisation?


   understanding
consumer           focus on well
                   understood, simple
                   representations
   rhetoric
why visualisation?
   understanding
consumer
   rhetoric
     to help others see
     what the analyst
     has already seen
infographics
data journalism




http://www.guardian.co.uk/news/datablog/2010/oct/18/deficit-debt-government-borrowing-data
why visualisation?

                              the business plan
                                 hockey stick!
  understanding
consumer
  rhetoric
     to persuade readers
     of particular point
     (and not others!)
lies, damn lies,
and graphs
why visualisation?


   understanding
analyst            powerful, often novel
                   visualisations,
                   training possible
   exploration
why visualisation?
   understanding
consumer
   exploration
     to make particular aspects
     of data clearer
       – confirming hypotheses
       – noticing exceptions
e.g. box plots in
graph from: Measurement of the neutrino velocity with the OPERA detector in the CNGS
why visualisation?
   understanding
consumer
   exploration
     to find new things
     that have not been
     previously considered
seeking the unknown
avoiding the obvious
wary of happenstance
a brief history of visualisation

   from 2500 BC to 2012
a brief history ...
static visualisation
     – the first 2500 years

interactive visualisation
     – the glorious ’90s

and now?
     – web and mass data
     – visual analytics
static visualisation
             from clay tablets to Tufte

Mesopotamian tablets
10th Century time line
1855 Paris-Lyon train timetable
Excel etc.
static visualisation
read Tufte’s books ...
  – The Visual Display of Quantitative Information
  – Envisioning Information
  – Visual Explanations
interactive visualisation
early 1990s growing graphics power
  – 3D graphics
  – complex visualisations
  – real-time interaction possible
... and now
loads of data

web visualisation

data journalism




http://www.guardian.co.uk/news/datablog/2010/oct/18/deficit-debt-government-borrowing-data
http://www-958.ibm.com/software/data/cognos/manyeyes/
and visual analytics!
visualisation in context
plain visualisation




data             visualisation
visual analytics




data                 visualisation

        processing
                            direct
                         interaction
the big picture
                      organisational
            action
                     social & political
                         context



world                      ?
                         decision




data                   visualisation

        processing
                               direct
                            interaction
designing visualisation
choosing representations
visualisation factors
  – visual ‘affordances’
    • what we can see
  – objectives, goals and tasks
    • what we need to see
  – aesthetics
    • what we like to see
trade-off
    visualisation factors
      – visual affordances
      – objectives, goals and tasks
      – aesthetics

 static representation ⇒ trade-off
 interaction reduces trade-off
   – stacking histogram, overview vs. detail, etc.
relaxing constraints
normal stacked histogram

good for:
   – overall trend
   – relative proportions
   – trend in bottom
     category
                         ?
bad for others
   – what is happening
     to bananas?
make your own (iii)
          relaxing constraints
interactive stacking histograms ...
   or ... dancing histograms

normal histogram
   except ...
hover over cell
   to reveal detail
click on legend
   to change
   baseline
                                      demonstration
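
A minimal sketch of the "change baseline" interaction, in Python. The slide only says that clicking the legend changes the baseline; the assumption made here is that the clicked category moves to the bottom of each stack and the others keep their legend order. The function name and the fruit values are hypothetical.

```python
def stack_with_baseline(column, order, baseline):
    """Stack one histogram column with the `baseline` category at the bottom.

    column: dict mapping category -> value for this column
    order:  categories in legend order
    Returns a list of (category, bottom, top) tuples, bottom-most first.
    """
    # Assumption: the chosen category moves to the bottom, the rest keep order.
    reordered = [baseline] + [c for c in order if c != baseline]
    segments = []
    bottom = 0
    for category in reordered:
        top = bottom + column[category]
        segments.append((category, bottom, top))
        bottom = top
    return segments


column = {"apples": 3, "bananas": 2, "cherries": 5}
legend = ["apples", "bananas", "cherries"]
print(stack_with_baseline(column, legend, "bananas"))
```

With bananas on the baseline, every banana segment starts at zero, so its trend across columns can be read as easily as the bottom category in a normal stacked histogram.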
kinds of interaction
highlighting and focus
drill down and hyperlinks
overview and context
changing parameters
changing representations
temporal fusion
Shneiderman’s
                           visualisation mantra
                                   overview first,
                                  zoom and filter,
                              then details on demand
[figure: FilmFinder screenshot, annotated: overview; zoom and filter using sliders; details on demand]




http://www.sapdesignguild.org/community/book_people/visualization/controls/FilmFinder.htm
classic visualisations
displaying groups/clusters
numeric attributes
   – use average
     or region
categorical attributes
   – show values of attributes common to cluster
text, images, sound
   – no sensible ‘average’ to display
   – use typical documents/images
   – central to cluster ...
     or spread within cluster
using clusters
              the scatter/gather browser
take a collection of documents
scatter:
 – group into fixed number of clusters
 – displays clusters to user
gather:
 – user selects one or
   more clusters
 – system collects
   these together
scatter:
 – system clusters this
   new collection
   ...
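
The scatter/gather loop above can be sketched as follows. This is a toy under stated assumptions, not the clustering used by the scatter/gather browser: documents are represented as sets of keywords, the first k documents (in name order) act as cluster seeds, and each document joins the seed it overlaps most with (Jaccard similarity). All names and data are hypothetical.

```python
def jaccard(a, b):
    """Overlap between two keyword sets, from 0.0 (disjoint) to 1.0 (equal)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def scatter(docs, k):
    """Group documents into (at most) k clusters. Stand-in clustering only."""
    names = sorted(docs)
    seeds = names[:k]
    clusters = [[] for _ in seeds]
    for name in names:
        best = max(range(len(seeds)),
                   key=lambda i: jaccard(docs[name], docs[seeds[i]]))
        clusters[best].append(name)
    return clusters


def gather(docs, clusters, chosen):
    """Collect the documents of the user-selected clusters into a new collection."""
    selected = {name for i in chosen for name in clusters[i]}
    return {name: docs[name] for name in selected}


docs = {
    "a": {"tree", "map"},
    "b": {"query", "facet"},
    "c": {"tree", "cone"},
    "d": {"facet", "browse"},
    "e": {"map", "tree", "space"},
}
clusters = scatter(docs, 2)          # scatter: show clusters to the user
narrowed = gather(docs, clusters, [0])  # gather: user picks cluster 0
print(scatter(narrowed, 2))          # scatter again on the smaller collection
```

Each round works on a smaller collection, so the same fixed number of clusters gives progressively finer distinctions.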
displaying clusters
       scatter-gather browser

keywords (created by clustering algorithm)




                                  ‘typical’ documents
                             (with many cluster keywords)
hierarchical data
hierarchies are everywhere!
  – file systems
  – organisation charts
  – taxonomies
  – classification trees
  – ontologies
  – xml
problems with trees ...
hard to fit text labels                         width grows rapidly




                  overlapping low level nodes
use 3D?
cone tree
  – use stacked circles of subtrees
good use of 3D
still have occlusion ...
but ‘normal’ in 3D

shadows help to
disambiguate

but text labels
difficult
cone trees → cam trees
horizontal layout makes labels readable
small things matter!
dissect 2D space - treemaps

takes tree of items with some ‘size’
   – e.g. file hierarchy, financial accounts
alternately divides space horizontally/vertically for
each level, proportionate to total size

                    x
                        x/a – 4
                        x/b – 2
                    y
                        y/c – 1
                        y/d – 1
                        y/e – 1

[figure: treemap of this tree; x [6] and y [3] side by side, x divided into x/a [4] and x/b [2], y divided into y/c [1], y/d [1] and y/e [1]]

  http://www.cs.umd.edu/hcil/treemap-history/
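
A minimal sketch of the alternating ("slice-and-dice") division described on this slide, using the x/y example tree. The tuple representation of the tree, the function names and the 9 by 6 drawing area are assumptions chosen for illustration; later treemap variants use different layout rules.

```python
def size(node):
    """Size of a node: its own value for a leaf, else the sum of its children."""
    _, payload = node
    if isinstance(payload, list):
        return sum(size(child) for child in payload)
    return payload


def treemap(node, x, y, w, h, horizontal=True, out=None):
    """Return {leaf name: (x, y, w, h)}, alternating split direction per level."""
    if out is None:
        out = {}
    name, payload = node
    if not isinstance(payload, list):
        out[name] = (x, y, w, h)
        return out
    total = size(node)
    used = 0  # combined size of the children already placed
    for child in payload:
        s = size(child)
        if horizontal:   # divide the width between children
            treemap(child, x + w * used / total, y, w * s / total, h, False, out)
        else:            # divide the height between children
            treemap(child, x, y + h * used / total, w, h * s / total, True, out)
        used += s
    return out


tree = ("root", [("x", [("x/a", 4), ("x/b", 2)]),
                 ("y", [("y/c", 1), ("y/d", 1), ("y/e", 1)])])
for name, rect in treemap(tree, 0, 0, 9, 6).items():
    print(name, rect)
```

Each leaf's rectangle area is proportional to its size: here x/a gets 6 × 4 = 24 units of the 54 available, matching its share 4/9.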
treemaps (2)
later variants improved the shape and appearance
of maps
treemaps (3)
plus algorithms for vast data sets, for thumbnail
images, etc. etc.
distort space ...
tree branching factor b:
   – number of nodes at depth d = b^d
Euclidean 2D space:
   – amount of space at radius r = 2πr
   – not enough space!
non-Euclidean hyperbolic space:
   – exponential space at radius r

hyperbolic browser
   – lays out tree in hyperbolic space
   – then uses 2D representation of hyperbolic space
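
The space argument can be checked numerically. The sketch below compares the number of nodes at depth d with the circumference of a circle of radius d in the Euclidean plane (2πr) and in the hyperbolic plane. The hyperbolic formula 2π·sinh(r) is the standard one for curvature −1; treating depth as radius, and the branching factor of 3, are simplifying assumptions for illustration, not a description of how the hyperbolic browser computes its layout.

```python
import math


def nodes_at_depth(b, d):
    """Number of nodes at depth d in a tree with branching factor b."""
    return b ** d


def euclidean_circumference(r):
    """Space available at radius r in the flat plane."""
    return 2 * math.pi * r


def hyperbolic_circumference(r):
    """Space available at radius r in the hyperbolic plane (curvature -1)."""
    return 2 * math.pi * math.sinh(r)


for d in range(1, 9):
    print(d,
          nodes_at_depth(3, d),
          round(euclidean_circumference(d), 1),
          round(hyperbolic_circumference(d), 1))
```

The Euclidean column grows linearly while the node count and the hyperbolic column both grow exponentially, which is why the hyperbolic layout has room for every level.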
multiple attributes
often data items have several attributes
e.g. document:
  – type (journal, conference, book)
  – date of publication
  – author(s)
  – multiple keywords (perhaps in taxonomy)
  – citation count
  – popularity
traditional approach ...
           boolean queries
>new query
?type=‘journal’ and keyword=‘visualisation’
=query processing complete - 2175 results
list all (Y/N)
>N
>refine query
 refine: type=‘journal’ and keyword=‘visualisation’
+author=‘smith’
=query processing complete - 0 results
faceted browsing
           e.g. HiBrowse (one of the earliest)

 multiple selection boxes
     – ‘or’ within box      - ‘and’ between boxes


keywords                     authors                    types
digital libraries           all           173         all          173
HCI                 173        catarci      53            book
   formal models                dix           9            conference
   interaction       157        jones        17            journal    173
   task analysis                shneiderman 153            other
   visualisation      39        smith         0
   web                          wilson       22
              (keyword=‘interaction’ or ‘visualisation’) and type=‘journal’
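
The selection rule ("or" within a box, "and" between boxes) can be sketched as a small filter. This is not HiBrowse's implementation; the data layout (each item maps a facet name to a set of values), the function name and the sample documents are assumptions.

```python
def matches(item, selection):
    """True if the item satisfies the selection.

    selection maps facet name -> set of chosen values.
    'or' within a facet: any chosen value may match.
    'and' between facets: every facet with a choice must match.
    A facet with nothing chosen is treated as 'all'.
    """
    for facet, chosen in selection.items():
        if chosen and not (item[facet] & chosen):
            return False
    return True


items = [
    {"keywords": {"visualisation"}, "authors": {"shneiderman"}, "types": {"journal"}},
    {"keywords": {"interaction"}, "authors": {"dix"}, "types": {"conference"}},
    {"keywords": {"web"}, "authors": {"wilson"}, "types": {"journal"}},
]
# (keyword='interaction' or 'visualisation') and type='journal'
selection = {"keywords": {"interaction", "visualisation"}, "types": {"journal"}}
print([item["authors"] for item in items if matches(item, selection)])
```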
HiBrowse (ii)

 shows how many items with particular value
     – e.g. 39 documents with keyword=‘visualisation’ and type=‘journal’




keywords                     authors                     types
digital libraries            all           173         all          173
HCI                 173         catarci      53            book
   formal models                 dix           9            conference
   interaction       157         jones        17            journal    173
   task analysis                 shneiderman 153            other
   visualisation     39          smith         0
   web                           wilson       22
HiBrowse (iii)

 can predict the effect of refining selection
     – e.g. selecting ‘smith’ would give empty result




keywords                     authors                     types
digital libraries            all           173             all          173
HCI                 173         catarci      53                book
   formal models                 dix           9                conference
   interaction       157         jones        17                journal    173
   task analysis                 shneiderman 153                other
   visualisation     39          smith         0
   web                           wilson       22
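
A sketch of how the counts beside each value might be computed, so that an empty result can be seen before it is selected. The assumption here is the usual faceted-browsing rule: the count for a value is the number of items that carry that value and satisfy the selections in all the other boxes. The function name and the sample documents are hypothetical.

```python
def facet_counts(items, selection, facet):
    """Count, for each value of `facet`, the items matching the other facets."""
    others = {f: chosen for f, chosen in selection.items()
              if f != facet and chosen}
    counts = {}
    for item in items:
        # 'and' between the other facets, 'or' within each of them
        if all(item[f] & chosen for f, chosen in others.items()):
            for value in item[facet]:
                counts[value] = counts.get(value, 0) + 1
    return counts


items = [
    {"keywords": {"visualisation"}, "authors": {"shneiderman"}, "types": {"journal"}},
    {"keywords": {"interaction"}, "authors": {"dix"}, "types": {"journal"}},
    {"keywords": {"web"}, "authors": {"smith"}, "types": {"book"}},
]
selection = {"keywords": {"interaction", "visualisation"}, "types": {"journal"}}
counts = facet_counts(items, selection, "authors")
for author in ["dix", "shneiderman", "smith"]:
    print(author, counts.get(author, 0))
```

An author whose count is 0 can be shown as such in the list, which avoids the "0 results" dead end of the boolean query dialogue.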
HiBrowse (iv)

 refining selection updates counts in real time



(counts overlaid on the slide as the selection is refined, shown here left to right)

keywords                           authors                          types
digital libraries                  all          173 → 45 → 39       all          173 → 45 → 39
HCI               173 → 45 → 39    catarci       53 → 19 → 18          book         6
   formal models                   dix            9 → 1                conference
   interaction    157              jones         17 → 5 → 3            journal    173 → 39
   task analysis                   shneiderman  153 → 24 → 21          other
   visualisation   45 → 39         smith          0
   web                             wilson        22 → 8 → 7
starfield (i)
scatter plot for two attributes
colour/shape codes for more
adjust rest with sliders
dots appear/disappear as slider values change

dynamic filtering
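
Dynamic filtering amounts to re-running a range test whenever a slider moves. A minimal sketch, assuming each point is a dict of numeric attributes and each slider supplies a (low, high) range; the function name, the attribute names and the film data are made up for illustration.

```python
def visible(points, ranges):
    """Return the points whose slider-controlled attributes are all in range."""
    return [p for p in points
            if all(lo <= p[attr] <= hi for attr, (lo, hi) in ranges.items())]


films = [
    {"title": "A", "year": 1975, "rating": 8.1, "length": 124},
    {"title": "B", "year": 1988, "rating": 6.4, "length": 95},
    {"title": "C", "year": 1992, "rating": 7.9, "length": 181},
]
# x and y axes would plot year and rating; the slider filters on length
sliders = {"length": (90, 130)}
print([f["title"] for f in visible(films, sliders)])
```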
starfield (ii)
when few enough points more details appear
Influence Explorer (i)
developed for engineering models
like Starfield ...
but sliders show histogram
how many in category (like HiBrowse)
... and how many ‘just miss’



 red = full match
 black = all but one attribute
 greys = fewer matching attr’s
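
The colour coding can be sketched by counting how many slider ranges an item misses. The function name and the engineering attributes are hypothetical, and a single "grey" stands in for the several shades of grey mentioned on the slide.

```python
def match_colour(item, ranges):
    """Colour an item by how many attribute ranges it fails to satisfy."""
    misses = sum(1 for attr, (lo, hi) in ranges.items()
                 if not lo <= item[attr] <= hi)
    if misses == 0:
        return "red"    # full match
    if misses == 1:
        return "black"  # all but one attribute: a 'just miss'
    return "grey"       # simplification: one grey instead of graded shades


ranges = {"stress": (0, 50), "weight": (10, 20), "cost": (0, 100)}
designs = [
    {"stress": 40, "weight": 15, "cost": 80},
    {"stress": 60, "weight": 15, "cost": 80},
    {"stress": 60, "weight": 25, "cost": 80},
]
print([match_colour(d, ranges) for d in designs])
```

Showing the black "just miss" items tells the user which slider to relax, rather than leaving them to find out by trial and error.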
Influence Explorer (ii)
some versions highlight individual items
in each histogram



similar technique has
been used to match
multiple taxonomic
classifications
Information Scent
Starfield
      shows what is currently selected
     • explore using trial and error

HiBrowse and Influence Explorer
     show what would happen

Pirolli et al. call this Information Scent
  – things in the interface that help you know what
    actions to take to find the information you want
very large datasets
too many points/lines to see

solutions ...

space-filling single-pixel per item
      Keim’s VisDB
random selection
      (see Geoff Ellis’ thesis)
clustering
      visualise groups not individuals
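
Of the three solutions, random selection is the simplest to sketch. The code below uses plain uniform sampling from Python's standard library as an illustration; it is not the method from the thesis cited above, and the function name, the fixed seed and the point data are assumptions.

```python
import random


def sample_points(points, max_points, seed=0):
    """Return at most max_points items, chosen uniformly at random."""
    if len(points) <= max_points:
        return list(points)
    # Fixed seed so that the same subset is drawn on every redraw.
    return random.Random(seed).sample(points, max_points)


points = [(x, (x * 37) % 101) for x in range(10000)]
subset = sample_points(points, 500)
print(len(subset))
```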
