From Data to Visualization, what happens in between?
Krist Wongsuphasawat discusses the importance and techniques of data visualization, emphasizing the power of visual representations in uncovering insights and telling stories with data. The document explains different types of data visualizations, approaches to selecting appropriate tools based on data types and desired outcomes, and outlines an ideal workflow for creating visualizations. It highlights the significance of understanding the data first before choosing visualization methods.
Introduction by Krist Wongsuphasawat, Senior Data Visualization Scientist at Twitter.
Overview of Twitter's analytics tools for internal and external visualizations; includes dashboarding and exploratory tools.
Link to Twitter's interactive visualizations website.
Introduction to various examples of data visualizations.
Explains what visualizations are, focusing on their aesthetic and meaningful representation of data.
Illustration of Anscombe’s Quartet to show identical statistics and different data distributions visually.
Examples include Napoleon’s March and the London Cholera Outbreak to illustrate data through visualization.
Benefits and usage of visualizations: quick understanding, revealing hidden facts, storytelling, and exploratory analysis.
Terminologies: Information Visualization (InfoVis), Data Visualization (DataVis), and a mention of infographics.
Questions to determine how to start visualizing data, focusing on tools and data types, including dimensions.
Examples of multi-dimensional data visualizations, using flower species data as a reference.
Different examples of visualization techniques such as scatterplots, scatterplot matrices, and trees (from geographical to stock market data).
Explains network visualizations and provides a character co-occurrences example showing nodes and edges.
Multi-dimensional and temporal combinations, including visual encodings, tooltips, animations, and other types.
Criteria to choose visualization tools based on data type and goals, outlining an ideal workflow for visualization.
Describes real-life challenges faced in data visualization workflows: data quality and transformation issues.
Promotion of interactive visualizations through the linked Twitter resource.
Conclusion on the importance of focusing on data first when creating visualization methodologies.
Anscombe’s Quartet
Property                      Value
Mean of X                     9.0
Variance of X                 10.0
Mean of Y                     7.5
Variance of Y                 3.75
Correlation between X and Y   0.816
Linear regression             y = 3.0 + 0.5x
[Figure: scatterplots of datasets #1, #2, #3, #4]
Identical statistics!
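The claim on this slide can be checked directly. The following is a minimal sketch in plain Python, using the published Anscombe (1973) values; the function and variable names are illustrative, not from the talk. It assumes population variance (dividing by n), which is what yields the 10.0 and 3.75 figures. Under that assumption each of the four datasets gives a mean of X of 9.0.

```python
from math import sqrt

# Anscombe's quartet. Datasets #1 to #3 share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "#1": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "#2": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "#3": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "#4": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return the summary statistics listed on the slide for one dataset."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population variance and covariance: divide by n, not n - 1.
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    slope = cov_xy / var_x  # least-squares slope of y on x
    return {
        "mean_x": mean_x,
        "var_x": var_x,
        "mean_y": mean_y,
        "var_y": var_y,
        "corr": cov_xy / sqrt(var_x * var_y),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

for name, stats in SUMMARIES.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

The four rows print nearly the same numbers, which is the point: only plotting the points reveals how different the datasets are.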
Life Expectancy
(Multi-D + Temporal)
http://www.gapminder.org/videos/the-river-of-myths/
49.
VISUALIZATION
visual encodings + interactions
• visual encodings (by data type): bar chart, line chart, matrix, node-link, treemaps, etc., or multiple views
• interactions: tooltips, animation, highlight, filter, etc.
50.
DATA
1) What type of data?
[Diagram: candidate visualizations vis1 to vis7]
Many options...
Which visualization technique should I use?
51.
DATA
1) What type of data?
[Diagram: remaining candidates vis3, vis4, vis7]
Fewer options...
Still, which one should I use?
52.
How to start?
• What tool should I use?
1. What type of data do I have?
2. What do I want from the data?
DATA
53.
2) What do I want from the data?
• Many ways to visualize one type of data.
• Things to consider:
• audience (data scientist, execs, etc.)
• goal (storytelling, exploratory analysis)
• tasks
State of the Union
http://twitter.github.io/interactive/sotu2014/#p1
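The two questions, "What type of data do I have?" and "What do I want from the data?", can be read as two successive filters over the candidate techniques. Here is a minimal sketch of that idea. Everything in it is an assumption for illustration: the `CANDIDATES` table loosely pairs data types with techniques named in this talk and is not an exhaustive taxonomy, and `Intent` and `shortlist` are hypothetical names.

```python
from dataclasses import dataclass, field

# Hypothetical lookup from data type to candidate techniques (illustrative only).
CANDIDATES = {
    "multi-dimensional": ["scatterplot", "scatterplot matrix", "bar chart"],
    "temporal": ["line chart"],
    "tree": ["treemap", "node-link"],
    "network": ["node-link", "matrix"],
}


@dataclass
class Intent:
    """Answer to question 2: who the visualization is for and why."""
    audience: str                                 # e.g. "data scientist", "execs"
    goal: str                                     # e.g. "storytelling", "exploratory analysis"
    tasks: list = field(default_factory=list)     # what readers need to do with it
    readable: set = field(default_factory=set)    # techniques this audience is assumed to read easily


def shortlist(data_type, intent):
    """Filter by data type first, then by what the audience can read."""
    options = CANDIDATES.get(data_type, [])
    if not intent.readable:
        # No audience constraint supplied: keep every option for this data type.
        return list(options)
    return [technique for technique in options if technique in intent.readable]


execs = Intent("execs", "storytelling", ["spot clusters"], {"node-link", "bar chart"})
print(shortlist("network", execs))
```

The audience constraint is passed in by the caller rather than hard-coded, since which techniques suit which audience or goal is a judgment call that the slides leave open.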
59.
Ok, now tools.
1. What type of data do I have?
2. What do I want from the data?
60.
Tools
Option 1: Programming library. You have to write code.
Option 2: Packaged software. (Mostly) no coding involved.
61.
Programming libraries
• d3.js, Processing, R, etc.
• Copy and modify from examples.
• Can do custom stuff (if you can figure out how)
• More overhead for common tasks
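To make the "more overhead for common tasks" point concrete, here is a small sketch that draws a bar chart by hand. It uses plain Python rather than any of the libraries named above, purely to keep the example self-contained. The function name `text_bar_chart` and the sample values are made up for illustration.

```python
def text_bar_chart(values, width=40, mark="#"):
    """Render {label: non-negative number} as horizontal text bars."""
    if not values:
        return ""
    largest = max(values.values())
    label_width = max(len(label) for label in values)
    lines = []
    for label, value in values.items():
        # Scale each bar relative to the largest value.
        length = round(value / largest * width) if largest > 0 else 0
        lines.append(f"{label.ljust(label_width)} | {mark * length} {value}")
    return "\n".join(lines)


# Made-up values, for illustration only.
SAMPLE = {"group a": 12, "group b": 30, "group c": 21}
print(text_bar_chart(SAMPLE, width=20))
```

Even this simple chart needs scaling, label alignment, and an empty-input case handled in code, which packaged software does for you. In exchange, every one of those details can be customized.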
62.
Packaged software
• Tableau (multi-dimensional)
• Gephi (graph)
• NodeXL (graph)
• Research projects (contact authors)
• Just use the software. No hassle of coding and debugging
• Functionality limited to what the tool can do
• Custom designs are more difficult
63.
Ideal workflow
1. What type of data do I have?
2. What do I want from the data?
3. Pick appropriate techniques/tools
4. Done!
64.
Not that easy!
65.
Real-life workflow
1. What type of data do I have?
2. Pre-process data (data are dirty)
3. What do I want from the data?
4. Pick appropriate techniques/tools
5. See results
Loop back when unsatisfied: transform, change goal, change perspective
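The pre-processing step is where "data are dirty" usually shows up first. Below is a minimal sketch of one common cleanup, assuming the records arrive as dictionaries of raw strings and that the fields of interest should be numeric. The names `to_number` and `preprocess` and the sample rows are hypothetical.

```python
import math


def to_number(raw):
    """Parse one raw cell as a float; raise ValueError if it is unusable."""
    value = float(str(raw).strip())  # raises ValueError on "", "n/a", "None"
    if math.isnan(value):
        raise ValueError("NaN is treated as missing")
    return value


def preprocess(rows, fields):
    """Keep rows where every requested field is a usable number.

    Returns (clean_rows, rejected_count) so the loss is visible, not silent.
    """
    clean, rejected = [], 0
    for row in rows:
        try:
            clean.append({name: to_number(row[name]) for name in fields})
        except (KeyError, ValueError):
            # Missing column or unparseable value: drop the whole row.
            rejected += 1
    return clean, rejected


# Made-up raw records, for illustration only.
RAW = [
    {"x": " 1.5 ", "y": "2"},
    {"x": "n/a", "y": "3"},
    {"x": "4"},
    {"x": "nan", "y": "1"},
]
print(preprocess(RAW, ["x", "y"]))
```

Returning the rejected count supports the feedback loop in the workflow: if too many rows are dropped, that is a signal to transform the data differently or to revisit the goal.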