Introduction to
Data Visualization
TURNING DATA INTO INSIGHTFUL STORIES
RAVI PRAKASH JHA
BIOSTATISTICS FACULTY
DEPARTMENT OF COMMUNITY MEDICINE
DR BSA MEDICAL COLLEGE AND HOSPITAL
"The greatest value of a picture is
when it forces us to notice what
we never expected to see."
- JOHN TUKEY, STATISTICIAN AND DATA VISUALIZATION PIONEER
Outline
Effective data visualization transforms numbers into a story, revealing patterns and insights that words alone cannot convey.
• Introduction
• Learning Objectives
• Data Visualization: The Art of Telling Stories with Data
• Why Data Visualization Matters? See the Big Picture
• Brief History of Data Visualization
• The Building Blocks of Great Visualizations
• Exploring Chart Types
• Choosing the Perfect Visualization
• The Golden Rules of Data Visualization
• When Visualization Fails? Common Pitfalls to Avoid
• The Best Tools for Data Storytelling
• Interactive Visuals: Bringing Data to Life
• Advanced Visualization Techniques
• Hands-On Exercise
• Key Takeaways for the Next Talk
Introduction
In a world driven by data, numbers alone are not enough.
The key lies in transforming raw data into compelling visual stories.
Data visualization is the graphical representation of information to
communicate insights clearly and effectively.
Data visualization is about making large datasets coherent. It is a
visual language for describing, exploring, analyzing and
summarizing data.
Data visualization brings clarity, precision, and efficiency to the communication of data.
Uses of data visualization: Describe, Explore, Summarize, Analyse
Learning Objectives
Understand the concept and importance of data visualization
• Visualizations simplify complex datasets.
• They highlight patterns and trends not obvious in raw data.
Learn how to select appropriate charts for different datasets
• Bar charts: Compare categories (e.g., sales by region).
• Line graphs: Show trends over time (e.g., monthly revenue).
• Scatter plots: Display relationships between variables (e.g., age vs. income).
Explore tools and techniques for creating effective visualizations
• Tools: Excel (basic), R (advanced, customizable), Tableau (interactive dashboards).
Identify common mistakes and how to avoid them
• Misleading axes or scales can distort data interpretation.
• Overusing colors or cluttering visuals reduces clarity.
Data Visualization:
The Art of Telling Stories with Data
What Makes Data Visualization an Art?
• Data visualization is more than just charts and graphs.
• It’s about crafting narratives that resonate with your audience.
The Goal
• To turn numbers into a compelling story: one that is clear, insightful, and drives action.
Key Elements of a Data Story:
• Characters: The data points, patterns, and outliers that form the core of the narrative.
• Plot: The journey from raw data to meaningful insights.
• Resolution: The takeaway or decision enabled by the visualization.
New Perspective:
"Good data visualizations inform, but great visualizations inspire action. They bridge the gap between analysis and understanding, engaging both logic and emotion."
Raw Data → Visualization → Insight → Action
Why Data Visualization Matters?
See the Big Picture
• Drives Decision Making
• Effective Communication
• Reveals Hidden Patterns
• Complexity
Applications: Social Media, Science, Healthcare, Business
Brief History of Data Visualization
William Playfair presented a variety of graphs in his book “The Commercial and Political Atlas” (1786). Example: a chart comparing England's exports to, and imports from, Denmark and Norway from 1700 to 1780.
Physician John Snow (1854-55) plotted the locations of cholera deaths on a map of London.
The Building Blocks of Great
Visualizations
• Define the purpose
• Understand the dataset
• Choose the chart type
• Visual Encoding and Designing (Titles, Labels, Axes, Positioning, Size, Shape, Colour)
• Interactivity (Zooming, Details on click, Features of Dashboards)
• Adjust till the desired representation is achieved
Exploring Some Chart Types
Bar Chart
• Use for comparing categories (e.g., Sales by region).
Line Graph
• Show trends over time (e.g., Monthly revenue growth).
Scatter Plot
• Visualize relationships between two numerical variables (e.g., Age vs. Height).
Pie Chart
• Show proportions of a whole (e.g., Market share distribution of companies in a
sector).
Heatmap
• Show intensity of values using color (e.g., Population density across regions).
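The pairings above can be written down as a small lookup. The following is a minimal sketch in Python (standard library only); the names `CHART_FOR_GOAL` and `suggest_chart` are hypothetical and the goal phrases are paraphrased from this slide, so treat it as an illustration rather than a complete decision rule.

```python
# Hypothetical lookup: analytical goal -> chart type, paraphrased from the slide.
CHART_FOR_GOAL = {
    "compare categories": "bar chart",
    "trend over time": "line graph",
    "relationship between two numeric variables": "scatter plot",
    "proportions of a whole": "pie chart",
    "intensity of values": "heatmap",
}


def suggest_chart(goal: str) -> str:
    """Return the chart type paired with a goal, or a fallback message."""
    key = goal.strip().lower()  # tolerate stray spaces and capitalisation
    return CHART_FOR_GOAL.get(key, "no suggestion: goal not covered by the slide")


if __name__ == "__main__":
    for goal in CHART_FOR_GOAL:
        print(f"{goal} -> {suggest_chart(goal)}")
```

A lookup like this only covers the five chart types listed here; real choices also depend on the audience and the size of the dataset.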
Choosing the Perfect Visualization
| Chart Type | Suitable Data Type | Advantages | Limitations |
|---|---|---|---|
| Bar Chart | Categorical, Discrete Numeric | Easy to compare categories. Visually simple and clear. Effective for small-to-medium datasets. | Not suitable for time-series data. Clutters with too many categories. |
| Line Graph | Continuous Numeric, Time-Series | Ideal for showing trends over time. Clearly shows upward or downward movements. | Ineffective for categorical comparisons. Requires clear time intervals for accuracy. |
| Pie Chart | Categorical | Good for showing proportions. Effective for small datasets with few categories. | Difficult to interpret with too many slices. Precise comparisons are challenging. |
| Scatter Plot | Continuous Numeric | Highlights relationships between two variables. Identifies outliers easily. | Difficult to interpret with overlapping points in large datasets. Does not show trends. |
| Heatmap | Continuous, Matrix-like Data | Visualizes density or magnitude with color gradients. Effective for large datasets. | Can be visually overwhelming. Requires careful choice of color schemes. |
| Histogram | Continuous Numeric | Shows distribution of a single variable. Highlights skewness and spread of data. | Does not compare categories. Interval size can influence interpretation. |
| Bubble Chart | Categorical, Continuous | Adds an extra dimension with bubble size. Good for showing relationships and magnitudes. | Can become cluttered with too many bubbles. Difficult to interpret for small values. |
| Box Plot | Continuous Numeric | Summarizes data distribution (median, quartiles, outliers). Effective for group comparisons. | Limited to summary statistics. Does not show detailed frequency distribution. |
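The histogram limitation noted above, that interval size can influence interpretation, is easy to see by counting the same values with two bin widths. This is a rough sketch in Python (standard library only); the helper `histogram_counts` is hypothetical and the ages are made-up numbers for illustration, not data from the talk.

```python
from collections import Counter


def histogram_counts(values, bin_width, start=0):
    """Count values per bin [edge, edge + bin_width); returns {lower_edge: count}."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = Counter()
    for v in values:
        k = int((v - start) // bin_width)  # index of the bin that contains v
        counts[start + k * bin_width] += 1
    return dict(sorted(counts.items()))


# Made-up ages, for illustration only.
ages = [21, 22, 23, 24, 25, 26, 27, 28, 29, 38, 39, 41, 42, 58]

narrow = histogram_counts(ages, bin_width=5, start=20)
wide = histogram_counts(ages, bin_width=20, start=20)

print("width 5: ", narrow)  # several separate groups, with empty intervals between them
print("width 20:", wide)    # the same values collapse into two bars
```

With the narrow bins the gaps between groups stay visible; with the wide bins they disappear, which is why it is worth trying more than one interval size before settling on a histogram.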
[Figure: Pie chart showing the distribution of TB cases in a state]
[Figure: Year-wise percentage of TB cases in different districts of Maharashtra]
[Figure: Gender-wise percentage of TB mortality in different districts of Maharashtra]
The Golden Rules of Data Visualization
• Know Your Audience: Tailor visuals to their knowledge level and needs.
• Define Your Objective: Be clear about the story you want to tell.
• Simplify the Design: Avoid clutter; keep visuals clean and straightforward.
• Choose the Right Chart: Match chart type to data and message.
• Use Accurate Scales: Ensure axes and data scaling reflect the truth.
• Highlight Key Insights: Draw attention to critical points using color or annotations.
• Prioritize Readability: Use clear fonts, labels, and sufficient contrast.
• Use Color Wisely: Limit colors and maintain consistency in your palette.
• Test for Interpretability: Validate that your audience can understand the visualization.
• Respect Ethical Guidelines: Be transparent and avoid misleading data representations.
When Visualization Fails? Common Pitfalls to Avoid
Cluttered Design: Overloading visuals with too much information.
Misleading Axes: Manipulating scales or truncating axes to distort data.
Wrong Chart Type: Using charts that don't suit the data or the message.
Poor Color Choices: Overusing colors or choosing low-contrast palettes.
Lack of Context: Failing to provide labels, legends, or explanations.
Overcomplication: Adding unnecessary 3D effects or decorative elements.
Data Overload: Showing too much raw data instead of summarizing insights.
Ignoring Audience Needs: Creating visuals that are too technical or simplistic.
Inconsistent Style: Using mismatched fonts, colors, or themes.
Ethical Misrepresentation: Cherry-picking data or omitting key information.
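The effect of a truncated axis can be put into numbers. The sketch below, in Python (standard library only), compares the relative change that the bar heights show with the relative change in the data; this ratio is sometimes called the "lie factor", a term from Edward Tufte's writing. The function names and the two values are assumptions made up for illustration.

```python
def drawn_height(value, axis_min):
    """Height of a bar as drawn when the value axis starts at axis_min."""
    if value <= axis_min:
        raise ValueError("value must lie above the axis minimum")
    return value - axis_min


def lie_factor(first, second, axis_min):
    """Relative change shown by the bars divided by the relative change in the data."""
    if first == second:
        raise ValueError("the two values must differ")
    first_height = drawn_height(first, axis_min)
    second_height = drawn_height(second, axis_min)
    shown = (second_height - first_height) / first_height
    actual = (second - first) / first
    return shown / actual


# Made-up values: 50 and 55, i.e. a 10% increase in the data.
honest = lie_factor(50, 55, axis_min=0)      # axis starts at zero
truncated = lie_factor(50, 55, axis_min=45)  # axis starts at 45

print(f"axis from 0:  lie factor {honest:.1f}")
print(f"axis from 45: lie factor {truncated:.1f}")
```

A value near 1 means the picture matches the data; with the axis starting at 45 the second bar is drawn twice as tall as the first even though the underlying increase is only 10%.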
The Best Tools for Data Storytelling
The best tool for data storytelling is one that aligns with your needs and empowers your audience to see the story within the numbers.
| Tool | Use Case | Advantages | Limitations |
|---|---|---|---|
| Excel | Quick visualizations and dashboards | Simple and easy to use. Good for small datasets. Pivot tables and conditional formatting. | Limited scalability for large datasets. Basic customization. |
| R | Advanced analytics and custom visualizations | Highly customizable. Powerful statistical tools. Libraries like ggplot2, shiny, plotly. | Requires programming skills. Steep learning curve for non-technical users. |
| Python | Integrated data analysis and storytelling | Versatile with libraries like Matplotlib, Seaborn, Plotly. AI and ML integration. | Requires programming expertise. Longer setup time for complex tasks. |
| Tableau | Business intelligence and interactive dashboards | User-friendly drag-and-drop interface. Real-time updates. Storyboarding capability. | High cost for licenses. Limited in handling advanced statistical calculations. |
| Power BI | Enterprise reporting and collaboration | Affordable for Microsoft users. Integration with Excel and Azure. Easy sharing options. | Less flexibility compared to R or Python. Requires MS ecosystem for full power. |
| ArcGIS/QGIS | Geospatial data visualization | Excellent for mapping and geospatial analysis. Wide array of GIS tools. | Specialized knowledge required. Can be resource-intensive. |
| Canva/Piktochart | Infographic creation | Easy and visually appealing outputs. Ideal for presentations. | Limited analytical capabilities. Not suitable for complex datasets. |
| SPSS/Stata | Statistical analysis with basic visuals | Specialized for statistical reporting. Easy for academic and research use. | Limited graphics options compared to modern visualization tools. |
Interactive Visuals: Bringing Data to Life
Plotly (Python/R)
• Use: Interactive scatter plots, line graphs, and dashboards for web applications.
• Features: Highly customizable and web-ready.
Shiny (R)
• Use: Build web applications for data exploration and interactive analysis.
• Features: Fully customizable UI with seamless R integration.
Advanced Visualization
Techniques
"ggplot2", # Advanced visualization
"lattice", # Trellis graphics
"dplyr", # Data manipulation
"tidyr", # Data wrangling
"patchwork", # Combining ggplot objects
"ggthemes", # Themes for ggplot
"gridExtra", # Arranging multiple plots
"reshape2", # Reshape data for plotting
"corrplot", # Correlation plots
"grid", # Basic grid graphics
"scales", # Scaling in ggplot
"vioplot", # Violin plots
"ggforce", # Additional ggplot2 features
"car", # Companion to Applied Regression
"tmap", # Thematic maps
"sf", # Spatial data handling
"plotly", # Interactive plots
"ggpubr" # Publication-ready plots
Hands-On Exercise: Datasets Used
mtcars
• Description: A dataset of fuel consumption and 10 aspects of automobile design for 32 cars.
• Variables: mpg (Miles per gallon), wt (Weight), cyl (Cylinders), hp (Horsepower), etc.
• Usage: Scatter plots, correlation matrices, and bar charts.
mtcars_cor
• Description: Correlation matrix derived from mtcars.
• Variables: Pairwise correlations between all numeric columns in mtcars.
• Usage: Heatmaps, correlation plots.
iris
• Description: A dataset of 150 observations on iris flowers, with measurements for sepal and petal length/width.
• Variables: Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, Species (Setosa, Versicolor, Virginica).
• Usage: Trellis plots, bar plots with error bars.
InsectSprays
• Description: Data from an agricultural experiment measuring the effectiveness of insecticides.
• Variables: count (Insect count), spray (Spray type, A-F).
• Usage: Violin plots.
World
• Description: A spatial dataset containing country-level attributes, including population and life expectancy.
• Variables: name, population, life_exp (Life expectancy), geometry (Spatial polygons).
• Usage: Thematic maps.
Synthetic Datasets
• Description: Simple random or grouped data created manually.
• Variables: Can be customized.
• Usage: Quick custom visualization.
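To make the mtcars_cor entry concrete, here is a minimal sketch in Python (standard library only) of how a matrix of pairwise Pearson correlations can be built from numeric columns. In R the same matrix would typically come from `cor()`. The function names are hypothetical, and the three short columns are made-up toy values that merely borrow mtcars column names; they are not the real dataset.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sxy / sqrt(sxx * syy)


def correlation_matrix(columns):
    """Nested dict {row_name: {col_name: r}} of all pairwise correlations."""
    names = list(columns)
    return {r: {c: pearson(columns[r], columns[c]) for c in names} for r in names}


# Toy values for illustration only (NOT the real mtcars data).
toy = {
    "mpg": [30.0, 25.0, 20.0, 15.0],
    "wt": [2.0, 2.5, 3.0, 3.5],
    "hp": [70.0, 100.0, 130.0, 200.0],
}

matrix = correlation_matrix(toy)
for row_name, row in matrix.items():
    print(row_name, {col: round(r, 2) for col, r in row.items()})
```

The resulting matrix is symmetric with ones on the diagonal, which is the structure a heatmap or correlation plot then colors cell by cell.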
Key Takeaways for the Next Talk
What’s Next?
• Detailed hands-on session using ggplot2
• Building layered plots with ggplot2.
• Customizing themes and aesthetics.
• Exploring advanced visualizations (e.g., faceted plots, annotations).
Preparation for the Next Talk:
• Install R, RStudio, and the ggplot2 package if not already done.
• Bring a dataset you'd like to visualize for the hands-on practice.


Editor's Notes

  • #2: John Tukey, the mathematician who coined the term “exploratory data analysis,” was right to suggest that visualization helps us see what we have not noticed before.