Interactive data visualization unit 5...

Unit-5
INTERACTIVE DATA VISUALIZATION

What is Interactive Data Visualization?
• Organizations are always looking for innovative methods to effectively
share insights and get value from their data in today’s data-rich
environment.
• Interactive data visualization is a powerful technology that lets users
explore and comprehend data through dynamic, engaging images.
• Interactive data visualization takes static data visualization a step
further by allowing users to directly interact with the data itself. It gives
users the ability to compare various datasets, dive down into specifics,
engage with data in real time, and find hidden patterns and correlations.
Organizations may enhance communication, make data-driven decisions,
and eventually get a competitive advantage by doing this.

1.Enhanced Data Understanding: Abstract data becomes more
concrete and understandable when presented via interactive
visualizations. Context, linkages, and exploration allow people
to better understand complicated ideas, see patterns, and
come to quicker, more educated judgments.
2.Enhanced Data Exploration: Users can interact with the data,
making it easier for them to explore various scenarios, dig
deeper into the data, and find insights that they would have
missed otherwise.
3.Effective Communication: Interactive visualizations go
beyond the idea that images are a universal language. Complex
concepts may be conveyed more successfully to a variety of
audiences by presenting data in a dynamic and engaging way,
which can help to align stakeholders and foster agreement.

4.Data-Driven Decision Making: By including sophisticated
analytical functions, interactive data visualization tools often
help users see patterns, correlations, and outliers more
quickly. This facilitates improved decision-making by
offering predictive analytics capabilities and actionable
insights.

• Modern interactive data visualizations offer numerous features that
improve data analysis and the user experience:
• Filtering and Slicing: By allowing users to compare different
segments, concentrate on certain data subsets, or examine data from
many dimensions, interactive filters and slicing tools may uncover
hidden patterns.
• Updates in real time: By connecting visualizations to real-time data
sources, users can keep an eye on changes and respond to them as
they happen.
• Customizable Views: Users may rearrange dashboard components for
individualized insights, choose certain metrics, or change the style of
graphic to better suit their needs.

• Collaborative Features: Many modern tools come
with collaborative features that let teams discuss findings,
share and annotate visualizations, and make data-driven
decisions together.
• Advanced Analytics: By combining statistical models,
machine learning algorithms, and predictive analytics, one
may better detect patterns, correlations, and anomalies,
which facilitates more precise forecasting and decision-
making.

Additional Benefits of Interactive Data
Visualization
• Simplified Complex Data: Visualizations help a wider audience grasp
abstract topics by turning complex facts into easily consumable images.
• Faster Insights: Users can see patterns, anomalies, and opportunities more
quickly because of the interactive visualizations, which speeds up the
exploration process and improves time-to-market and decision-making.
• Better Data Quality: Inconsistencies or outliers in the data may be found
more easily via interactive exploration, which makes data cleaning easier and
boosts overall data quality.
• Engaging Communication: Data presentations become more memorable
and captivating when they include dynamic and engaging images that hold
the audience’s attention and help them comprehend the message.

• Self-Service Analytics: Encouraging users to investigate
data on their own lightens the workload for IT teams and
data analysts while promoting an organizational culture that
relies heavily on data to make decisions.

Benefits of Interactive Data
Visualizations
• An interactive data visualization allows users to engage with data in ways not
possible with static graphs; interactive visualizations of big data are one example.
Interactivity is the ideal solution for large amounts of data with complex data
stories, providing the ability to identify, isolate, and visualize information for
extended periods of time. Some major benefits of interactive data visualizations
include:
• Identify Trends Faster - Much of human communication is visual, and the
human brain processes graphics far faster than it does text. Direct
manipulation of analyzed data via familiar metaphors and digestible imagery
makes it easy to understand and act on valuable information.
• Identify Relationships More Effectively - The ability to narrowly focus on specific
metrics enables users to identify otherwise overlooked cause-and-effect
relationships throughout definable timeframes. This is especially useful in
identifying how daily operations affect an organization’s goals.

• Useful Data Storytelling - Humans best understand a data
story when its development over time is presented in a clear,
linear fashion. A visual data story in which users can zoom in
and out, highlight relevant information, filter, and change the
parameters promotes better understanding of the data by
presenting multiple viewpoints of the data.
• Simplify Complex Data - A large data set with a complex data
story may present itself visually as a chaotic, intertwined
hairball. Incorporating filtering and zooming controls can
help untangle and make these messes of data more
manageable, and can help users glean better insights.

Static vs Interactive Data Visualization
A static data visualization is one that does not incorporate any interaction
capabilities and does not change with time, such as an infographic focused
on a specific data story from a single viewpoint. Because static
visualizations offer no tools for adjusting the final result, such as the
filtering and zooming tools found in interactive designs, it is essential
to consider carefully what data is displayed.
• A static visualization is better suited to less complex data stories, building
relationships between concepts, and conveying a predetermined view
than to encouraging exploration and increasing user autonomy. Static
designs are also significantly less expensive to build than interactive
designs. Deciding whether to build a static or interactive data
visualization depends on customer preference, data story complexity,
and ROI.

• Examples of Interactive Data Visualization
• Interactive data visualizations are being used with
increasing frequency, encouraging the development of
more creative designs and providing valuable insight in
complex, real-world issues. Here are some popular and
successful interactive data visualization examples:

Axes
• Axes provide vital reference information for users to associate
data points with values, especially when data points are not
labeled directly in a chart.
• Typically, charts contain two axes: an x-axis and a y-axis. In
many cases, one axis is used to establish the continuous
interval of a dataset (e.g., time), while the other axis is used to
map a data point against a value (e.g., a percentage, dollar
amount, or integer).
• Axes aren't always necessary in data visualizations, but should
be considered for use by default. When using data labels, you
can omit the axis being used for interval labels.

• Axes are perpendicular lines on a coordinate grid that are
used to measure and categorize data in a visualization:
• X-axis: The horizontal axis, which is also known as the
category axis. It arranges data points horizontally and is
often used to show categories or time.
• Y-axis: The vertical axis, which is also known as the value
axis. It represents values.
• Origin: The point where the x- and y-axis intersect.

Labels
• Labels make it easier for users to understand data
visualizations by using text to reinforce visual concepts.
• Labels are traditionally used to label axes and legends.
However, they can also be used inside of data visualizations
to communicate categories, values, or annotations.
• Where possible, labels should be used instead of legends or
tooltips to make it easier for users to understand data
visualizations.
• When using data labels for categories, such as in a line
graph, you can omit using a legend as this is the preferred
method for representing categorical data.

Annotations
• Annotations can be used to help tell the story of a data
visualization. Before choosing to use annotations, make
sure they do not reduce a user’s ability to understand the
data inside of your data visualization.

Typography
• Labels in charts and graphs should be displayed in sentence case.
• Sentence case is a type of letter casing, like uppercase or lowercase, where only the first word and
proper nouns are capitalized.
Map Labels
• Depending on the type of map being used, labels may or may not be necessary. For example, a United
States-based thematic map may not need labels if the map is showing general trends.
LEGENDS
Legends should be placed below or parallel to a data visualization, depending on the type of data being
labeled and the available space surrounding the data visualization. If using sequential or diverging data,
ensure that the legend is displayed in a vertical format next to the visualization so that the data can be
properly ordered.
When using sequential data, ensure that the highest number appears first at the top of the legend and
the lowest number appears last at the bottom of the legend. For diverging data, a vertical legend is also
recommended, with the most extreme values at opposite ends of the legend.

Titles
• Axis titles provide helpful contextual information about the tick
marks of a given axis, such as the unit of measurement. When
axis titles are used in conjunction with titles and subtitles, users
will be able to more easily understand what a visualization is
about.

Tick Marks
Tick marks are used to indicate a reference value at a given
point in a chart.
Tick marks function similarly to the lines on a ruler: not all tick
marks need to be labeled, but they do need to establish a
continuous interval by ensuring the number of tick marks
between each labeled tick mark is always the same.
• When using tick marks, try not to label too many elements or
use too many marks along an axis.
• If tick marks appear cluttered, users won’t be able to
determine the value of various data points.
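
The ruler analogy can be sketched in a few lines of plain JavaScript. This is only an illustration: the helper names makeTicks and labelTicks, and the choice to label every second tick, are assumptions made for the example and do not come from any charting library.

```javascript
// Generate evenly spaced tick values between min and max (inclusive).
// `step` is the constant interval between neighbouring ticks.
function makeTicks(min, max, step) {
  const count = Math.round((max - min) / step);
  const ticks = [];
  for (let i = 0; i <= count; i++) {
    ticks.push(min + i * step);
  }
  return ticks;
}

// Label only every `labelEvery`-th tick so the axis stays uncluttered.
// Unlabeled ticks get an empty string.
function labelTicks(ticks, labelEvery) {
  return ticks.map((value, i) => ({
    value,
    label: i % labelEvery === 0 ? String(value) : "",
  }));
}

const ticks = makeTicks(0, 100, 10);
const labeled = labelTicks(ticks, 2);
console.log(labeled.filter((t) => t.label !== "").map((t) => t.label));
```

With these inputs there is always exactly one unlabeled tick between two labeled ones, which keeps the interval continuous.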

Comparing Visualizations
• When creating data visualizations that are designed to be
compared, always use consistent axes.
• This allows users to accurately compare the data across each chart
without having to consider the variances resulting from axes
with different start and end values.
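
One possible way to get consistent axes is to compute a single minimum and maximum across every dataset being compared and use that for all charts. The sketch below uses plain JavaScript; the two data series and the function name sharedDomain are hypothetical.

```javascript
// Two hypothetical series that will be drawn as separate charts.
const seriesA = [12, 30, 45];
const seriesB = [5, 80, 60];

// Compute one [min, max] pair covering every series,
// so that all charts start and end at the same axis values.
function sharedDomain(...series) {
  const all = series.flat();
  return [Math.min(...all), Math.max(...all)];
}

const domain = sharedDomain(seriesA, seriesB);
console.log(domain); // the shared domain is [5, 80]
```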

SCALES
You may have thought of a scale as something to weigh yourself with.
A scale, in this sense, is a leveled range of values and numbers from
lowest to highest that measures something at regular intervals. A great
example is the classic number line that has numbers lined up at
consistent intervals along a line.

SCALES:
• Designing chart scales is crucial to creating clear and effective
data visualizations. Several factors must be considered when
choosing the right scale for chart axes:
Understand the Data
• Before selecting a scale, it’s important to thoroughly
understand the data you’re working with.
• Identify the minimum and maximum values, the range of the
data, and any distinct patterns or trends.
• This will help you choose a scale that highlights the most
significant aspects of the data.

There are two main types of scales – linear and
logarithmic.
• A linear scale is much like the number line described above.
• The key to this type of scale is that the value between two consecutive points on the
line does not change no matter how high or low you are on it.
• For instance, on the number line, the distance between the numbers 0 and 1 is 1 unit.
The same distance of one unit is between the numbers 100 and 101, or -100 and -101.
• However you look at it, the distance between the points is constant (unchanging)
regardless of the location on the line.
• A great way to visualize this is by looking at one of those old school Intro to Geometry
or Intermediate Algebra examples of how to graph a line.
• One of the properties of a line is that it is the shortest distance between two points.
Another is that it has a constant slope. Having a constant slope means that the
ratio of the change in y to the change in x is the same between any two points on the line.

• Linear scale: A linear scale, also known as an arithmetic scale, represents values on an axis with equal
spacing between each tick mark.
• It is used when the data points have a consistent and uniform progression.
• A linear scale is suitable for representing data where the absolute numerical difference between values is
important, such as measuring quantities or displaying continuous data.
• A linear chart shows the same distance between the values on the y-axis. For example, a rise from 50 to 51
shows the same distance as from 100 to 101, even though the first one rises 2% and the latter rises only
1%.
• If your data has an even distribution and does not span multiple orders of magnitude, a linear scale with
single-division scale bars is generally appropriate. In this case, the intervals between values on the scale
bars are equal.
It allows for a straightforward interpretation of the data, as the differences between values are represented
proportionally.
Using single-division scale bars on a linear scale is particularly useful when the data values have a direct and
consistent relationship.
It ensures that the chart accurately represents the magnitude of the data points and enables viewers to
make precise comparisons between them.
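
The 50-to-51 versus 100-to-101 comparison can be checked with a small plain-JavaScript sketch. The function name linear and the chosen domain and pixel range are assumptions for illustration only.

```javascript
// Map a value from the data interval [d0, d1] to the pixel
// interval [r0, r1] with equal spacing everywhere.
function linear(value, [d0, d1], [r0, r1]) {
  return r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

const domain = [0, 200]; // data values (assumed)
const range = [0, 400];  // pixels (assumed)

// Drawn distance for a rise of one unit at two different heights.
const stepLow = linear(51, domain, range) - linear(50, domain, range);
const stepHigh = linear(101, domain, range) - linear(100, domain, range);

// The relative change differs even though the drawn distance does not.
const pctLow = (100 * (51 - 50)) / 50;     // percent rise from 50 to 51
const pctHigh = (100 * (101 - 100)) / 100; // percent rise from 100 to 101
console.log(stepLow, stepHigh, pctLow, pctHigh);
```

Both steps cover the same number of pixels, while the percentage rises are 2 and 1 respectively.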

Choose an Appropriate Scale Type
Linear scales have equal intervals between values, while
logarithmic scales have intervals that increase exponentially.
Generally, linear scales are the most suitable if your data has
an even distribution.
However, if your data spans multiple orders of magnitude, a
logarithmic scale may be more appropriate to emphasize
proportional differences.
There are several types of chart scales used in data
visualization, each serving a specific purpose.
Here are some common types of chart scales:

Logarithmic scale: A logarithmic scale represents values on an axis based on
logarithmic transformations. It compresses a wide range of values into a more
compact representation. Logarithmic scales are useful when dealing with data that
spans several orders of magnitude. They help in visualizing exponential or rapidly
changing data, such as stock prices, population growth, or earthquake magnitudes.
A common example is a scale on which each tick mark represents a power of ten (10, 100, 1000).
Normalized scale: A normalized scale is a scale that has been changed to a standard
range, usually between 0 and 1, using a process called normalization. Normalization
is a popular technique for preparing data in machine learning and data science.
A normalized scale is a method of representing data where values are adjusted or
scaled to a common baseline or reference point. The purpose of a normalized scale is
to allow for easier comparison and analysis of data points relative to each other,
particularly when dealing with variables that have different units, ranges, or
magnitudes.
Proportional scale: A proportional scale represents values based on proportional
relationships or ratios. It is often used in pie charts, where each segment’s size
represents a proportion of the whole. Proportional scales are effective in visualizing
parts-to-whole relationships or percentages.
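
As a rough illustration of the logarithmic scale described above, the following plain-JavaScript sketch maps powers of ten to pixel positions. The function name logScale and the domain and range values are assumptions; the approach only works for strictly positive values.

```javascript
// Map a positive value onto a pixel range using base-10 logarithms.
function logScale(value, [d0, d1], [r0, r1]) {
  const t =
    (Math.log10(value) - Math.log10(d0)) /
    (Math.log10(d1) - Math.log10(d0));
  return r0 + t * (r1 - r0);
}

const domain = [10, 1000]; // must be strictly positive
const range = [0, 300];    // pixels (assumed)

// Each power of ten lands the same distance from the previous one.
const positions = [10, 100, 1000].map((v) => logScale(v, domain, range));
console.log(positions);
```

The values 10, 100, and 1000 end up evenly spaced, which is how a logarithmic axis compresses a wide range of values.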

•Ordinal Scale: Ranks data without defining the exact differences between categories. Often used for categorica
•Nominal Scale: Used for categorical data without any inherent order. Each category is distinct and represented
•Time Scale: Specifically designed to represent temporal data, where the spacing between points is based on tim
•Color Scale: Maps data values to colors, which can enhance the perception of data patterns. Color scales can
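
A color scale can be sketched as a linear blend between two colors. The example below is plain JavaScript; the function names and the two endpoint colors are assumptions chosen only for illustration.

```javascript
// Linearly blend two RGB colors; t = 0 gives `from`, t = 1 gives `to`.
function blend(from, to, t) {
  return from.map((c, i) => Math.round(c + (to[i] - c) * t));
}

// Map a data value in [min, max] to a color between light and dark blue.
function colorScale(value, min, max) {
  const light = [222, 235, 247]; // assumed endpoint colors
  const dark = [8, 48, 107];
  const t = (value - min) / (max - min);
  const [r, g, b] = blend(light, dark, t);
  return `rgb(${r}, ${g}, ${b})`;
}

console.log(colorScale(0, 0, 100), colorScale(100, 0, 100));
```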

• A nominal scale is the 1st level of measurement scale, in which the numbers
serve as “tags” or “labels” to classify or identify the objects. A nominal scale
usually deals with non-numeric variables or with numbers that do not
have any quantitative value.
• Characteristics of Nominal Scale
• A nominal scale variable is classified into two or more categories. In this
measurement mechanism, the answer should fall into either of the
classes.
• It is qualitative. The numbers are used here to identify the objects.
• The numbers don’t define the object characteristics. The only permissible
aspect of numbers in the nominal scale is “counting.”
• Example:

• An example of a nominal scale measurement is given below:
• What is your gender?
• M- Male
• F- Female
• Here, the variables are used as tags, and the answer to this
question should be either M or F.
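
Because counting is the only permissible operation on nominal tags, a frequency count and the mode are the natural summaries. The following plain-JavaScript sketch uses made-up survey answers.

```javascript
// Hypothetical survey answers recorded with nominal tags.
const answers = ["M", "F", "F", "M", "F"];

// Counting is the only arithmetic that makes sense for nominal tags.
const counts = {};
for (const tag of answers) {
  counts[tag] = (counts[tag] || 0) + 1;
}

// The mode is the most frequent category.
const mode = Object.keys(counts).reduce((a, b) =>
  counts[a] >= counts[b] ? a : b
);
console.log(counts, mode);
```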

• Ordinal Scale
• The ordinal scale is the 2nd level of measurement. It reports the ordering and ranking of data without
establishing the degree of variation between them. Ordinal represents the “order.” Ordinal data is known as
qualitative data or categorical data. It can be grouped, named and also ranked.
• Characteristics of the Ordinal Scale
• The ordinal scale shows the relative ranking of the variables
• It identifies and describes the magnitude of a variable
• Along with the information provided by the nominal scale, ordinal scales give the rankings of those variables
• The interval properties are not known
• The surveyors can quickly analyse the degree of agreement concerning the identified order of variables
• Example:
• Ranking of school students – 1st, 2nd, 3rd, etc.
• Ratings in restaurants
• Evaluating the frequency of occurrences
• Very often
• Often
• Not often
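
Ordinal categories can be sorted and ranked, but the distance between ranks is unknown. A minimal plain-JavaScript sketch, using invented responses to the frequency question above:

```javascript
// The categories in their agreed order, lowest frequency first.
const order = ["Not often", "Often", "Very often"];

// Hypothetical responses.
const responses = ["Often", "Very often", "Not often", "Often", "Very often"];

// Convert each response to its rank, then sort by rank.
const rank = (label) => order.indexOf(label);
const sorted = [...responses].sort((a, b) => rank(a) - rank(b));

// The median is meaningful for ordinal data; the mean is not,
// because the distance between neighbouring ranks is unknown.
const median = sorted[Math.floor(sorted.length / 2)];
console.log(sorted, median);
```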

• Ratio Scale
• The ratio scale is the 4th level of measurement scale, which is quantitative. It is a type of variable
measurement scale. It allows researchers to compare the differences or intervals. The unique feature
of the ratio scale is that it has a true origin, or zero point.
• Characteristics of Ratio Scale:
• Ratio scale has a feature of absolute zero
• It doesn’t have negative numbers, because of its zero-point feature
• It affords unique opportunities for statistical analysis. The variables can be orderly added, subtracted,
multiplied, divided. Mean, median, and mode can be calculated using the ratio scale.
• Ratio scale has unique and useful properties. One such feature is that it allows unit conversions, such as
kilograms to grams.
• Example:
• An example of a ratio scale is:
• What is your weight in Kgs?
• Less than 55 kgs
• 55 – 75 kgs

• Suppose a fruit stand sells 100 more apples each month than the month before,
for example 100, 200, 300, 400, and then 500 apples. Business is booming.
To showcase this success, you want to make a bar chart illustrating the steep
upward climb of apple sales, with each data value corresponding to the
height of one bar.
• Until now, we’ve used data values directly as display values, ignoring unit
differences. So if 500 apples were sold, the corresponding bar would be 500
pixels tall.
• That could work, but what about next month, when 600 apples are sold? And
a year later, when 1,800 apples are sold? Your audience would have to
purchase ever-larger displays just to be able to see the full height of those
very tall apple bars! (Mmm, apple bars!)
• This is where scales come in. Because apples are not pixels (which are also
not oranges), we need scales to translate between them.

• Domains and Ranges
• A scale’s input domain is the range of possible input data values.
Given the preceding apples data, appropriate input domains
would be either 100 and 500 (the minimum and maximum values
of the dataset) or 0 and 500.
• A scale’s output range is the range of possible output values,
commonly used as display values in pixel units. The output range
is completely up to you, as the information designer. If you decide
the shortest apple bar will be 10 pixels tall, and the tallest will be
350 pixels tall, then you could set an output range of 10 and 350.
• For example, create a scale with an input domain of [100, 500] and an
output range of [10, 350].
• If you handed the low input value of 100 to that scale, it
would return its lowest range value, 10.
• If you gave it 500, it would spit back 350. If you gave
it 300, it would hand 180 back to you on a silver platter.
(300 is in the center of the domain, and 180 is in the
center of the range.)
• We can visualize the domain and range as corresponding
axes, displayed side by side.

• If you’re familiar with the concept of normalization, it might
be helpful to know that, with a linear scale, that’s all that is
really going on here.
• Normalization is the process of mapping a numeric value to
a new value between 0 and 1, based on the possible
minimum and maximum values. For example, with 365 days
in the year, day number 310 maps to about 0.85, or 85
percent of the way through the year.
• With linear scales, we are just letting D3 handle the math of
the normalization process. The input value is normalized
according to the domain, and then the normalized value is
scaled to the output range.
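
The two steps just described can be written out by hand. This plain-JavaScript sketch is not D3's implementation; the function names normalize and toRange are made up, and day numbers are assumed to be counted from 0.

```javascript
// Step 1: normalize a value to the 0..1 interval using the domain.
function normalize(value, min, max) {
  return (value - min) / (max - min);
}

// Step 2: stretch the normalized value onto the output range.
function toRange(t, low, high) {
  return low + t * (high - low);
}

// Day 310 of a 365-day year (assumed to count from 0).
const yearFraction = normalize(310, 0, 365);

// 300 apples, input domain [100, 500], output range [10, 350] pixels.
const barHeight = toRange(normalize(300, 100, 500), 10, 350);
console.log(yearFraction.toFixed(2), barHeight);
```

Day 310 normalizes to about 0.85, and 300 apples normalizes to 0.5, which lands on 180 in the output range.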

Creating a Scale
D3’s scale function generators are accessed with d3.scale followed by the type of scale you want.
(This is the D3 version 3 syntax; in version 4 and later the equivalent call is d3.scaleLinear().)
var scale = d3.scale.linear();
Here scale is a function to which you can pass input values.

• Because we haven’t set a domain and a range yet, this function will
map input to output on a 1:1 scale. That is, whatever we input will be
returned unchanged.
• We can set the scale’s input domain to [100, 500] by passing those
values to the domain() method as an array. Note the square brackets
indicating an array:

scale.domain([100, 500]);
Set the output range in similar fashion, with range():
scale.range([10, 350]);
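
To see what such a scale does without loading D3, here is a plain-JavaScript stand-in that imitates the chainable domain() and range() style. It is a simplified sketch, not D3's code: makeLinearScale is a hypothetical name, and unlike D3 the setters here cannot be called without arguments to read the current values.

```javascript
// Build a scale function with chainable domain() and range() setters.
// Defaults of [0, 1] for both give the 1:1 mapping described above.
function makeLinearScale() {
  let domain = [0, 1];
  let range = [0, 1];

  function scale(value) {
    const t = (value - domain[0]) / (domain[1] - domain[0]);
    return range[0] + t * (range[1] - range[0]);
  }
  scale.domain = function (d) { domain = d; return scale; };
  scale.range = function (r) { range = r; return scale; };
  return scale;
}

const scale = makeLinearScale();
const unchanged = scale(2.5); // no domain or range set yet

scale.domain([100, 500]).range([10, 350]);
const heights = [100, 300, 500].map((apples) => scale(apples));
console.log(unchanged, heights);
```

Before the domain and range are set, the input 2.5 comes back unchanged; afterwards 100, 300, and 500 apples map to 10, 180, and 350 pixels.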

What Is a Time-Series Plot?
• A time-series plot, also known as a time plot, is a type of graph that
displays data points collected in a time sequence. In a time-series plot, the
x-axis represents the time, and the y-axis represents the variable being
measured. We use time plots in many fields, such as economics, finance,
engineering, and meteorology, to visualize and analyze changes over time.
Examples of time-series plots
• Time-series plots allow you to see trends and patterns in data that might
not be visible in other types of graphs. For instance, you can see how a
particular variable changes over months, seasons, years, or even decades.
This way, you can identify seasonal fluctuations, long-term trends, and
cyclic patterns in data.
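
Placing observations along the time axis amounts to mapping dates to horizontal positions. The sketch below does this in plain JavaScript; the function name timeScale, the three observations, and the 730-pixel width are all hypothetical.

```javascript
// Map a Date onto a horizontal pixel position.
function timeScale(date, [start, end], [x0, x1]) {
  const t = (date - start) / (end - start); // Dates subtract as milliseconds
  return x0 + t * (x1 - x0);
}

// Hypothetical observations; months are 0-based in Date.UTC.
const points = [
  { date: new Date(Date.UTC(2023, 0, 1)), value: 120 },
  { date: new Date(Date.UTC(2023, 6, 1)), value: 310 },
  { date: new Date(Date.UTC(2024, 0, 1)), value: 150 },
];

// The domain runs from the first to the last observation.
const domain = [points[0].date, points[points.length - 1].date];
const xs = points.map((p) => timeScale(p.date, domain, [0, 730]));
console.log(xs);
```

Because the spacing is based on elapsed time, the July point does not sit exactly in the middle: 181 of the 365 days have passed by then.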

Website traffic over time
Let's say you want to analyze the traffic to your website over
the past year. You can create a time-series plot of the daily or
monthly website visits, with the x-axis representing time and
the y-axis representing the number of visits. Here's an
example time plot of the monthly website traffic to a fictional
website over the past year:

• The data shows a higher traffic pattern during the year's middle
months, with the peak in traffic occurring in August. The traffic then
gradually declines towards the end of the year, with a sharp
decrease in traffic during the last two months. The graph also shows
some fluctuations in traffic, but overall the trend is a seasonal
pattern of higher traffic during the summer months.
• Let's say you want to visualize the stock prices of a particular
company over the past year. You can create a time-series plot of the
daily closing prices of the stock, with the x-axis representing time (in
days) and the y-axis representing the stock's closing price. Here's an
example time-series plot of the daily closing prices of Apple Inc.
stock (AAPL) over the past year:

Categorical Scale
•Definition: Maps discrete categories to visual properties.
•Usage: Used for qualitative data where values are distinct and non-numeric (e.g., names, colors).
•Example: A bar chart where each bar represents a different category (e.g., different species of plants).
• Ordinal Scale
• Definition: Similar to categorical scales but includes an order.
• Usage: Useful for data with a clear ranking (e.g., survey ratings like “poor,” “average,” “good”).
• Example: A scale that assigns positions based on rank (1st, 2nd, 3rd).
• Time Scale
• Definition: Specializes in mapping dates and times.
• Usage: Often used in time series data to show trends over time.
• Example: A timeline where each point corresponds to a specific date.
• When creating scales, you typically define:
• Domain: The input data range (e.g., the minimum and maximum values).
• Range: The output range in the visualization (e.g., pixel positions).
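
A categorical scale can be sketched as a lookup that gives each category its own evenly spaced slot. The plain-JavaScript example below uses a hypothetical helper named categoricalScale and made-up plant names.

```javascript
// Give each category an evenly spaced slot along [r0, r1].
function categoricalScale(categories, [r0, r1]) {
  const step = (r1 - r0) / categories.length;
  const positions = new Map();
  categories.forEach((name, i) => {
    // Use the centre of each slot as the category's position.
    positions.set(name, r0 + step * (i + 0.5));
  });
  return (name) => positions.get(name);
}

const species = ["Fern", "Moss", "Cactus"]; // hypothetical categories
const x = categoricalScale(species, [0, 300]);
console.log(species.map((s) => x(s)));
```

Here the domain is the list of category names and the range is the pixel interval from 0 to 300. For an ordinal scale the same idea applies, with the categories listed in rank order.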

Range
• Definition: The range refers to the output values corresponding to the
input values defined by the domain. It determines how the input data is
represented visually, usually in terms of pixel values or positions on a
graph.
• Characteristics of Range:
• Visual Representation: The range translates data values into positions
or sizes in the visualization.
• Mapping: Each input value from the domain is mapped to a specific
output value in the range.

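• Example: a line chart that visualizes the relationship between temperature (°C)
and ice cream sales (units sold).
• Data
•Domain:
•Here the domain represents the temperature values, the input of the relationship.
•Domain: [0, 40] (the range of temperatures in the dataset).
•Range:
•The range corresponds to the ice cream sales values, the output of the relationship.
•Range: [10, 600] (the range of ice cream sales corresponding to
the temperatures).
•Note that these are the domain and range of the plotted relationship. When the chart is
drawn, each axis needs its own scale: the temperature values [0, 40] and the sales values
[10, 600] each serve as an input domain that is mapped to an output range of pixel positions.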
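
To draw this chart, each axis can be given its own linear scale that turns data values into pixel positions. The following plain-JavaScript sketch assumes a 400 by 300 pixel chart and one invented observation; none of these numbers come from a real dataset.

```javascript
// Map a value from [d0, d1] to [r0, r1] with equal spacing.
function linear(value, [d0, d1], [r0, r1]) {
  return r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

const width = 400;  // assumed chart size in pixels
const height = 300;

// Temperature runs left to right.
const x = (tempC) => linear(tempC, [0, 40], [0, width]);
// Sales run bottom to top, so the pixel range is reversed:
// screen y-coordinates grow downward.
const y = (units) => linear(units, [10, 600], [height, 0]);

// A hypothetical observation: 20 °C and 305 units sold.
const point = { x: x(20), y: y(305) };
console.log(point);
```

The observation sits at the middle of both axes, at pixel position (200, 150).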

• Example: If you’re visualizing the number of COVID cases
over time, a linear scale is appropriate for showing
the absolute number of daily cases. A logarithmic
scale is the better choice if you want to emphasize
proportional changes, that is, how quickly cases have
grown or declined over time.
• Categorical Scales: Used for discrete categories (e.g., colors for
different species). Each category gets a specific position on the axis.
