Machine Learning for Visualization

Let’s Explore the Cutest Big Dataset

12 min read · Sep 28, 2018

This is a transcript of the talk I gave at OpenVisConf 2018:

Data visualization is about exposing patterns to the eye.

We are always seeking ways to tap into deeper patterns.

Patterns that feel distinctly human


Patterns we humans can recognize but can’t articulate to a computer


And patterns we didn’t even think to look for

[Image: computer, show me only cute owls]

When exploring a new dataset we have various tools in our analysis and visualization toolkit. These tools include things like averages and summary statistics, line charts and histograms, as well as an ever-expanding catalog of custom visualizations.

[Image: some bl.ocks made with d3.js and other tools]

Now I want to direct your attention to a relatively new set of tools that can change the way we explore large datasets.

[Image: t-SNE all the things!]

These tools use Machine Learning to pull out patterns for us and give us new ways to navigate our data.

I’d like to demonstrate these techniques on my favorite dataset, Quick, Draw!

[Image: Quick, Draw!]

If you haven’t had a chance to play the game, the rules of Quick, Draw! are pretty simple. The game asks you to draw a word, and you try to get an AI to guess the word from your drawing.


When the Google Creative Lab built Quick, Draw!, they had the foresight to save anonymized copies of the drawings, altering the course of my life forever. At this point millions of people across the globe have played and Google has open sourced 50 million of the drawings they created. This means we have on average more than 100,000 drawings for each of the 300 words in the game to explore.

[Image: some of the 300+ words in the dataset]

The Data

Let’s take a close look at the data, what it is and isn’t.

[Image: what, when and where]

This dataset makes for a great demonstration because it’s so fun, but it is also representative of many serious datasets. It has categorical data, such as which word is being drawn and which country the drawing originated from. We also have a few time-related dimensions, such as how long the drawing took to draw (its duration) and a timestamp of when the drawing was made.
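To make this concrete, here is a sketch of what one record might look like and how we could pull out those dimensions in Python. The field names follow the simplified-data format Google published for Quick, Draw! (a word, a country code, a timestamp, and a drawing as a list of strokes, each stroke a pair of x and y coordinate lists); treat the exact names as an assumption and check the dataset docs for the authoritative schema.

```python
import json

# One hand-written record in the shape of the published Quick, Draw!
# simplified-data format -- field names are an assumption, not a spec.
record = json.loads("""
{
  "word": "cat",
  "countrycode": "US",
  "timestamp": "2017-03-08 21:12:07",
  "drawing": [[[10, 40, 90], [50, 10, 50]],
              [[30, 30], [30, 40]]]
}
""")

# Categorical dimensions
print(record["word"], record["countrycode"])  # cat US

# The strokes themselves: each stroke is ([x0, x1, ...], [y0, y1, ...])
n_strokes = len(record["drawing"])
n_points = sum(len(xs) for xs, ys in record["drawing"])
print(n_strokes, n_points)  # 2 5
```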

[Image: how]

We also have the sequence of points that make up the drawing. It is this sequence of pen strokes that carries most of the meaning in this dataset: the strokes capture the way we as humans represent abstract concepts across the globe.

They are also the most difficult to dissect with traditional data visualization techniques.

Data Visualization

Just because something is difficult doesn’t mean it can’t be done. Since the dataset was released, several amazing projects have applied various techniques to surface interesting patterns in the data.

How Long Does it Take to (Quick) Draw a Dog? by Jim Vallandingham

[Image: breaking down the data by complexity]

This project explores questions around complexity and quality by utilizing stroke counts and drawing durations. Some quite interesting observations can be surfaced by interactively browsing the summary statistics of these attributes.

[Image: On average ducks take longer to draw than flamingos. also, owls are always cute]

Note that the dimensions visualized here are the count of strokes and the duration of the drawing, both of which reduce the sequence of strokes to a single number. These numbers give us a manageable way to peek into the data, but can’t capture all of the features of the drawings by themselves.

How Do You Draw a Circle? by Nikhil Sonnad

[Image: We can look not just at what we draw, but how we draw]

This article takes a deep dive into simple shapes: the authors determine whether each circle is drawn clockwise or counterclockwise, allowing them to count this feature across the dataset.
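One standard way to compute that direction (not necessarily the article’s actual method) is the signed area of the stroke via the shoelace formula: the sign of the area tells you the winding order. A minimal sketch:

```python
def signed_area(xs, ys):
    """Shoelace formula: positive when the points are in
    counterclockwise order (in math coordinates, y pointing up)."""
    n = len(xs)
    return sum(xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i]
               for i in range(n)) / 2.0

def is_clockwise(xs, ys):
    # Note: in screen coordinates (y grows downward) the sign flips;
    # here we assume math coordinates.
    return signed_area(xs, ys) < 0

# A rough "circle" sampled counterclockwise: right, top, left, bottom
xs, ys = [1, 0, -1, 0], [0, 1, 0, -1]
print(is_clockwise(xs, ys))              # False
print(is_clockwise(xs[::-1], ys[::-1]))  # True (same points, reversed)
```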

Highlighting cultural phenomena

They can then visualize this feature to communicate some understanding about cultural differences across the world.

“There are countless ways that we subtly, unconsciously carry our cultures with us: the way we draw, count on our fingers, and imitate real-world sounds, to name a few. That’s the delight at the heart of this massive dataset.”

Forma Fluens by Visual AI Lab @ IBM Research

[Image: A is for Average]

This project takes a number of interesting approaches to visualize the data. One in particular is the use of visual averages to highlight cultural patterns.

[Image: oh crap I forgot my converter]

Visual averages work by drawing thousands of faint transparent drawings on top of each other, surfacing the dominant patterns. This works especially well when we filter drawings by country where cultural patterns can emerge.
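The same effect can be approximated by rasterizing each drawing onto a small grid and summing: cells touched by many drawings accumulate more weight and would render darker. A toy sketch (points only, no line interpolation; a real implementation would render full strokes with low alpha):

```python
def average_image(drawings, size=8):
    """Accumulate point counts from many drawings into one grid.
    drawings: list of strokes, each stroke ([xs], [ys]) with
    coordinates normalized to [0, 1). Returns a size x size grid
    of counts -- the 'visual average'."""
    grid = [[0] * size for _ in range(size)]
    for drawing in drawings:
        for xs, ys in drawing:
            for x, y in zip(xs, ys):
                grid[int(y * size)][int(x * size)] += 1
    return grid

# Two toy "drawings" that share a point at (0.5, 0.5)
d1 = [([0.1, 0.5], [0.1, 0.5])]
d2 = [([0.9, 0.5], [0.9, 0.5])]
avg = average_image([d1, d2], size=4)
# The shared cell accumulates twice the weight of the others
print(avg[2][2])  # 2
```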

Visual Averages by Country by Kyle McDonald

Kyle McDonald took the concept of visual averages to an extreme in this epic tweetstorm.

[Image: where is the soft-serve?]

He makes excellent use of small multiples to compare patterns across several categories.

[Image: finding nemo]

These certainly give us some interesting points to reflect on, but it’s hard to dig deeper since all of the nuanced patterns get washed out by the averaging.

[Image: we can’t really see anything when we average yoga. countries: USA, Korea, Germany, Brazil]

So what if we had a way to capture the nuance lost by averages, to automatically find the interesting features in the strokes and dissect the data by more than one dimension at a time?

Machine Learning

Enter Deep Neural Networks. These aren’t magic, but they do have some amazing capabilities, and as it turns out, we have just such a network trained on the Quick, Draw! dataset. It’s called sketch-rnn.

[Image: You can play with this yourself over at the sketch-rnn demo page]

While it’s super fun to play with the network and come up with creative applications for a drawing machine, what’s exciting for us as data visualizers is the patterns it’s had to encode in order to generate its drawings.

So how do we get at these patterns?

One way to do it is to ask the network how probable it thinks a given input drawing is, as Colin Morris did in his Bad Flamingos article.

[Image: The Treachery of Machine Learning. Flamingos? ¯\_(ツ)_/¯]

On top we see flamingos the network thinks are highly likely, and on the bottom are flamingos it thinks are very unlikely. This gives us an interesting lens to look at the data through, but it still reduces all of the data to a single dimension. That’s a problem because some of the most interesting depictions of flamingos are mixed in with drawings that are clearly not flamingos.

[Image: what if we wanted to find bad-ass flamingos?]

We would like a broader view of the data, and we can get one once we understand a little bit more about how the network operates. Sketch-rnn belongs to a family of neural networks called auto-encoders, which find ways to “compress” input data into a smaller representation that can then be used to generate new output.

[Image: The encoder takes in a drawing and compresses it into a latent vector]

The network is composed of two parts, an encoder network that tries to find a way to represent the data in much fewer dimensions than the input, and a decoder network that tries to accurately reconstruct the original data using only the encoded data.

[Image: The decoder takes the latent vector as input and outputs a new (very similar) drawing]

We call the encoded data a latent vector, and it’s the key to unlocking our technique.
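As a toy stand-in for sketch-rnn’s encoder and decoder (which are recurrent networks, nothing like this), here is the simplest possible auto-encoder: a single linear bottleneck with tied weights, trained by gradient descent to compress 3-D points down to one number and back. It shows the shape of the idea, not the real architecture:

```python
import random

random.seed(0)

# Toy data: 3-D points that secretly live on a 1-D line
direction = (0.6, 0.48, 0.64)
data = [tuple(t * d for d in direction) for t in (-2.0, -1.0, 1.0, 2.0)]

# Encoder: z = w . x    Decoder: x_hat = z * w   (tied weights)
w = [random.uniform(-0.5, 0.5) for _ in range(3)]

def reconstruction_error(w):
    err = 0.0
    for x in data:
        z = sum(wi * xi for wi, xi in zip(w, x))   # encode
        x_hat = [z * wi for wi in w]               # decode
        err += sum((xi - xh) ** 2 for xi, xh in zip(x, x_hat))
    return err

before = reconstruction_error(w)
lr = 0.01
for _ in range(300):
    for x in data:
        z = sum(wi * xi for wi, xi in zip(w, x))
        r = [xi - z * wi for xi, wi in zip(x, w)]  # residual
        rw = sum(ri * wi for ri, wi in zip(r, w))
        # Gradient step on ||x - z*w||^2 with respect to w
        w = [wi + lr * 2 * (xj * rw + z * rj)
             for wi, xj, rj in zip(w, x, r)]

after = reconstruction_error(w)
print(after < before)  # True: training improved the reconstruction
```

After training, `w` aligns with the line the data lives on, so a single number per point suffices to reconstruct it, exactly the job the latent vector does for drawings.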

Latent Vectors

We can extract the latent vector for each drawing from the network, which gives us a way to compare the drawings numerically.

[Image: similar faces have similar latent vectors]

When we compare them we see that similar latent vectors mean similar drawings. In our network, a latent vector is 128 numbers, which is still kind of a lot to deal with. So we need a way to compare a lot of high-dimensional data points with each other.
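Comparing drawings then reduces to comparing their vectors, for example with Euclidean distance. A sketch with made-up 4-number vectors standing in for the real 128-number ones:

```python
import math

def distance(a, b):
    """Euclidean distance between two latent vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

# Hypothetical latent vectors (the real ones have 128 entries)
smiley_a = [0.9, 0.1, 0.3, 0.0]
smiley_b = [0.8, 0.2, 0.3, 0.1]   # a similar drawing
frowny   = [0.1, 0.9, 0.7, 0.8]   # a very different drawing

print(distance(smiley_a, smiley_b) < distance(smiley_a, frowny))  # True
```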

[Image: A map of all our faces]

Luckily there is a wonderful algorithm called t-SNE which is very helpful for visualizing similarities in high-dimensional data. It isn’t a silver bullet, but it offers us a very interesting way to explore our data. Here each drawing is represented by a small translucent yellow dot; the algorithm places similar drawings close to each other to create this two-dimensional map.
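With scikit-learn, running t-SNE on a batch of latent vectors takes a few lines. Here random vectors from two separated clusters stand in for real latents; note that `perplexity` must be smaller than the number of samples:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Stand-in for latent vectors: 60 points in 128 dimensions,
# drawn from two well-separated clusters
latents = np.concatenate([
    rng.normal(0.0, 0.1, size=(30, 128)),
    rng.normal(1.0, 0.1, size=(30, 128)),
])

# Embed into 2-D; t-SNE places similar vectors near each other
xy = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(latents)
print(xy.shape)  # (60, 2)
```

Each row of `xy` is a dot on the map; plotting them with low opacity reproduces the view above.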


We can zoom in on a small piece of this map, and see a group of similar drawings.


To our human eyes the patterns here are fairly clear, namely the eyes and smile.


Let’s look at an entirely different cluster.


We can see that this cluster highlights a pretty different pattern, which is kind of sad.

[Image: the goofy dimension]

Let’s get back to the idea of studying the complexity of drawings. Instead of using proxies for complexity like number of strokes or duration of drawing we can examine complexity directly.

[Image: cat map]

Here is one way to draw a simple cat; all of these have a single stroke:


Here is another way to capture the essence of catness with simplicity, though this time with 3 strokes. In both cases you could put these in front of a young child and they would say that’s a cat!


Now we depart from metrics and get into the humanities:


Here are some approximately equally complex cats, but clearly we are looking at whiskers and not smiles:


We don’t need to stop there, we can have it all!


So now we are navigating through a space that is much richer than a single dimension.

Let’s revisit the problem of averaging a concept like yoga.

[Image: yoga map]

The problem with averages is that they assume a normal distribution with a single mode. What we will see is that there are several modes when representing yoga, starting with the different poses.
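A one-line illustration of why the mean misleads for multi-modal data: average two distinct clusters (think: two different yoga poses encoded as numbers) and you land on a point that looks like neither.

```python
# Two clusters of made-up 1-D "drawing encodings"
pose_a = [0.9, 1.0, 1.1]
pose_b = [4.9, 5.0, 5.1]
both = pose_a + pose_b

mean = sum(both) / len(both)
print(round(mean, 6))  # 3.0 -- between the clusters

# No actual drawing is anywhere near the mean:
print(round(min(abs(x - mean) for x in both), 6))  # 1.9
```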

[Image: don’t forget to breathe]

The way people draw even a single pose:


And the way people give up:


At this point I want to pause and step back from our particular dataset of drawings and make sure we’re clear on the two things going on here.

The first is that t-SNE is a general technique for visualizing high-dimensional data.

[Image: Visualizing Data using t-SNE by Laurens van der Maaten]

The second thing is that Neural Networks can work on all kinds of data. In this figure, taken from Chris Olah’s amazing article Visualizing Representations: Deep Learning and Human Beings, paragraph vectors are visualized with t-SNE to surface topics in Wikipedia articles.


So using Neural Networks to find patterns and t-SNE to visualize them is a good idea in general.


A more mathematical term for the high-dimensional landscape created by our neural network’s internal representation is the “latent space”. We can think of t-SNE as helping us draw a map of this space.


Much like a 2D map can never truly represent our 3D globe, a 2D t-SNE map won’t be able to show us everything that’s happening in higher dimensions.


But it can still be a very helpful way to navigate as we explore.


Here we’ve sampled a single drawing from each grid cell and opacity indicates the number of drawings in that cell.
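A grid view like this can be produced with a simple binning pass: hash each embedded point into a cell, keep one representative per cell, and count the rest to drive the opacity. A minimal sketch over (x, y) points from a t-SNE map:

```python
from collections import defaultdict

def grid_sample(points, cell=1.0):
    """Bin 2-D points into square cells of side `cell`.
    Returns {cell_index: (representative_point, count)} -- the
    representative gets drawn, the count sets the cell's opacity."""
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell), int(y // cell))].append((x, y))
    return {c: (pts[0], len(pts)) for c, pts in cells.items()}

points = [(0.2, 0.3), (0.7, 0.9), (2.5, 2.5)]
cells = grid_sample(points)
print(len(cells))        # 2 occupied cells
print(cells[(0, 0)][1])  # 2 drawings share the first cell
```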

[Image: simpler smiley faces]

[Image: faces with long hair, short hair or no top]

Let’s briefly bring back the idea of averaging by country codes.


With this view we can instead filter down by country code. We can take a quick look at Japanese power outlets:


If we zoom in we see the predominant representation of a “Type A” outlet with two vertical holes. Unlike in the averages, we also get to see some fun outliers from players who seem to have understood “power lifting” instead of “power outlet.”

[Image: left: standard “type A” plugs, right: power lifting]

Let’s look at another word, octopus, and revisit the idea of complexity.


We can filter our map to only octopi with 1 stroke, and sample from those areas. It’s probably easy to imagine drawing an octopus in 1 go like these.


If we instead filter to all of the complicated octopi, defined by having more than 14 strokes:


We find a really fun cluster:


Conclusion

The different ways in which people draw are like different notes, the harmonics of a word, and the clusters we’ve explored are the result of thousands of strangers harmonizing together.

[Image: namaste]

Thanks

[Image: tiger drawings randomly assigned to thankees]

Written by Ian Johnson

pixel flipper. Data Vis Developer @ObservableHQ. formerly @Google @lever