Commit 3074b97: Merge commit for internal changes
2 parents: 83faf48 + 2261e5a

2 files changed: +244 lines, -0 lines

get_started/leftnav_files (1 addition, 0 deletions)

```diff
@@ -9,3 +9,4 @@ monitors.md
 summaries_and_tensorboard.md
 embedding_viz.md
 graph_viz.md
+tensorboard_histograms.md
```

get_started/tensorboard_histograms.md (243 additions, 0 deletions)

# TensorBoard Histogram Dashboard

The TensorBoard Histogram Dashboard displays how the distribution of some
`Tensor` in your TensorFlow graph has changed over time. It does this by showing
many histogram visualizations of your tensor at different points in time.

## A Basic Example

Let's start with a simple case: a normally-distributed variable, where the mean
shifts over time.
TensorFlow has an op
[`tf.random_normal`](https://www.tensorflow.org/api_docs/python/tf/random_normal)
which is perfect for this purpose. As is usually the case with TensorBoard, we
will ingest data using a summary op; in this case,
[`tf.summary.histogram`](https://www.tensorflow.org/api_docs/python/tf/summary/histogram).
For a primer on how summaries work, please see the general
[TensorBoard tutorial](https://www.tensorflow.org/get_started/summaries_and_tensorboard).

Here is a code snippet that will generate some histogram summaries containing
normally distributed data, where the mean of the distribution increases over
time.

```python
import tensorflow as tf

k = tf.placeholder(tf.float32)

# Make a normal distribution, with a shifting mean
mean_moving_normal = tf.random_normal(shape=[1000], mean=(5*k), stddev=1)
# Record that distribution into a histogram summary
tf.summary.histogram("normal/moving_mean", mean_moving_normal)

# Merge all the summary ops into one op we can run in the loop below
summaries = tf.summary.merge_all()

# Set up a session and summary writer
sess = tf.Session()
writer = tf.summary.FileWriter("/tmp/histogram_example")

# Set up a loop and write the summaries to disk
N = 400
for step in range(N):
    k_val = step/float(N)
    summ = sess.run(summaries, feed_dict={k: k_val})
    writer.add_summary(summ, global_step=step)
```

Once that code runs, we can load the data into TensorBoard via the command line:

```sh
tensorboard --logdir=/tmp/histogram_example
```

Once TensorBoard is running, load it in Chrome or Firefox and navigate to the
Histogram Dashboard. Then we can see a histogram visualization for our normally
distributed data.

![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/1_moving_mean.png)

`tf.summary.histogram` takes an arbitrarily sized and shaped Tensor, and
compresses it into a histogram data structure consisting of many bins with
widths and counts. For example, let's say we want to organize the numbers
`[0.5, 1.1, 1.3, 2.2, 2.9, 2.99]` into bins. We could make three bins:

* a bin containing everything from 0 to 1 (it would contain one element, 0.5),
* a bin containing everything from 1 to 2 (it would contain two elements, 1.1
  and 1.3),
* a bin containing everything from 2 to 3 (it would contain three elements: 2.2,
  2.9 and 2.99).

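Assuming NumPy is available, this three-bin bucketing can be reproduced with
`np.histogram`; this is just an illustration of the idea, not TensorBoard's
actual code:

```python
import numpy as np

values = np.array([0.5, 1.1, 1.3, 2.2, 2.9, 2.99])

# Three bins with edges at 0, 1, 2, and 3, matching the example above.
counts, edges = np.histogram(values, bins=[0.0, 1.0, 2.0, 3.0])
print(counts)  # [1 2 3]
```
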
TensorFlow uses a similar approach to create bins, but unlike in our example, it
doesn't create integer bins, since for large, sparse datasets that could result
in many thousands of bins.
Instead, [the bins are exponentially distributed, with many bins close to 0 and
comparatively few bins for very large numbers.](https://github.com/tensorflow/tensorflow/blob/c8b59c046895fa5b6d79f73e0b5817330fcfbfc1/tensorflow/core/lib/histogram/histogram.cc#L28)
However, visualizing exponentially-distributed bins is tricky; if height is used
to encode count, then wider bins take more space, even if they have the same
number of elements. Conversely, encoding count in the area makes height
comparisons impossible. Instead, the histograms [resample the data](https://github.com/tensorflow/tensorflow/blob/17c47804b86e340203d451125a721310033710f1/tensorflow/tensorboard/components/tf_backend/backend.ts#L400)
into uniform bins. This can lead to unfortunate artifacts in some cases.

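To make the bucketing scheme concrete, here is a small sketch of exponentially
growing bucket limits in the spirit of `histogram.cc`; the 10% growth factor
and starting value below are assumptions for illustration, not TensorFlow's
exact constants:

```python
def exponential_bucket_limits(growth=1.1, start=1e-12, limit=1.0e3):
    """Bucket upper bounds that grow geometrically: dense near 0, sparse far out."""
    limits = []
    v = start
    while v < limit:
        limits.append(v)
        v *= growth
    return limits

limits = exponential_bucket_limits()
# Consecutive bucket widths keep growing, so there are many buckets near zero
# and comparatively few for very large values.
```
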
Each slice in the histogram visualizer displays a single histogram.
The slices are organized by step;
older slices (e.g. step 0) are further "back" and darker, while newer slices
(e.g. step 400) are close to the foreground, and lighter in color.
The y-axis on the right shows the step number.

You can mouse over the histogram to see tooltips with some more detailed
information. For example, in the following image we can see that the histogram
at timestep 176 has a bin centered at 2.25 with 177 elements in that bin.

![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/2_moving_mean_tooltip.png)

Also, you may note that the histogram slices are not always evenly spaced in
step count or time. This is because TensorBoard uses
[reservoir sampling](https://en.wikipedia.org/wiki/Reservoir_sampling) to keep a
subset of all the histograms, to save on memory. Reservoir sampling guarantees
that every sample has an equal likelihood of being included, but because it is
a randomized algorithm, the samples chosen don't occur at even steps.

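Reservoir sampling itself is simple; here is a minimal Python sketch of the
classic "Algorithm R", purely for illustration (TensorBoard's own
implementation differs in its details):

```python
import random

def reservoir_sample(stream, k, seed=0):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            # Item i survives with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir

kept = reservoir_sample(range(400), 10)
# The kept steps are a random, unevenly spaced subset of 0..399, which is why
# the histogram slices don't land on evenly spaced steps.
```
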
## Overlay Mode

There is a control on the left of the dashboard that allows you to toggle the
histogram mode from "offset" to "overlay":

![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/3_overlay_offset.png)

In "overlay" mode, the visualization rotates 45 degrees, so that the individual
histogram slices are no longer spread out in time, but instead are all plotted
on the same y-axis.

![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/4_overlay.png)

Now, each slice is a separate line on the chart, and the y-axis shows the item
count within each bucket. Darker lines are older, earlier steps, and lighter
lines are more recent, later steps. Once again, you can mouse over the chart to
see some additional information.

![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/5_overlay_tooltips.png)

In general, the overlay visualization is useful if you want to directly compare
the counts of different histograms.

## Multimodal Distributions

The Histogram Dashboard is great for visualizing multimodal
distributions. Let's construct a simple bimodal distribution by concatenating
the outputs from two different normal distributions. The code will look like
this:

```python
import tensorflow as tf

k = tf.placeholder(tf.float32)

# Make a normal distribution, with a shifting mean
mean_moving_normal = tf.random_normal(shape=[1000], mean=(5*k), stddev=1)
# Record that distribution into a histogram summary
tf.summary.histogram("normal/moving_mean", mean_moving_normal)

# Make a normal distribution with shrinking variance
variance_shrinking_normal = tf.random_normal(shape=[1000], mean=0, stddev=1-(k))
# Record that distribution too
tf.summary.histogram("normal/shrinking_variance", variance_shrinking_normal)

# Let's combine both of those distributions into one dataset
normal_combined = tf.concat([mean_moving_normal, variance_shrinking_normal], 0)
# We add another histogram summary to record the combined distribution
tf.summary.histogram("normal/bimodal", normal_combined)

summaries = tf.summary.merge_all()

# Set up a session and summary writer
sess = tf.Session()
writer = tf.summary.FileWriter("/tmp/histogram_example")

# Set up a loop and write the summaries to disk
N = 400
for step in range(N):
    k_val = step/float(N)
    summ = sess.run(summaries, feed_dict={k: k_val})
    writer.add_summary(summ, global_step=step)
```

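Conceptually, the concatenation step just stacks the two sample vectors into
one dataset; a minimal NumPy sketch of the same idea, fixing k = 0.5 as an
arbitrary illustrative value:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# The two component distributions from the snippet above, at k = 0.5:
mean_moving = rng.normal(loc=5 * 0.5, scale=1.0, size=1000)
variance_shrinking = rng.normal(loc=0.0, scale=1 - 0.5, size=1000)

# Concatenating them yields one dataset with two modes (near 2.5 and near 0).
bimodal = np.concatenate([mean_moving, variance_shrinking])
```
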
You'll recognize our "moving mean" normal distribution from the example
above. Now we also have a "shrinking variance" distribution. Side-by-side, they
look like this:

![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/6_two_distributions.png)

When we concatenate them, we get a chart that clearly reveals the divergent,
bimodal structure:

![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/7_bimodal.png)

## Some More Distributions

Just for fun, let's generate and visualize a few more distributions, and then
combine them all into one chart. Here's the code we'll use:

```python
import tensorflow as tf

k = tf.placeholder(tf.float32)

# Make a normal distribution, with a shifting mean
mean_moving_normal = tf.random_normal(shape=[1000], mean=(5*k), stddev=1)
# Record that distribution into a histogram summary
tf.summary.histogram("normal/moving_mean", mean_moving_normal)

# Make a normal distribution with shrinking variance
variance_shrinking_normal = tf.random_normal(shape=[1000], mean=0, stddev=1-(k))
# Record that distribution too
tf.summary.histogram("normal/shrinking_variance", variance_shrinking_normal)

# Let's combine both of those distributions into one dataset
normal_combined = tf.concat([mean_moving_normal, variance_shrinking_normal], 0)
# We add another histogram summary to record the combined distribution
tf.summary.histogram("normal/bimodal", normal_combined)

# Add a gamma distribution
gamma = tf.random_gamma(shape=[1000], alpha=k)
tf.summary.histogram("gamma", gamma)

# And a poisson distribution
poisson = tf.random_poisson(shape=[1000], lam=k)
tf.summary.histogram("poisson", poisson)

# And a uniform distribution
uniform = tf.random_uniform(shape=[1000], maxval=k*10)
tf.summary.histogram("uniform", uniform)

# Finally, combine everything together!
all_distributions = [mean_moving_normal, variance_shrinking_normal,
                     gamma, poisson, uniform]
all_combined = tf.concat(all_distributions, 0)
tf.summary.histogram("all_combined", all_combined)

summaries = tf.summary.merge_all()

# Set up a session and summary writer
sess = tf.Session()
writer = tf.summary.FileWriter("/tmp/histogram_example")

# Set up a loop and write the summaries to disk
N = 400
for step in range(N):
    k_val = step/float(N)
    summ = sess.run(summaries, feed_dict={k: k_val})
    writer.add_summary(summ, global_step=step)
```

### Gamma Distribution

![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/8_gamma.png)

### Uniform Distribution

![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/9_uniform.png)

### Poisson Distribution

![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/10_poisson.png)

The Poisson distribution is defined over the integers. So, all of the values
being generated are perfect integers. The histogram compression moves the data
into floating-point bins, causing the visualization to show little
bumps over the integer values rather than perfect spikes.

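This effect can be reproduced outside TensorBoard: put integer-valued samples
into bins whose edges don't fall on the integers, and every other bin comes
out empty. A NumPy sketch, with arbitrary illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
samples = rng.poisson(lam=3.0, size=1000)  # all values are exact integers

# 21 half-unit-wide bins: bins centered on integers alternate with bins
# lying strictly between two integers.
edges = np.linspace(-0.25, 10.25, num=22)
counts, _ = np.histogram(samples, bins=edges)

# counts[0], counts[2], ... cover the integers 0, 1, ...; the odd-index bins
# between two integers contain nothing, which renders as spikes with gaps.
```
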
### All Together Now

Finally, we can concatenate all of the data into one funny-looking curve.

![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/11_all_combined.png)
