<!-- Copyright 2020 The Chromium Authors. All rights reserved.
     Use of this source code is governed by a BSD-style license that can be
     found in the LICENSE file.
-->

# Chrome Benchmarking System

# Overview

This directory contains benchmarks and infrastructure for testing Chrome and
Chromium and reporting performance measurements. These benchmarks are
continuously run on the
[perf waterfall](https://ci.chromium.org/p/chrome/g/chrome.perf/console).

For more information on how Chrome measures performance, see
[here](/docs/speed/how_does_chrome_measure_performance.md).

# Using The Chrome Benchmarking System

## Analyzing Results From The Perf Waterfall

The [ChromePerf Dashboard](https://chromeperf.appspot.com/) is the destination
for all metrics generated by the perf waterfall. It provides tools to chart the
performance of a chosen set of tests and metrics over time. In addition, it can
launch a bisection from a selected point on a chart.

## Running A Single Test

The Chrome Benchmarking System has two methods for manually running performance
tests: run_benchmark and Pinpoint.

run_benchmark is useful for creating and debugging benchmarks on local devices.
Run from the command line, it has a number of flags useful for inspecting the
internal state of the benchmark. For more information, see
[here](https://chromium.googlesource.com/catapult.git/+/HEAD/telemetry/docs/run_benchmarks_locally.md).

[Pinpoint](https://pinpoint-dot-chromeperf.appspot.com/) wraps run_benchmark and
provides the ability to remotely run A/B benchmarks on any platform available
in our lab. It will run a benchmark for as many iterations as needed to get a
statistically significant result, then visualize it.
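
The sketch below is only an illustration of what "statistically significant"
means for an A/B comparison; it is not Pinpoint's implementation, and the
timings and the choice of test are assumptions made for the example.

```
# Toy A/B significance check on benchmark samples; purely illustrative and not
# how Pinpoint computes its statistics.
from scipy import stats

without_patch = [102.1, 99.8, 101.5, 100.9, 103.2, 100.4]  # ms, hypothetical
with_patch = [97.3, 98.1, 96.8, 98.9, 97.5, 98.0]  # ms, hypothetical

# A non-parametric test avoids assuming the samples are normally distributed.
_, p_value = stats.mannwhitneyu(
    without_patch, with_patch, alternative='two-sided')
if p_value < 0.05:
  print('Significant difference (p = %.4f)' % p_value)
else:
  print('Not significant yet; more iterations needed (p = %.4f)' % p_value)
```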

## Creating New Tests (stories)

[This document](https://chromium.googlesource.com/catapult.git/+/HEAD/telemetry)
provides an overview of how tests are structured and some of the underlying
technologies. After reading that doc, figure out whether your story fits into an
existing benchmark by checking
[here](https://goto.google.com/chrome-benchmarking-sheet) (or
[here](https://bit.ly/chrome-benchmarks) for non-Googlers).

* If it does, follow the instructions next to it. If there are no instructions,
  find the test type in `src/tools/perf/page_sets`.
* Otherwise, read [this](https://docs.google.com/document/d/1ni2MIeVnlH4bTj4yvEDMVNxgL73PqK_O9_NUm3NW3BA/edit).

After figuring out where your story fits, create a new one. There is a
considerable amount of variation between different benchmarks, so use a nearby
story as a model. You may also need to introduce custom JavaScript to drive
interactions on the page or to deal with nondeterminism. For an example, search
[this file](https://source.chromium.org/chromium/chromium/src/+/master:tools/perf/page_sets/system_health/browsing_stories.py?q=browsing_stories.py&ss=chromium)
for `browse:tools:sheets:2019`.
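
For a rough idea of what a story looks like, here is a minimal, hypothetical
sketch written against Telemetry's Page/StorySet API. The class names, story
name, URL, and archive file are invented for illustration; real stories in
tools/perf/page_sets carry considerably more structure, so still model yours on
a nearby one.

```
# Hypothetical story sketch. Everything named "example" below is invented;
# base a real story on its neighbors in tools/perf/page_sets.
from telemetry import story
from telemetry.page import page as page_module


class ExampleBrowsingStory(page_module.Page):
  """Loads a page, removes a nondeterministic element, then scrolls."""

  def __init__(self, page_set):
    super(ExampleBrowsingStory, self).__init__(
        url='https://example.com',
        page_set=page_set,
        name='browse:example:site:2024')  # hypothetical story name

  def RunPageInteractions(self, action_runner):
    action_runner.WaitForElement(selector='body')
    # Custom JavaScript to strip content that changes between runs.
    action_runner.ExecuteJavaScript(
        'document.querySelectorAll(".ad, .clock").forEach(e => e.remove());')
    action_runner.ScrollPage()


class ExampleStorySet(story.StorySet):
  def __init__(self):
    super(ExampleStorySet, self).__init__(
        archive_data_file='data/example_stories.json',  # WPR archive mapping
        cloud_storage_bucket=story.PARTNER_BUCKET)
    self.AddStory(ExampleBrowsingStory(self))
```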

Next, use WPR (WebPageReplay) to record all of the content requested by the
test. By default, tests spin up a local web server that serves these
recordings, removing one source of nondeterminism. To create the recording,
run:

```./tools/perf/record_wpr --browser=system --story-filter=STORY_NAME BENCHMARK_NAME```

Next, verify that the recording works. To do so, run the test:

```./tools/perf/run_benchmark run BENCHMARK_NAME --browser=system --story-filter=STORY_NAME```

After running this, verify the following:

* Does the browser behave the same as it did when creating the recording? If
  not, is the difference in behavior acceptable?
* Are there any concerning errors from Chrome while run_benchmark runs? These
  appear in the output of run_benchmark.
* Do the results in the link generated by run_benchmark look reasonable?

If you run into problems, review or add custom JavaScript as described in the
previous section, or ask for help.

If everything looks good, upload your WPR archive by following the instructions
in [Upload the recording to Cloud Storage](https://sites.google.com/a/chromium.org/dev/developers/telemetry/record_a_page_set)
and create a CL.

# Tools In This Directory

This directory contains a variety of tools that can be used to run benchmarks,
interact with speed services, and manage performance waterfall configurations.
It also has commands for running functional unit tests.

## run_tests

This command allows you to run functional tests against the Python code in this
directory. For example, try:

```
./run_tests results_dashboard_unittest
```

Note that the positional argument can be any substring within the test name.

This may require you to set up `gsutil config` first.

## run_benchmark

This command allows running benchmarks defined in the Chromium repository,
specifically in [tools/perf/benchmarks][benchmarks_dir]. If you need it,
documentation is available on how to [run benchmarks locally][run_locally] and
how to properly [set up your device][device_setup].

[benchmarks_dir]: https://cs.chromium.org/chromium/src/tools/perf/benchmarks/
[run_locally]: https://chromium.googlesource.com/catapult.git/+/HEAD/telemetry/docs/run_benchmarks_locally.md
[device_setup]: /docs/speed/benchmark/telemetry_device_setup.md

## update_wpr

A helper script to automate various tasks related to updating
[Web Page Recordings][wpr] for our benchmarks. It can help create new
recordings from live websites, replay them to make sure they work, upload them
to cloud storage, and finally send a CL for review with the new recordings.

[wpr]: https://github.com/catapult-project/catapult/tree/master/web_page_replay_go

## pinpoint_cli

A command line interface to the [pinpoint][] service. It can create new jobs,
check the status of jobs, and fetch their measurements as CSV files.

[pinpoint]: https://pinpoint-dot-chromeperf.appspot.com

## flakiness_cli

A command line interface to the [flakiness dashboard][].

[flakiness dashboard]: https://test-results.appspot.com/dashboards/flakiness_dashboard.html

## soundwave

Fetches data from the [Chrome Performance Dashboard][chromeperf] and stores it
locally in a SQLite database for further analysis and processing. It also
allows defining [studies][], presets of measurements that a team is interested
in tracking, and uploads them to cloud storage for visualization with
[Data Studio][]. This currently backs the [v8][v8_dashboard] and
[health][health_dashboard] dashboards.

[chromeperf]: https://chromeperf.appspot.com/
[studies]: https://cs.chromium.org/chromium/src/tools/perf/cli_tools/soundwave/studies/
[Data Studio]: https://datastudio.google.com/
[v8_dashboard]: https://datastudio.google.com/s/iNcXppkP3DI
[health_dashboard]: https://datastudio.google.com/s/jUXfKZXXfT8

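As a hypothetical illustration of the kind of further analysis this enables,
the sketch below queries the resulting SQLite file with Python's standard
library. The file name, table, and column names are invented; check the
soundwave code for the actual schema.

```
# Hypothetical only: the file name, table, and columns below are invented.
# Consult the soundwave source for the real schema before relying on this.
import sqlite3

con = sqlite3.connect('soundwave_output.db')  # hypothetical output file
rows = con.execute(
    'SELECT test_path, timestamp, value FROM timeseries '
    'WHERE test_path LIKE ? ORDER BY timestamp',
    ('%memory:chrome%',)).fetchall()
for test_path, timestamp, value in rows:
  print(test_path, timestamp, value)
con.close()
```
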
## pinboard

Allows scheduling daily [pinpoint][] jobs to compare measurements with and
without a patch applied. This is useful for teams developing a new feature
behind a flag who want to track its effect on performance as development
progresses. Processed data for the relevant measurements is uploaded to cloud
storage, where it can be read by [Data Studio][]. This also backs data
displayed on the [v8][v8_dashboard] dashboard.