<!-- Copyright 2020 The Chromium Authors. All rights reserved.
Use of this source code is governed by a BSD-style license that can be
found in the LICENSE file.
-->

# Chrome Benchmarking System

# Overview

This directory contains benchmarks and infrastructure to test Chrome and
Chromium and output performance measurements. These benchmarks are continuously
run on the [perf waterfall](https://ci.chromium.org/p/chrome/g/chrome.perf/console).

For more information on how Chrome measures performance, see
[here](/docs/speed/how_does_chrome_measure_performance.md).

# Using The Chrome Benchmarking System

## Analyzing Results From The Perf Waterfall

The [ChromePerf Dashboard](https://chromeperf.appspot.com/) is the destination
for all metrics generated by the perf waterfall. It provides tools to set up a
dashboard that tracks the performance of a set of tests and metrics over time.
In addition, it provides the ability to launch a bisection by selecting a point
on the dashboard.

## Running A Single Test

The Chrome Benchmarking System has two methods for manually running performance
tests: `run_benchmark` and Pinpoint.

`run_benchmark` is useful for creating and debugging benchmarks using local
devices. Run from the command line, it has a number of flags useful for
determining the internal state of the benchmark. For more information, see
[here](https://chromium.googlesource.com/catapult.git/+/HEAD/telemetry/docs/run_benchmarks_locally.md).

[Pinpoint](https://pinpoint-dot-chromeperf.appspot.com/) wraps `run_benchmark` and
provides the ability to remotely run A/B benchmarks using any platform available
in our lab. It will run a benchmark for as many iterations as needed to get a
statistically significant result, then visualize it.

## Creating New Tests (stories)

[This document](https://chromium.googlesource.com/catapult.git/+/HEAD/telemetry)
provides an overview of how tests are structured and some of the underlying
technologies. After reading that doc, figure out if your story fits into an
existing benchmark by checking
[here](https://goto.google.com/chrome-benchmarking-sheet) (or
[here](https://bit.ly/chrome-benchmarks) for non-Googlers).

* If it does, follow the instructions next to it. If there are no instructions,
  find the test type in `src/tools/perf/page_sets`.
* Otherwise, read [this](https://docs.google.com/document/d/1ni2MIeVnlH4bTj4yvEDMVNxgL73PqK_O9_NUm3NW3BA/edit).

After figuring out where your story fits, create a new one. There is a
considerable amount of variation between different benchmarks, so use a nearby
story as a model. You may also need to introduce custom JavaScript to drive
interactions on the page or to deal with nondeterminism. For an example, search
[this file](https://source.chromium.org/chromium/chromium/src/+/master:tools/perf/page_sets/system_health/browsing_stories.py?q=browsing_stories.py&ss=chromium)
for `browse:tools:sheets:2019`.

Next, we need to use WPR (WebPageReplay) to record all of the content requested
by the test. By default, tests spin up a local webserver using these recordings,
removing one source of nondeterminism. To create the recording, run:

```./tools/perf/record_wpr --browser=system --story-filter=STORY_NAME BENCHMARK_NAME```

Next, we need to verify the recording works. To do so, run the test:

```./tools/perf/run_benchmark run BENCHMARK_NAME --browser=system --story-filter=STORY_NAME```

After running this, you will need to verify the following:

* Does the browser behave the same as it did when creating the recording? If not, is the difference in behavior acceptable?
* Are there any concerning errors generated by Chrome when running `run_benchmark`? These will appear in the output of `run_benchmark`.
* Check the benchmarks in the link generated by `run_benchmark`. Does everything look reasonable?

If any problems were encountered, review or add custom JavaScript as described above. Alternatively, ask for help.

If everything looks good, upload your WPR archive by following the instructions
in [Upload the recording to Cloud Storage](https://sites.google.com/a/chromium.org/dev/developers/telemetry/record_a_page_set)
and create a CL.
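
When recording several stories, it can help to generate the record and replay
command lines from one place so that the two steps always use the same story
and benchmark. The following is a minimal sketch, not part of this directory:
the helper name `wpr_commands` is hypothetical, and it only assembles the two
commands shown above with the same flags (`--browser`, `--story-filter`).

```python
import shlex


def wpr_commands(benchmark, story, browser="system"):
    """Return the (record, replay) command lines as argument lists.

    Hypothetical helper: it only mirrors the two commands documented above.
    """
    record = [
        "./tools/perf/record_wpr",
        f"--browser={browser}",
        f"--story-filter={story}",
        benchmark,
    ]
    replay = [
        "./tools/perf/run_benchmark",
        "run",
        benchmark,
        f"--browser={browser}",
        f"--story-filter={story}",
    ]
    return record, replay


if __name__ == "__main__":
    # BENCHMARK_NAME and STORY_NAME are the same placeholders used above.
    for command in wpr_commands("BENCHMARK_NAME", "STORY_NAME"):
        print(shlex.join(command))
```

The argument lists can be passed to `subprocess.run` from the root of a
Chromium checkout; the manual checks listed above still have to be done by
hand after the replay step.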

# Tools In This Directory

This directory contains a variety of tools that can be used to run benchmarks,
interact with speed services, and manage performance waterfall configurations.
It also has commands for running functional unit tests.

## run_tests

This command allows you to run functional tests against the Python code in this
directory. For example, try:

```
./run_tests results_dashboard_unittest
```

Note that the positional argument can be any substring within the test name.

This may require you to set up your `gsutil config` first.
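
To make the substring behavior concrete, the sketch below shows what "any
substring within the test name" means for a list of names. It illustrates the
matching rule only and is not the `run_tests` implementation; the function
`select_tests` and the test names are hypothetical.

```python
def select_tests(pattern, test_names):
    """Keep the names that contain `pattern` anywhere in the name."""
    return [name for name in test_names if pattern in name]


# Hypothetical fully qualified test names, for illustration only.
names = [
    "core.results_dashboard_unittest.ResultsDashboardTest.testUpload",
    "core.perf_benchmark_unittest.PerfBenchmarkTest.testInit",
]

# A shorter pattern such as "dashboard" selects the same test as the full
# module name "results_dashboard_unittest".
print(select_tests("dashboard", names))
```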

## run_benchmark

This command allows running benchmarks defined in the Chromium repository,
specifically in [tools/perf/benchmarks][benchmarks_dir]. If you need it,
documentation is available on how to [run benchmarks locally][run_locally] and
how to properly [set up your device][device_setup].

[benchmarks_dir]: https://cs.chromium.org/chromium/src/tools/perf/benchmarks/
[run_locally]: https://chromium.googlesource.com/catapult.git/+/HEAD/telemetry/docs/run_benchmarks_locally.md
[device_setup]: /docs/speed/benchmark/telemetry_device_setup.md

## update_wpr

A helper script to automate various tasks related to the update of
[Web Page Recordings][wpr] for our benchmarks. It can help create new
recordings from live websites, replay them to make sure they work, upload them
to cloud storage, and finally send a CL for review with the new recordings.

[wpr]: https://github.com/catapult-project/catapult/tree/master/web_page_replay_go

## pinpoint_cli

A command line interface to the [pinpoint][] service. It allows you to create
new jobs, check the status of jobs, and fetch their measurements as CSV files.

[pinpoint]: https://pinpoint-dot-chromeperf.appspot.com

## flakiness_cli

A command line interface to the [flakiness dashboard][].

[flakiness dashboard]: https://test-results.appspot.com/dashboards/flakiness_dashboard.html

## soundwave

Fetches data from the [Chrome Performance Dashboard][chromeperf] and stores it
locally in a SQLite database for further analysis and processing. It also
allows defining [studies][], which are preset collections of measurements a
team is interested in tracking, and uploads them to cloud storage to visualize
with the help of [Data Studio][]. This currently backs the [v8][v8_dashboard]
and [health][health_dashboard] dashboards.

[chromeperf]: https://chromeperf.appspot.com/
[studies]: https://cs.chromium.org/chromium/src/tools/perf/cli_tools/soundwave/studies/
[Data Studio]: https://datastudio.google.com/
[v8_dashboard]: https://datastudio.google.com/s/iNcXppkP3DI
[health_dashboard]: https://datastudio.google.com/s/jUXfKZXXfT8
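
Because the data ends up in SQLite, it can be explored with Python's standard
`sqlite3` module. The sketch below shows the general pattern only. The table
name `measurements`, its columns, and the sample rows are all hypothetical and
do not reflect soundwave's actual schema, so inspect the real database before
adapting the query.

```python
import sqlite3


def mean_by_revision(conn, benchmark, metric):
    """Average the stored values per revision for one benchmark and metric."""
    cursor = conn.execute(
        "SELECT revision, AVG(value) FROM measurements "
        "WHERE benchmark = ? AND metric = ? "
        "GROUP BY revision ORDER BY revision",
        (benchmark, metric),
    )
    return cursor.fetchall()


# An in-memory database with a hypothetical schema and made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurements "
    "(benchmark TEXT, metric TEXT, revision INTEGER, value REAL)"
)
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?, ?, ?)",
    [
        ("example_benchmark", "example_metric", 100, 10.0),
        ("example_benchmark", "example_metric", 100, 12.0),
        ("example_benchmark", "example_metric", 101, 14.0),
        ("other_benchmark", "example_metric", 100, 99.0),
    ],
)

for revision, mean in mean_by_revision(conn, "example_benchmark", "example_metric"):
    print(revision, mean)
```

To query a database file on disk, replace `":memory:"` with its path and drop
the `CREATE TABLE` and `INSERT` statements.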

## pinboard

Allows scheduling daily [pinpoint][] jobs to compare measurements with and
without a patch being applied. This is useful for teams developing a new feature
behind a flag that want to track the effects on performance as the development
of their feature progresses. Processed data for relevant measurements is
uploaded to cloud storage, where it can be read by [Data Studio][]. This also
backs data displayed on the [v8][v8_dashboard] dashboard.