# Addressing Flaky Web Tests

This document provides tips and tricks for reproducing and debugging flakes in
[Web Tests](web_tests.md). If you are debugging a flaky Web Platform Test (WPT),
you may wish to check the specific [Addressing Flaky
WPTs](web_platform_tests_addressing_flake.md) documentation.

This document assumes you are familiar with running Web Tests via
`run_web_tests.py`; if you are not then [see
here](web_tests.md#Running-Web-Tests).

[TOC]

## Understanding builder results

You will often be pointed (e.g. by the Flake Portal) to a particular build in
which your test has flaked. You will need the name of the specific build step
that has flaked; for Web Tests this is usually `blink_web_tests`, but there are
variations (e.g. `not_site_per_process_blink_web_tests`).

On the builder page, find the appropriate step:

![web_tests_blink_web_tests_step]

While you can examine the individual shard logs to find your test output, it is
easier to view the consolidated information, so scroll down to the **archive
results for blink\_web\_tests** step and click the `layout_test_results` link:

![web_tests_archive_blink_web_tests_step]

This will open a new tab with the results viewer. By default your test should be
shown, but if it isn't then you can click the 'All' button in the 'Query' row,
then enter the test filename in the textbox beside 'Filters':

![web_tests_results_viewer_query_filter]

There are a few ways that a Web Test can flake, and what the result means may
depend on the [test type](writing_web_tests.md#Test-Types):

1. `FAIL` - the test failed. For reference or pixel tests, this means it did not
   match the reference image. For JavaScript tests, the test either failed an
   assertion *or* did not match the [baseline](web_test_expectations.md)
   `-expected.txt` file checked in for it.
   * For image tests, this status is reported as `IMAGE` (as in an image diff).
   * For JavaScript tests, this status is reported as `TEXT` (as in a text
     diff).
1. `TIMEOUT` - the test timed out before producing a result. This may happen if
   the test is slow and normally runs close to the timeout limit, but is usually
   caused by waiting on an event that never happens. Unfortunately, timeouts [do
   not produce any logs](https://crbug.com/487051).
1. `CRASH` - the browser crashed while executing the test. There should be logs
   associated with the crash available.
1. `PASS` - this can happen! Web Tests can be marked as [expected to
   fail](web_test_expectations.md); if such a test then passes, that is an
   unexpected result, aka a potential flake.

Clicking on the test row anywhere *except* the test name (which is a link to the
test itself) will expand the entry to show information about the failure result,
including actual/expected results and browser logs if they exist.

In the following example, our flaky test has a `FAIL` result which is a flake
compared to its (default) expected `PASS` result. The test results (`TEXT` - as
explained above this is equivalent to `FAIL`), output, and browser log links are
highlighted.

![web_tests_results_viewer_flaky_test]

## Reproducing Web Test flakes

>TODO: document how to get the args.gn that the bot used

>TODO: document how to get the flags that the bot passed to `run_web_tests.py`

### Repeatedly running tests

Flakes are by definition non-deterministic, so it may be necessary to run the
test or set of tests repeatedly to reproduce the failure. Two flags to
`run_web_tests.py` can help with this:

* `--repeat-each=N` - repeats each test in the test set N times. Given a set of
  tests A, B, and C, `--repeat-each=3` will run AAABBBCCC.
* `--iterations=N` - repeats the entire test set N times. Given a set of tests
  A, B, and C, `--iterations=3` will run ABCABCABC.

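To make the difference in ordering concrete, here is a small shell sketch that
only simulates the order in which tests would be scheduled; it does not call
`run_web_tests.py`. The `repeat_each` and `iterations` functions are
hypothetical helpers introduced purely for illustration.

```shell
# Simulates scheduling order only. These helpers are illustrative and are not
# part of run_web_tests.py.

# repeat_each N TEST...: each test N times in a row (like --repeat-each=N).
repeat_each() {
  n=$1; shift
  for t in "$@"; do
    i=0
    while [ "$i" -lt "$n" ]; do
      printf '%s' "$t"
      i=$((i + 1))
    done
  done
  printf '\n'
}

# iterations N TEST...: the whole test set N times (like --iterations=N).
iterations() {
  n=$1; shift
  i=0
  while [ "$i" -lt "$n" ]; do
    for t in "$@"; do
      printf '%s' "$t"
    done
    i=$((i + 1))
  done
  printf '\n'
}

repeat_each 3 A B C   # prints AAABBBCCC
iterations 3 A B C    # prints ABCABCABC
```

As a rough rule of thumb, `--repeat-each` may be the better fit when a single
test flakes on its own (e.g. `--repeat-each=100 path/to/test.html`), while
`--iterations` may help when the flake seems to depend on other tests in the
set having run first.
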
## Debugging flaky Web Tests

>TODO: document how to attach gdb

### Seeing logs from content\_shell

When debugging flaky tests, it can be useful to add `LOG` statements to your
code to quickly understand test state. In order to see these logs when using
`run_web_tests.py`, pass the `--driver-logging` flag:

```
./third_party/blink/tools/run_web_tests.py --driver-logging path/to/test.html
```

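Driver logging can be noisy, so it may help to tag your `LOG` statements with a
unique marker and filter the output for it. The sketch below is one possible way
to do that. The `capture_and_filter` helper, the `FLAKE_DEBUG` marker, and the
`flake.log` filename are hypothetical names, and the sketch assumes the log
output reaches the stdout or stderr of the command being run.

```shell
# capture_and_filter LOGFILE MARKER COMMAND [ARGS...]
# Runs COMMAND, saves its stdout and stderr to LOGFILE, then prints only the
# lines containing MARKER, prefixed with their line numbers.
# Hypothetical helper, not part of the Chromium tooling.
capture_and_filter() {
  logfile=$1
  marker=$2
  shift 2
  # The command may exit non-zero (e.g. when tests fail); keep going so the
  # saved log can still be inspected.
  "$@" >"$logfile" 2>&1 || true
  grep -n -- "$marker" "$logfile"
}

# Example, run from the Chromium src/ directory:
# capture_and_filter flake.log FLAKE_DEBUG \
#   ./third_party/blink/tools/run_web_tests.py --driver-logging path/to/test.html
```

The full output stays in the log file, so a run that happens to reproduce the
flake can be examined in detail afterwards.
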
### Loading the test directly in content\_shell

When debugging a specific test, it can be useful to skip `run_web_tests.py` and
directly run the test under `content_shell` in an interactive session. For many
tests, one can just pass the test path to `content_shell`:

```
out/Default/content_shell third_party/blink/web_tests/path/to/test.html
```

**Caveat**: running tests like this is not equivalent to `run_web_tests.py`,
which passes the `--run-web-tests` flag to `content_shell`. The
`--run-web-tests` flag enables a lot of testing-only code in `content_shell`,
but also makes it run in a non-interactive mode.

Useful flags to pass to get `content_shell` closer to the `--run-web-tests` mode
include:

* `--enable-blink-test-features` - enables `status=test` and
  `status=experimental` features from `runtime_enabled_features.json5`.

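Putting the above together, a direct invocation might look like the following
sketch. It assumes a build in `out/Default` (as in the earlier example) and
keeps `path/to/test.html` as a placeholder; the `CONTENT_SHELL`, `TEST`, and
`CMD` variable names are only for illustration. If the binary is not present,
the script just prints the command it would have run.

```shell
# Assumed build output directory and placeholder test path; adjust as needed.
CONTENT_SHELL=out/Default/content_shell
TEST=third_party/blink/web_tests/path/to/test.html

# Interactive run with test-only Blink features enabled.
# Note: $CMD is expanded unquoted below, so the paths must not contain spaces.
CMD="$CONTENT_SHELL --enable-blink-test-features $TEST"

if [ -x "$CONTENT_SHELL" ]; then
  $CMD
else
  echo "would run: $CMD"
fi
```
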
>TODO: document how to deal with tests that require a server to be running

[web_tests_blink_web_tests_step]: images/web_tests_blink_web_tests_step.png
[web_tests_archive_blink_web_tests_step]: images/web_tests_archive_blink_web_tests_step.png
[web_tests_results_viewer_query_filter]: images/web_tests_results_viewer_query_filter.png
[web_tests_results_viewer_flaky_test]: images/web_tests_results_viewer_flaky_test.png