| 1 | # Release Blockers |
| 2 | |
| 3 | [TOC] |
| 4 | |
| 5 | ## tl;dr |
| 6 | |
| 7 | * Only mark bugs as blockers if the product **must not** be shipped with the |
| 8 | bug present. |
| 9 | * **Everyone** on the team can add or remove blocking labels. |
| 10 | * Evaluate bugs as potential blockers based on their **severity** and |
| 11 | **prevalence**. |
| 12 | * Provide **detailed rationale** whenever adding or removing a blocking label. |
| 13 | * Ensure all blockers have an **OS and milestone** tagged. |
| 14 | * Release owners have final say on blocking status; contact them with any |
| 15 | questions. |
| 16 | |
| 17 | ## Context |
| 18 | |
| 19 | The Chromium project uses release block labels to help define each |
| 20 | milestone's critical path. The following labels are available: |
| 21 | |
| 22 | * **ReleaseBlock-Dev**, which blocks shipping to the dev, beta, and stable |
| 23 | channels. |
| 24 | * **ReleaseBlock-Beta**, which blocks shipping to the beta and stable |
| 25 | channels. |
| 26 | * **ReleaseBlock-Stable**, which blocks shipping to the stable channel. |
| 27 | |
| 28 | Release block labels must be used in conjunction with a milestone (M=##) label |
| 29 | as well as an OS (e.g. OS-Android). The combination of blocking label, |
| 30 | milestone, and OS determines which releases we hold; e.g. OS=Android M=59 |
| 31 | ReleaseBlock-Beta means we will not ship M59 Android builds to either the beta |
| 32 | or stable channel until the bug is addressed. Android M59 dev releases, or |
| 33 | releases on any other platform, would not be blocked. |
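
The channel rules above can be sketched in a few lines of Python. This is only an illustration, not part of any Chromium tooling: the `bug` dictionary and the function names `blocked_channels` and `is_release_held` are hypothetical, and the only facts taken from this document are the three labels and the channels each one blocks.

```python
# Channels ordered from earliest to latest, as described above.
CHANNELS = ["dev", "beta", "stable"]

# Each label blocks its own channel and every later channel.
FIRST_BLOCKED = {
    "ReleaseBlock-Dev": "dev",
    "ReleaseBlock-Beta": "beta",
    "ReleaseBlock-Stable": "stable",
}


def blocked_channels(label):
    """Return the list of channels that a release block label holds."""
    start = CHANNELS.index(FIRST_BLOCKED[label])
    return CHANNELS[start:]


def is_release_held(bug, os_name, milestone, channel):
    """Check whether a bug holds a given OS/milestone/channel release.

    `bug` is a hypothetical dict with "label", "os", and "milestone" keys.
    """
    return (
        bug["os"] == os_name
        and bug["milestone"] == milestone
        and channel in blocked_channels(bug["label"])
    )
```

With the example from the text, a bug tagged OS=Android M=59 ReleaseBlock-Beta holds the Android M59 beta and stable releases, but not the Android M59 dev release or any release on another platform.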
| 34 | |
| 35 | Because these labels are used to manage critical path, they should **not** be |
| 36 | used unless we genuinely believe we cannot ship to a given channel with the |
| 37 | issue present. Do not mark an issue as a blocker "to ensure someone looks at |
| 38 | it." Priority and milestone labels should be used for this purpose. |
| 39 | |
| 40 | These rules apply to bugs and regressions; teams may set their own criteria for |
| 41 | blocking on other kinds of work (e.g. tasks) as they see fit. |
| 42 | |
| 43 | ## Assessing Blockers |
| 44 | |
| 45 | Issues should be evaluated for release blocking status using the following |
| 46 | matrix based on the issue's severity and prevalence: |
| 47 | |
| 48 | | | Low Impact | Medium | High | Critical | |
| 49 | |---------------|---------------------|---------------------|---------------------|------------------| |
| 50 | | **Few Users** | | | ReleaseBlock-Stable | ReleaseBlock-Dev | |
| 51 | | **Some** | | ReleaseBlock-Stable | ReleaseBlock-Beta | ReleaseBlock-Dev | |
| 52 | | **Most** | ReleaseBlock-Stable | ReleaseBlock-Beta | ReleaseBlock-Beta | ReleaseBlock-Dev | |
| 53 | | **All** | ReleaseBlock-Stable | ReleaseBlock-Beta | ReleaseBlock-Dev | ReleaseBlock-Dev | |
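
For readers who prefer a lookup over a table, the matrix can be transcribed as a small Python mapping. This is a sketch only: the lowercase keys and the function name `suggested_label` are assumptions made for illustration, and `None` stands for the empty cells, where the matrix suggests no release block label.

```python
# Column order of the matrix above.
SEVERITIES = ["low", "medium", "high", "critical"]

# One row per prevalence level; None marks an empty cell (no blocker).
MATRIX = {
    "few": [None, None, "ReleaseBlock-Stable", "ReleaseBlock-Dev"],
    "some": [None, "ReleaseBlock-Stable", "ReleaseBlock-Beta", "ReleaseBlock-Dev"],
    "most": ["ReleaseBlock-Stable", "ReleaseBlock-Beta", "ReleaseBlock-Beta", "ReleaseBlock-Dev"],
    "all": ["ReleaseBlock-Stable", "ReleaseBlock-Beta", "ReleaseBlock-Dev", "ReleaseBlock-Dev"],
}


def suggested_label(severity, prevalence):
    """Return the label the matrix suggests, or None if it suggests none."""
    return MATRIX[prevalence][SEVERITIES.index(severity)]
```

For instance, `suggested_label("high", "some")` returns `"ReleaseBlock-Beta"`. The result is a starting point for discussion, not a substitute for the written rationale required on the bug.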
| 54 | |
| 55 | ### Severity |
| 56 | |
| 57 | Severity is defined as the impact to a user who experiences the bug. |
| 58 | |
| 59 | * **Critical**: A bug with extreme consequence to the user, e.g. a regression |
| 60 | in privacy (leaking user data), loss of user data, crash on startup, etc. |
| 61 | These bugs must be fixed immediately and thus should block any release where |
| 62 | they are present. |
| 63 | * **High**: A bug with large impact to the user, e.g. a CSS rendering issue |
| 64 | causing content to disappear, videos not playing, extreme jank, etc. There |
| 65 | is no simple workaround for the issue. |
| 66 | * **Medium**: A bug with moderate impact to the user, e.g. a CSS rendering |
| 67 | issue causing content to be misaligned, moderate jank, non-startup crash, |
| 68 | memory regressions, etc. There may be a workaround for the issue. |
| 69 | * **Low**: A bug with little impact to the user, generally cosmetic in nature |
| 70 | and easy to work around. |
| 71 | |
| 72 | ### Prevalence |
| 73 | |
| 74 | Prevalence is defined as the volume of users who will experience the bug. |
| 75 | * **Few Users (<5%)**: The bug requires many steps to trigger, or is dependent |
| 76 | on timing, e.g. two simultaneous taps on different parts of the screen. |
| 77 | * **Some (5% - 35%)**: The bug affects a minor workflow, or requires a series |
| 78 | of steps to trigger. |
| 79 | * **Most (35% - 75%)**: The bug affects a major workflow, e.g. sync, |
| 80 | downloading files, etc. |
| 81 | * **All (75% - 100%)**: The bug affects core product functionality, e.g. |
| 82 | scrolling a page. |
| 83 | |
| 84 | Note that prevalence should be evaluated relative to the population of users |
| 85 | on the affected platform or segment, e.g. a bug affecting all Android users (but |
| 86 | not Windows users) would still be considered to affect all users, and a bug |
| 87 | affecting all Enterprise Windows users (but not all consumer Windows users) |
| 88 | could also be considered to affect all users. |
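
When usage numbers are available, the percentage ranges above can be applied mechanically. The sketch below is illustrative: the function name `prevalence_bucket` and its parameters are hypothetical, and because the ranges share their endpoints (5%, 35%, 75%), it assumes a boundary value falls into the higher bucket. The denominator is the population of the affected platform or segment, not all Chrome users.

```python
def prevalence_bucket(affected_users, segment_population):
    """Map a share of affected users to a prevalence level.

    The share is computed against the affected platform or segment.
    Assumption: a value exactly on a boundary goes to the higher bucket.
    """
    if segment_population <= 0:
        raise ValueError("segment_population must be positive")
    share = affected_users / segment_population
    if share < 0.05:
        return "few"
    if share < 0.35:
        return "some"
    if share < 0.75:
        return "most"
    return "all"
```

For example, a bug hitting 2 million of 10 million Android users is evaluated as 20% of the Android population, which falls under "some".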
| 89 | |
| 90 | ### Assessing uncertainty |
| 91 | |
| 92 | In practice, the data available for assessing severity and prevalence of bugs is |
| 93 | usually imperfect, so best judgement and rules of thumb need to be employed. |
| 94 | |
| 95 | Engineers assessing bugs can use their knowledge of the underlying system to |
| 96 | intuit whether the observed symptom might be the "tip of an iceberg" of a wider |
| 97 | bug which might have much wider severity and prevalence. The evaluation isn't |
| 98 | required to be limited to the so-far exactly observed symptoms, but should also |
| 99 | be biased upward on the basis of well-founded fears. Examples include scary race |
| 100 | conditions, or symptoms indicating that a core system with many dependencies is |
| 101 | being undermined. |
| 102 | |
| 103 | A rule of thumb is that such scary "iceberg" problems are more likely for |
| 104 | changes which have not yet been exposed to a large population of users -- |
| 105 | especially, bugs in Dev channel, or bugs in Beta channel affecting a relatively |
| 106 | small or quiet population (for example, Android WebView has a tiny beta |
| 107 | population, and non-English-speaking users have more difficulty getting their |
| 108 | feedback heard). On the other hand, if a bug already has been present on 100% |
| 109 | of stable channel for weeks or months before it was first noticed, that's |
| 110 | evidence that the problem is not so scary or urgent after all. Therefore, |
| 111 | recent regressions should have an upward bias in severity/prevalence assessment, |
| 112 | while nonrecent ones should have a downward one. |
| 113 | |
| 114 | If this sort of consideration is a factor, that should be explicitly mentioned |
| 115 | in the bug update. |
| 116 | |
| 117 | ### Customization |
| 118 | |
| 119 | The definitions provided above are examples; teams are encouraged to customize |
| 120 | where it makes sense, e.g. the web platform team may consider developer impact |
| 121 | for severity and feature usage for prevalence. |
| 122 | |
| 123 | ## Blocker Management |
| 124 | |
| 125 | **Everyone should feel free to add, modify, or remove release blocking labels |
| 126 | where appropriate, so long as you follow the guidelines below.** If a TPM or |
| 127 | test engineer has marked a bug as a release blocker, but a developer knows for |
| 128 | sure that the issue should not block the release, the developer should remove |
| 129 | the release blocking label; similarly anyone should feel free to add a release |
| 130 | blocking label to a bug they feel warrants holding a release. That said, there |
| 131 | are some general guidelines to follow: |
| 132 | |
| 133 | * Be specific and descriptive in your comments when tagging, or untagging, an |
| 134 | issue as a release blocker. **You must explain your reason for doing so.** |
| 135 | Including your rationale around severity and prevalence will make it much |
| 136 | easier for anyone reviewing the bug to understand why the bug should (or |
| 137 | should not) block the release. It will also help anyone re-assessing the |
| 138 | bug if we receive new information later. |
| 139 | * **When in doubt, be conservative and mark bugs as blockers!** It's better |
| 140 | to tag a bug as a release blocking issue and have the label removed later |
| 141 | than to ship the bug to users and have to respin due to unanticipated |
| 142 | consequences. If you need assistance, you can always ask the release |
| 143 | management team for input by CC'ing, pinging, or e-mailing the platform |
| 144 | owners listed on the [Chrome Calendar](https://chromepmo.appspot.com/calendar) |
| 145 | (Googlers only, opening to all in the near future). |
| 146 | * These guidelines should be used with the data we have available at the time. |
| 147 | **If we need more data to make a good decision, get help in finding it!** |
| 148 | When new information arises, re-assess the bug using the new details as soon |
| 149 | as possible. |
| 150 | * The release management team (release engineers and TPMs) has the final say |
| 151 | when it comes to release blocking issues - they can tag, and untag, issues |
| 152 | as they see fit. |
| 153 | |
| 154 | ## Other Considerations |
| 155 | |
| 156 | ### New Features |
| 157 | |
| 158 | Bugs related to a feature that is behind a flag and is not enabled by default |
| 159 | should never block a release (the feature should be disabled instead). |
| 160 | Features enabled by default should follow the guidelines listed above. |
| 161 | |
| 162 | ### Regressions |
| 163 | |
| 164 | Regressions should follow the same guidelines as listed above; an issue should |
| 165 | not be tagged as a release blocker simply because it is a regression. While |
| 166 | we'd like to prevent regressions in general, there is a large backlog of bugs we |
| 167 | need to address, and we should focus on the most important. To ensure we |
| 168 | maintain a high bar for product quality, we should track the number of |
| 169 | introduced versus escaped regressions, and follow up if the number starts to |
| 170 | rise. |
| 171 | |
| 172 | In practice, it is still expected that the majority of release blockers filed |
| 173 | will be recent regressions, because on average they have higher severity, |
| 174 | prevalence and uncertainty than longstanding bugs. |