Conversation

@flash1293
Contributor

@flash1293 flash1293 commented Aug 5, 2025

Fixes #230499

The lock manager runs `setupLockManagerIndex` via lodash `once`, so it only runs on the first call rather than on every call. However, if that first call to `setupLockManagerIndex` errors out (e.g. because Elasticsearch isn't ready yet), every subsequent call returns the cached rejected promise and fails as well, leaving all lock managers in that node instance broken (since `once` keeps its state at module scope).
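
For illustration, a minimal, hypothetical reproduction of the failure mode (not the actual Kibana code): lodash `once` caches whatever the first invocation returns, including a rejected promise.

```ts
import { once } from 'lodash';

// Hypothetical repro: `once` memoizes the first return value,
// even when that value is a rejected promise.
const setup = once(async () => {
  throw new Error('Elasticsearch not ready');
});

setup().catch((e) => console.log('first call:', e.message));
// Every later call returns the same cached rejected promise;
// the setup function is never retried.
setup().catch((e) => console.log('second call:', e.message));
```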

This leads to issues like the following: the timeout exception is logged by the streams plugin, but the call stack originates from the slo plugin's setup routine, because both received the same cached rejected promise:

```
[2025-08-01T18:36:25.080+00:00][ERROR][plugins.streams] TimeoutError: Request timed out
    at KibanaTransport._request (/usr/share/kibana/node_modules/@elastic/transport/lib/Transport.js:564:50)
    at processTicksAndRejections (node:internal/process/task_queues:105:5)
    at runNextTicks (node:internal/process/task_queues:69:3)
    at listOnTimeout (node:internal/timers:549:9)
    at processTimers (node:internal/timers:523:7)
    at /usr/share/kibana/node_modules/@elastic/transport/lib/Transport.js:631:32
    at KibanaTransport.request (/usr/share/kibana/node_modules/@elastic/transport/lib/Transport.js:627:20)
    at KibanaTransport.request (/usr/share/kibana/node_modules/@kbn/core-elasticsearch-client-server-internal/src/create_transport.js:60:16)
    at Cluster.putComponentTemplate (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/api/api/cluster.js:600:16)
    at ensureTemplatesAndIndexCreated (/usr/share/kibana/node_modules/@kbn/lock-manager/src/setup_lock_manager_index.js:56:3)
    at setupLockManagerIndex (/usr/share/kibana/node_modules/@kbn/lock-manager/src/setup_lock_manager_index.js:110:3)
    at LockManager.acquire (/usr/share/kibana/node_modules/@kbn/lock-manager/src/lock_manager_client.js:53:5)
    at withLock (/usr/share/kibana/node_modules/@kbn/lock-manager/src/lock_manager_client.js:242:20)
    at /usr/share/kibana/node_modules/@kbn/slo-plugin/server/plugin.js:176:7
```

This PR fixes the problem by dropping `once` and keeping the state manually: the promise is cached only if it resolves successfully, and errors are passed through so the next call retries the setup.
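
As a rough sketch of that memoize-on-success pattern (the helper name and signature are illustrative, not the actual Kibana implementation):

```ts
type SetupFn = () => Promise<void>;

// Wraps a setup function so its promise is cached only while pending or
// after it resolves; a rejection clears the cache so the next call retries.
function onceOnSuccess(fn: SetupFn): SetupFn {
  let cached: Promise<void> | undefined;
  return () => {
    if (!cached) {
      cached = fn().catch((error) => {
        cached = undefined; // forget the failed attempt
        throw error; // still surface the error to this caller
      });
    }
    return cached;
  };
}

// e.g. const setupOnce = onceOnSuccess(() => setupLockManagerIndex(esClient));
```

Concurrent callers still share one in-flight promise, so a failure is reported to everyone who was waiting on that attempt, but the next caller after the rejection settles triggers a fresh setup.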

@flash1293 flash1293 requested a review from a team as a code owner August 5, 2025 09:23
@flash1293 flash1293 added the release_note:fix, Team:obs-knowledge (Observability Experience Knowledge team), backport:version (Backport to applied version labels), v9.2.0, and v9.1.1 labels Aug 5, 2025
@elasticmachine
Contributor

Pinging @elastic/obs-knowledge-team (Team:obs-knowledge)

Contributor

@SrdjanLL SrdjanLL left a comment


LGTM, just left a thought on how this new behaviour may impact downstream.

```diff
-let runSetupIndexAssetOnce = once(setupLockManagerIndex);
-export function runSetupIndexAssetEveryTime() {
-  runSetupIndexAssetOnce = setupLockManagerIndex;
+export function rerunSetupIndexAsset() {
```
Contributor


[Observation] Do you think there are any risks of exposing this functionality to downstream dependencies of the lock manager? I guess it was here before, back when failures were cached, but now that errors pass through, is this exposing the "reset" functionality downstream?

Contributor Author


It shouldn't change anything; it's just the new version of `runSetupIndexAssetEveryTime`, but with a new name.
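
For context, a hypothetical reconstruction of the renamed helper, combining the diff excerpt above with the `onceOnSuccess` sketch from the PR description; the names mirror the diff, but the body is an assumption, not the actual Kibana source:

```ts
// Hypothetical reconstruction; reuses the onceOnSuccess() helper
// sketched in the PR description above.
let runSetupIndexAssetOnce = onceOnSuccess(setupLockManagerIndex);

// Discards any cached setup state so the next caller re-runs the
// setup from scratch, e.g. between test cases.
export function rerunSetupIndexAsset() {
  runSetupIndexAssetOnce = onceOnSuccess(setupLockManagerIndex);
}
```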

@flash1293 flash1293 merged commit b4f8488 into elastic:main Aug 5, 2025
12 checks passed

@elasticmachine
Contributor

💚 Build Succeeded

Metrics [docs]: ✅ unchanged

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Aug 5, 2025
(cherry picked from commit b4f8488)
@kibanamachine
Contributor

💚 All backports created successfully

Status Branch Result
9.1

Note: Successful backport PRs will be merged automatically after passing CI.

Questions?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Aug 5, 2025
# Backport

This will backport the following commits from `main` to `9.1`:
- [Lock manager: Fix setup bug (#230519)](#230519)

<!--- Backport version: 9.6.6 -->

### Questions?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)


Co-authored-by: Joe Reuter <[email protected]>
delanni pushed a commit to delanni/kibana that referenced this pull request Aug 5, 2025
@flash1293
Contributor Author

💚 All backports created successfully

Status Branch Result
8.19

Note: Successful backport PRs will be merged automatically after passing CI.

Questions?

Please refer to the Backport tool documentation

flash1293 added a commit to flash1293/kibana that referenced this pull request Aug 6, 2025
(cherry picked from commit b4f8488)
flash1293 added a commit that referenced this pull request Aug 6, 2025
# Backport

This will backport the following commits from `main` to `8.19`:
- [Lock manager: Fix setup bug (#230519)](#230519)

<!--- Backport version: 10.0.1 -->

### Questions?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)

@wildemat wildemat mentioned this pull request Aug 7, 2025
10 tasks
@mistic mistic added the v8.19.2 label Aug 7, 2025
NicholasPeretti pushed a commit to NicholasPeretti/kibana that referenced this pull request Aug 18, 2025

Labels

backport:version (Backport to applied version labels), release_note:fix, Team:obs-knowledge (Observability Experience Knowledge team), v8.19.2, v9.1.2, v9.2.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Lock Manager can enter 'deadlock' state when the initial setup fails

6 participants