Fix an assertion failure when waiting for recovery [release-7.1]#11398

Merged
jzhou77 merged 1 commit into apple:release-7.1 from jzhou77:release-7.1
May 14, 2024

@jzhou77
Contributor

@jzhou77 jzhou77 commented May 14, 2024

Cherry-pick of #11399.

The cluster controller's (CC's) checkBetterSingletons() calls getUsedIds(), which asserts that proxy interfaces are present. However, when a GRV or commit proxy fails, the proxy's processId becomes empty before the CC starts a new recovery, and this triggers the assertion failure.

The fix is to cancel the caller while waiting for recovery.
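
The race and the fix can be illustrated with a small model. This is only a sketch in Python asyncio, not FoundationDB's Flow code: `Proxy`, `get_used_ids`, `check_better_singletons`, and `run` are hypothetical stand-ins that mirror the names in the description, and the scheduling is simplified to a few cooperative yields.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Proxy:
    # Hypothetical stand-in for a proxy interface; None models an empty processId.
    process_id: Optional[str]


def get_used_ids(proxies: List[Proxy]) -> Set[str]:
    # Models the assertion: every proxy must have a process id.
    # An explicit raise is used so the check also fires under `python -O`.
    for p in proxies:
        if p.process_id is None:
            raise AssertionError("proxy processId is empty")
    return {p.process_id for p in proxies}


async def check_better_singletons(proxies: List[Proxy]) -> None:
    # Models the periodic caller: it re-checks the used ids on every wake-up.
    while True:
        get_used_ids(proxies)
        await asyncio.sleep(0)


async def run(cancel_checker_first: bool) -> str:
    proxies = [Proxy("p1"), Proxy("p2")]
    checker = asyncio.create_task(check_better_singletons(proxies))
    await asyncio.sleep(0)  # let the checker run once while all proxies are healthy

    if cancel_checker_first:
        # The modeled fix: cancel the caller before waiting for recovery.
        checker.cancel()

    # A proxy fails; recovery has not started yet, so its id is empty.
    proxies[0].process_id = None

    # Simulated wait for recovery; the checker gets a chance to run here.
    await asyncio.sleep(0)
    await asyncio.sleep(0)

    try:
        await checker
    except asyncio.CancelledError:
        return "cancelled"
    except AssertionError:
        return "assertion"
    return "finished"


if __name__ == "__main__":
    print(asyncio.run(run(cancel_checker_first=False)))
    print(asyncio.run(run(cancel_checker_first=True)))
```

In this model, leaving the checker running lets it observe the empty id and hit the assertion, while cancelling it first means it never runs against the half-failed state. How the real change cancels the caller inside the cluster controller is not shown on this page.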

To reproduce with a clang build of 7.1 commit 725a08a:

./fdbserver.6.0.15 -r simulation -f ./tests/restarting/from_5.2.0_until_6.3.0/ClientTransactionProfilingCorrectness-1.txt -s 900000399 -b on
-f ./tests/restarting/from_5.2.0_until_6.3.0/ClientTransactionProfilingCorrectness-2.txt --restarting -s 900000400 -b on

correctness: 20240514-184616-jzhou-8376a3b76500b2be

Code-Reviewer Section

The general pull request guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

  • The PR has a description, explaining both the problem and the solution.
  • The description mentions which forms of testing were done and the testing seems reasonable.
  • Every function/class/actor that was touched is reasonably well documented.

For Release-Branches

If this PR is made against a release-branch, please also check the following:

  • This change/bugfix is a cherry-pick from the next younger branch (younger release-branch or main if this is the youngest branch)
  • There is a good reason why this PR needs to go into a release branch and this reason is documented (either in the description above or in a linked GitHub issue)

@jzhou77 jzhou77 requested review from dlambrig, kakaiu and xy-54321 May 14, 2024 19:44
@foundationdb-ci
Contributor

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: 923ca8e
  • Duration 0:27:39
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: 923ca8e
  • Duration 0:36:00
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: 923ca8e
  • Duration 0:55:15
  • Result: ❌ FAILED
  • Error: Error while executing command: docker build --label "org.foundationdb.version=${FDB_VERSION}" --label "org.foundationdb.build_date=${BUILD_DATE}" --label "org.foundationdb.commit=${COMMIT_SHA}" --progress plain --build-arg FDB_VERSION="${FDB_VERSION}" --build-arg FDB_LIBRARY_VERSIONS="${FDB_VERSION}" --build-arg FDB_WEBSITE="${FDB_WEBSITE}" --tag foundationdb/ycsb:${FDB_VERSION}-${COMMIT_SHA}-debug --file Dockerfile.eks --target ycsb .. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: 923ca8e
  • Duration 1:12:04
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: 923ca8e
  • Duration 1:24:53
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@jzhou77 jzhou77 merged commit f9757d9 into apple:release-7.1 May 14, 2024