
Harden API consumption log collection against continuation timeouts #41676

Merged
pelikhan merged 2 commits into main from copilot/deep-report-add-resilience-to-api-log-collection on Jun 26, 2026

Conversation

Copilot AI commented Jun 26, 2026

Contributor

The API consumption report could publish a partial 24h window when a continuation call timed out mid-collection, dropping already-fetched coverage and skewing day-over-day comparisons. This change hardens the workflow’s log-collection instructions so continuation failures degrade to a preserved partial dataset instead of silently losing progress.

  • Collection resilience

    • Treat /tmp/gh-aw/aw-mcp/logs/ as the authoritative collected dataset across pagination calls.
    • Keep already-downloaded run directories when a later continuation call fails, instead of treating the failed page as an all-or-nothing boundary.
  • Bounded continuation retries

    • Retry the same continuation request up to 2 additional times on transient failures (timeout, ECONNREFUSED, transport/tool errors).
    • Require short backoff between retries without mutating returned pagination parameters.
  • Pagination discipline

    • Re-read the returned continuation field after each successful follow-up call.
    • Continue to forbid inventing before_run_id or using ad hoc pagination/count tuning outside the MCP response.
  • Report clarity on degraded runs

    • Instruct the workflow to explicitly note when continuation retries were exhausted and the final dataset may still be partial.
- Treat the run directories already present under `/tmp/gh-aw/aw-mcp/logs/` as the authoritative collected dataset.
- If a continuation call times out ... retry that same continuation call up to **2** more times with short backoff.
- If the continuation call still fails after those bounded retries, stop collecting and proceed with the logs already downloaded.
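
The bounded-retry rule above can be sketched in Python as follows. This is a minimal illustration, not the workflow's actual implementation: the workflow is driven by markdown instructions, and `call_with_bounded_retries`, the `call` callable, and the `TRANSIENT_ERRORS` tuple are hypothetical names introduced here. The default delays mirror the "about 15s, then about 45s" backoff quoted in the review context.

```python
import time
from typing import Any, Callable, Mapping, Optional

# Assumption: which exceptions count as transient depends on the real tool client.
TRANSIENT_ERRORS = (TimeoutError, ConnectionRefusedError)


def call_with_bounded_retries(
    call: Callable[[Mapping[str, Any]], dict],
    params: Mapping[str, Any],
    backoffs: tuple = (15.0, 45.0),
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Try call(params) once, then once more per backoff entry.

    Returns the response on success, or None once retries are exhausted.
    The same params object is passed unchanged on every attempt.
    """
    attempts = 1 + len(backoffs)
    for attempt in range(attempts):
        try:
            return call(params)
        except TRANSIENT_ERRORS:
            if attempt == attempts - 1:
                # Exhausted: the caller proceeds with the logs already on disk.
                return None
            sleep(backoffs[attempt])
    return None
```

Returning `None` rather than raising keeps the "stop collecting and proceed" branch explicit at the call site.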

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add resilience to GitHub API consumption log collection" to "Harden API consumption log collection against continuation timeouts" Jun 26, 2026
Copilot AI requested a review from pelikhan June 26, 2026 12:22
@pelikhan pelikhan marked this pull request as ready for review June 26, 2026 12:24
Copilot AI review requested due to automatic review settings June 26, 2026 12:24
@pelikhan pelikhan merged commit 0fd971a into main Jun 26, 2026
@pelikhan pelikhan deleted the copilot/deep-report-add-resilience-to-api-log-collection branch June 26, 2026 12:24
Copilot stopped reviewing on behalf of pelikhan due to an error June 26, 2026 12:24

Copilot AI left a comment

Contributor

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Updates the API consumption report workflow instructions to make log pagination/continuation more resilient and to preserve partial results when continuation fails.

Changes:

  • Treat on-disk /tmp/gh-aw/aw-mcp/logs/ directories as the authoritative collected dataset during pagination.
  • Add bounded retries with backoff for transient continuation failures and clarify how to proceed with partial data.
  • Regenerate the workflow lock/metadata to reflect the updated markdown content.
| File | Description |
| --- | --- |
| .github/workflows/api-consumption-report.md | Refines log collection/pagination guidance (authoritative on-disk dataset + bounded retries). |
| .github/workflows/api-consumption-report.lock.yml | Updates generated metadata hashes corresponding to the markdown changes. |
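
To illustrate what "authoritative on-disk dataset" means in practice, here is a small Python sketch that derives the collected set from the filesystem instead of from any single tool response. It assumes each run is stored as one direct subdirectory of `/tmp/gh-aw/aw-mcp/logs/`; the PR does not spell out that layout, and `collected_run_dirs` is a hypothetical helper name.

```python
from pathlib import Path

LOGS_ROOT = Path("/tmp/gh-aw/aw-mcp/logs")


def collected_run_dirs(root: Path = LOGS_ROOT) -> list:
    """Return the names of run directories currently on disk, sorted.

    Assumption: one direct subdirectory per downloaded run.
    """
    if not root.is_dir():
        return []  # nothing collected yet
    return sorted(p.name for p in root.iterdir() if p.is_dir())
```

Because the result is read from disk on each call, a failed continuation attempt cannot shrink it: directories written by earlier successful calls are still listed.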

Review details

  • Files reviewed: 2/2 changed files
  • Comments generated: 2
  • Review effort level: Low

Comment on lines +72 to +73
- If `continuation` is present, make at most **2** additional continuation calls using the returned parameters. After each successful continuation call, re-read that response's `continuation` field before deciding whether to continue. Do **not** invent your own `before_run_id` from the earliest run in the batch.
- If a continuation call times out, returns `ECONNREFUSED`, or otherwise fails with a transient tool/transport error, retry that **same** continuation call up to **2** more times with short backoff (about 15s, then about 45s). Do **not** change the returned pagination parameters, timeout, or count while retrying.
Comment on lines +71 to +74
- Treat the run directories already present under `/tmp/gh-aw/aw-mcp/logs/` as the authoritative collected dataset. Successful continuation calls should add to that dataset; if a later continuation attempt fails, keep using the directories that are already on disk.
- If `continuation` is present, make at most **2** additional continuation calls using the returned parameters. After each successful continuation call, re-read that response's `continuation` field before deciding whether to continue. Do **not** invent your own `before_run_id` from the earliest run in the batch.
- If a continuation call times out, returns `ECONNREFUSED`, or otherwise fails with a transient tool/transport error, retry that **same** continuation call up to **2** more times with short backoff (about 15s, then about 45s). Do **not** change the returned pagination parameters, timeout, or count while retrying.
- If the continuation call still fails after those bounded retries, stop collecting, proceed with the logs already downloaded to `/tmp/gh-aw/aw-mcp/logs/`, and clearly note in the final discussion that continuation retries were exhausted and the dataset may be partial.
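
The pagination discipline in these instructions can be sketched as a small Python loop. This is an illustrative sketch under stated assumptions: the response is modelled as a mapping whose `continuation` field holds the parameters for the next call, and `fetch` is a hypothetical callable that returns `None` once its own retries are exhausted. Neither shape is confirmed by the PR.

```python
from typing import Any, Callable, Mapping, Optional, Tuple

MAX_CONTINUATIONS = 2  # "at most 2 additional continuation calls"


def follow_continuations(
    first_response: Mapping[str, Any],
    fetch: Callable[[Mapping[str, Any]], Optional[Mapping[str, Any]]],
) -> Tuple[int, bool]:
    """Follow server-provided continuations within the call budget.

    Returns (successful_continuation_calls, retries_exhausted).
    """
    response = first_response
    made = 0
    while made < MAX_CONTINUATIONS:
        # Re-read the continuation field from the most recent response.
        params = response.get("continuation")
        if not params:
            return made, False  # nothing more to fetch
        # Pass the returned parameters through untouched; never derive
        # before_run_id from the batch contents.
        next_response = fetch(params)
        if next_response is None:
            return made, True  # stop and report a possibly partial dataset
        made += 1
        response = next_response
    return made, False
```

When the second element is `True`, the final discussion should state that continuation retries were exhausted and the dataset may be partial.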
