[aw-failures] Two daily workflows fail at agent start: Documentation Healer (effort-param 400) & Model Inventory Checker (BYOK auth) #37039

@github-actions

Recommendation

Two daily scheduled workflows fail at agent start, before any work, due to engine/model configuration — fix both: drop the unsupported effort parameter for Documentation Healer, and restore BYOK auth token plumbing for Model Inventory Checker. Neither is a network/firewall issue (audit-diff shows 0 new domains, 0 anomalies vs the last green run).

This sub-issue root-causes the shallow auto-notifier issues #37010 and #37014.

Cluster A — Daily Documentation Healer: effort parameter rejected (fresh regression)

Fix: remove or guard the effort parameter for the Claude small-agent model (or select a model that supports effort).

  • Affected run: 26986947133 (schedule, 2026-06-05T00:01Z) · notifier issue [aw] Daily Documentation Healer failed #37010
  • Regression signal: 7 consecutive prior days of success, first failure today. Run history, newest first: failure, success, success, success, success, success, success, success.
  • Dominant error (all 4 attempts, ~0s each):
API Error: 400 This model does not support the effort parameter.
  • Engine = Claude Code, model = small-agent. The --continue retry path then hit Error: No deferred tool marker found in the resumed session (isNoDeferredMarkerError=true) and --continue was disabled permanently. Agent produced 0 real turns / 0 tokens.
  • audit-diff (base success [26921451398] vs failure 26986947133): new_domain_count=0, anomaly_count=0, run2_turns=1 — regression is isolated to the agent engine invocation, not networking.
  • Failure class: config-error.
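
One way to "guard" the effort parameter is to attach it only when the selected model is known to accept it. The sketch below is illustrative, not the engine's actual configuration code: `build_engine_options`, the `MODELS_SUPPORTING_EFFORT` allowlist, and the `large-agent` alias are all hypothetical names, and which models really accept `effort` would need to be confirmed against the provider.

```python
from typing import Any, Dict, FrozenSet, Optional

# Hypothetical allowlist. The failing run only tells us that "small-agent"
# rejects `effort`; "large-agent" is a placeholder alias for illustration.
MODELS_SUPPORTING_EFFORT: FrozenSet[str] = frozenset({"large-agent"})


def build_engine_options(model: str, effort: Optional[str]) -> Dict[str, Any]:
    """Build engine options, dropping `effort` for models not on the allowlist."""
    options: Dict[str, Any] = {"model": model}
    # Only forward `effort` when it was requested AND the model is known to take it.
    if effort is not None and model in MODELS_SUPPORTING_EFFORT:
        options["effort"] = effort
    return options
```

With this guard, `build_engine_options("small-agent", "high")` yields options without an `effort` key, so the request would no longer carry the parameter that triggered the 400.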

Cluster B — Daily Model Inventory Checker: BYOK auth missing (persistent regression)

Fix: restore BYOK token plumbing so the Copilot SDK driver receives valid auth (COPILOT_GITHUB_TOKEN / GH_TOKEN / GITHUB_TOKEN present in the agent env).

[Error: Execution failed: Error: Session was not created with authentication info or custom provider]
  • The Copilot SDK BYOK driver sample (.github/drivers/copilot_sdk_driver_sample_node.cjs) threw an uncaught promise rejection on attempt 1. Harness flagged isAuthError=true: "no authentication information found — not retrying (COPILOT_GITHUB_TOKEN, GH_TOKEN, and GITHUB_TOKEN are all absent or invalid)". The entrypoint unset COPILOT_GITHUB_TOKEN and COPILOT_PROVIDER_API_KEY just before the driver spawned.
  • All upstream collect_* setup jobs succeeded; failure isolated to the agent driver; detection/safe_outputs skipped. 0 tokens.
  • Failure class: config-error.
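
A preflight check just before the driver spawns would surface this failure with a clearer message. The following is a minimal sketch, assuming the three variable names quoted in the harness message are the only accepted sources of auth; `find_auth_token_var` and `require_auth` are hypothetical helpers, and the check covers presence only, not token validity.

```python
import os
from typing import Mapping, Optional

# Variable names taken from the harness message quoted above.
AUTH_ENV_VARS = ("COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN")


def find_auth_token_var(env: Mapping[str, str]) -> Optional[str]:
    """Return the name of the first auth variable that is set and non-blank."""
    for name in AUTH_ENV_VARS:
        if env.get(name, "").strip():
            return name
    return None


def require_auth(env: Mapping[str, str] = os.environ) -> str:
    """Fail fast, before spawning the driver, if no auth variable is present."""
    name = find_auth_token_var(env)
    if name is None:
        raise RuntimeError(
            "no auth token in agent env; expected one of: " + ", ".join(AUTH_ENV_VARS)
        )
    return name
```

Running such a check after the entrypoint's unset step would show directly whether the driver is being started with an empty auth environment.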

Affected workflows and run IDs

| Cluster | Workflow | Run | Notifier issue | Failure class |
| --- | --- | --- | --- | --- |
| A | Daily Documentation Healer | 26986947133 | #37010 | config-error (effort param) |
| B | Daily Model Inventory Checker | 26987484745 | #37014 | config-error (BYOK auth) |

Success criteria / verification

  • Cluster A: Documentation Healer agent phase executes (>0 turns) and the run completes success; no 400 ... effort parameter error.
  • Cluster B: Model Inventory Checker driver creates a session; no Session was not created with authentication info error; auth tokens present in the agent env.
  • Both workflows green for at least 2 consecutive scheduled runs.
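
The "2 consecutive scheduled runs" criterion can be checked mechanically from a list of run conclusions. This is a sketch under the assumption that conclusions are available as plain strings ordered newest first; `leading_streak` and `is_verified` are hypothetical helper names, and fetching the run history is left out.

```python
from typing import Sequence


def leading_streak(conclusions: Sequence[str], value: str) -> int:
    """Count how many of the most recent runs (newest first) have `value`."""
    count = 0
    for conclusion in conclusions:
        if conclusion != value:
            break
        count += 1
    return count


def is_verified(conclusions: Sequence[str], required: int = 2) -> bool:
    """True once the newest `required` runs all concluded with success."""
    return leading_streak(conclusions, "success") >= required


# Documentation Healer history as reported in Cluster A, newest first.
healer_history = ["failure"] + ["success"] * 7
```

For the Cluster A history above, `is_verified(healer_history)` is false, and it becomes true only after two new successful runs are prepended to the list.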

Parent: #37005 · Root-causes #37010, #37014 · Analyzed run IDs: 26986947133, 26987484745, 26921451398 · Window: last 6h ending ~2026-06-05T01:34Z.
Related to #37005

Generated by 🔍 [aw] Failure Investigator (6h) · opus48 20.2M · 1.3K AIC

  • expires on Jun 12, 2026, 2:35 AM UTC
