FE-864: Orchestrator improvements umbrella — brownfield feature delivery from spec #224

Open
kostandinang wants to merge 10 commits into ka/fe-878-brunch-serve from ka/fe-864-pi-timeout-600s

@kostandinang kostandinang commented Jun 16, 2026

Contributor

Stack Context

Stacks on FE-878 and stays under the FE-864 brownfield orchestration umbrella. This PR is no longer just the timeout tweak; it collects the operational improvements needed to make brunch serve usable on real brownfield runs.

What?

  • Raises the cook agent action timeout to 600s.
  • Improves the live serve/cook heartbeat and completion signals so long runs show useful progress.
  • Lazily provisions per-slice cook worktrees and shares node_modules instead of copying it into every slice.
  • Moves orchestration defaults to claude-opus-4-6 so plan/cook/chat paths do not fall back on retired or weaker defaults.

Why?

The spec 23 run exposed this as a broader reliability issue, not a single timeout problem: eager worktree seeding copied nearly a gigabyte of node_modules per slice before execution could start, and the architect model default caused a fallback plan that amplified the slice count. This PR makes the FE-864 branch an umbrella for those concrete orchestration improvements.

cursor Bot commented Jun 16, 2026

PR Summary

Medium Risk
Changes brownfield sandbox provisioning and shared node_modules semantics (possible cache contention across parallel slices) plus widespread model defaults; TUI and error-path behavior affect every cook/serve run.

Overview
Improves brownfield brunch serve / cook runs that were slow or opaque on real repos: slice sandboxes are provisioned lazily when a transition fires (only touched slices pay for worktrees), and node_modules is symlinked from the parent worktree instead of CoW-copied per slice. ensureSliceWorktree makes repeat fires and rework idempotent.

The live cook presenter gains an upfront epic→slice grid (run-shape, slice events), tool-call heartbeats (instrumented pi tools + latest agent line instead of raw KB counts), a cook-done signal for the brigade serve phase, and Ink changes (Static scrollback, figlet wordmark, global timer). Brigade taste now tracks epic verdict lines, not per-slice verify.

Cook CLI rejects unknown flags, raises agent timeout to 600s, and throws (instead of process.exit) on plan/sandbox/Petrinaut preflight so withCookBus can unmount Ink before errors print. Entry points parse/validate args before mounting the TUI where applicable.

Default LLM moves to claude-opus-4-6 on cook pi-actions, plan architect, interviewer, and secondary chat (docs updated).

Reviewed by Cursor Bugbot for commit 3e02bfb. Bugbot is set up for automated code reviews on this repo.

@kostandinang kostandinang changed the title from "FE-864: raise pi action timeout to 600s" to "FE-864: Raise pi-action timeout to 600s" Jun 16, 2026
@kostandinang kostandinang force-pushed the ka/fe-878-brunch-serve branch from 90bb5ef to 40a9d88 June 16, 2026 18:05
@kostandinang kostandinang force-pushed the ka/fe-864-pi-timeout-600s branch from 5b145b2 to 7d39fe7 June 16, 2026 18:05
@kostandinang kostandinang changed the title from "FE-864: Raise pi-action timeout to 600s" to "FE-864: Orchestrator improvements umbrella — brownfield feature delivery from spec" Jun 16, 2026
Comment thread src/orchestrator/src/epic-sandbox-merge.ts
Comment thread src/orchestrator/src/epic-sandbox-merge.ts
@kostandinang kostandinang force-pushed the ka/fe-864-pi-timeout-600s branch from a5bfc10 to ac4e47c June 16, 2026 23:45
@kostandinang kostandinang force-pushed the ka/fe-878-brunch-serve branch from 40a9d88 to 05b471a June 16, 2026 23:45
Comment thread src/orchestrator/src/pi-actions.ts Outdated
kostandinang and others added 10 commits June 17, 2026 00:52
Each cook agent action (write-tests, write-code, verify-epic) runs under a
per-action wall-clock budget enforced in pi-actions.ts. Raise the default
from 300s to 600s so Sonnet agents have headroom on larger slices and on
brownfield repos where setup/discovery eats into the turn.
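
A per-action wall-clock budget of this kind could be sketched as below. This is only an illustration: `withActionTimeout`, `ActionTimeoutError`, and `DEFAULT_ACTION_TIMEOUT_MS` are hypothetical names, and the actual enforcement in pi-actions.ts is not shown in this PR conversation.

```typescript
// Hypothetical sketch; not the code in pi-actions.ts.
const DEFAULT_ACTION_TIMEOUT_MS = 600_000; // 600s, raised from 300s

class ActionTimeoutError extends Error {
  constructor(action: string, ms: number) {
    super(`${action} exceeded its ${ms / 1000}s budget`);
    this.name = "ActionTimeoutError";
  }
}

async function withActionTimeout<T>(
  action: string,
  run: (signal: AbortSignal) => Promise<T>,
  ms: number = DEFAULT_ACTION_TIMEOUT_MS,
): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      controller.abort(); // let the action stop its own work if it listens
      reject(new ActionTimeoutError(action, ms));
    }, ms);
  });
  try {
    // Whichever settles first wins: the action or the budget.
    return await Promise.race([run(controller.signal), timeout]);
  } finally {
    clearTimeout(timer); // never leave a timer holding the process open
  }
}
```

Passing the `AbortSignal` into the action is a design choice of the sketch: it lets a timed-out action cancel in-flight work instead of continuing in the background.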

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…eam, clean failures

Iterating on the live TUI from real-terminal feedback:

- One global run timer in the footer instead of a per-item clock on every
  pending row (and whole-second, no jittery decimals).
- "brunch" wordmark is now a big lowercase figlet (Slant) in a warm orange
  gradient, replacing the egg.
- Activity log + wordmark stream through Ink <Static> so the full run lands in
  scrollback instead of collapsing in a redrawn bounded box; line cap removed.
- Brigade tracker no longer lights "taste" mid-cook — per-slice verify actions
  fire during cooking, so taste stays unlit until a real end-of-cook signal.
- Failures throw instead of process.exit, so withCookBus disposes (unmounts
  Ink) before the error prints — no more frozen "prep ◐" hang. cook validates
  args before mounting the TUI and rejects unknown flags (e.g. --spec-id).
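
The ordering that the last bullet relies on (dispose first, print second) can be sketched as follows. The `Bus` interface, `withBus`, and `main` are hypothetical stand-ins; the real `withCookBus` signature is not shown in this PR.

```typescript
// Hypothetical shapes, assumed for illustration only.
interface Bus {
  dispose(): void; // in the real TUI this would unmount Ink
}

async function withBus<T>(bus: Bus, body: () => Promise<T>): Promise<T> {
  try {
    return await body();
  } finally {
    bus.dispose(); // runs on success and on failure, before the caller sees the error
  }
}

async function main(
  bus: Bus,
  body: () => Promise<void>,
  log: (message: string) => void,
): Promise<number> {
  try {
    await withBus(bus, body);
    return 0;
  } catch (err) {
    // By now the TUI is already unmounted, so the message is not overdrawn.
    log(err instanceof Error ? err.message : String(err));
    return 1; // the entry point sets the exit code; nothing calls process.exit mid-run
  }
}
```

Calling `process.exit` inside `body` would skip the `finally` block, which is the failure mode the commit describes.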

check + presenter/cook/pi-actions tests green; full build deferred (active
graphite stack navigation).

Co-Authored-By: Claude <noreply@anthropic.com>
Wires the two remaining kitchen-brigade phases faithfully to the
orchestrator-arcs mapping (verify→taste, ship→serve):

- taste lights on the epic-verification verdict (action `epic <id> → …`), not
  on per-slice `verify <target>` lines — those fire mid-cook and previously lit
  taste while still cooking.
- serve lights on a new `cook-done` event emitted at the end of runCook (after
  promotion); a halted run never ships, so it never lights serve.
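
A minimal sketch of the two signals, assuming a simplified event shape. `CookEvent` and `phaseSignal` are hypothetical names, and the regular expression assumes the verdict line literally starts with `epic <id> → `; the presenter's real matching may differ.

```typescript
// Hypothetical event shape; only the two signals described above are modeled.
type CookEvent =
  | { kind: "action"; line: string }
  | { kind: "cook-done" };

function phaseSignal(event: CookEvent): "taste" | "serve" | undefined {
  if (event.kind === "cook-done") return "serve";
  // Assumed line format: "epic <id> → <verdict>".
  if (/^epic \S+ → /.test(event.line)) return "taste";
  // Per-slice "verify <target>" lines fire mid-cook and light nothing.
  return undefined;
}
```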

phase.test covers both signals + the full prep→cook→taste→plate→serve walk;
check + presenter/cook tests green.

Co-Authored-By: Claude <noreply@anthropic.com>
Each cook pi session was a black box in the pending panel — just a KB count.
runPi already subscribes to the session's text stream; instead of bytes,
surface the agent's latest non-empty line (tail-truncated, throttled every
2 KB) as the activity-progress detail, so a wait reads as live work ("agent
writing tests · …adds the RefreshToken guard") rather than "still going".
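
The "latest non-empty line, tail-truncated, throttled every 2 KB" behavior might look roughly like this. `createActivityTail`, the 60-character cap, and the 4096-character buffer bound are assumptions made for the sketch, not values taken from runPi.

```typescript
// Hypothetical sketch of a throttled activity tail.
const THROTTLE_BYTES = 2048; // emit at most once per 2 KB of streamed text
const MAX_DETAIL_CHARS = 60; // assumed display width
const MAX_BUFFER_CHARS = 4096; // assumed bound; only the tail of the stream matters

function createActivityTail(emit: (detail: string) => void): (chunk: string) => void {
  const encoder = new TextEncoder();
  let buffer = "";
  let bytesSinceEmit = 0;
  return (chunk: string): void => {
    buffer = (buffer + chunk).slice(-MAX_BUFFER_CHARS);
    bytesSinceEmit += encoder.encode(chunk).length;
    if (bytesSinceEmit < THROTTLE_BYTES) return;
    bytesSinceEmit = 0;
    const lines = buffer
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0);
    const last = lines[lines.length - 1];
    if (last === undefined) return;
    // Tail-truncate: keep the end of the line, which carries the newest words.
    emit(last.length > MAX_DETAIL_CHARS ? "…" + last.slice(-(MAX_DETAIL_CHARS - 1)) : last);
  };
}
```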

Kept headless createAgentSession — no pi InteractiveMode, no new pi API: pi's
tool-call events come via an extension hook (on('tool_call')), not the
subscribe stream, so a richer "editing <file> / running <tool>" heartbeat is a
separate follow-up that needs the extension-registration path verified.

check + pi-actions/presenter tests green.

Co-Authored-By: Claude <noreply@anthropic.com>
Richer "what the agent is doing" in the pending panel (the spike's Option A,
full tier): instead of only the agent's latest line, show the tool calls —
"edit src/auth/token.ts", "bash bun test", "grep RefreshToken".

pi exposes no tool-call hook on session.subscribe (text/lifecycle only), so
buildSessionOptions now supplies the built-in tools itself via customTools +
noTools:'builtin': each createXToolDefinition(cwd) is wrapped to emit a label
from its params, then delegates unchanged. The builders bake in the real
config (withFileMutationQueue, truncation defaults), so behavior is preserved
— confirmed in pi's edit.js. Observation is fail-safe (emit in try/catch).

toolLabel + instrumentToolDefinition are pure/unit-tested (label mapping;
wrap delegates same args + result; observer error can't break a tool call).
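
The wrap-and-delegate pattern described here can be sketched as below. The `ToolDefinition` interface is a simplified, hypothetical stand-in for pi's real tool definition type, and the parameter keys that `toolLabel` inspects (`path`, `command`, `pattern`) are assumptions; only the function names come from the commit message.

```typescript
// Simplified, hypothetical tool shape; pi's real definition type is not shown here.
interface ToolDefinition<P, R> {
  name: string;
  execute(params: P): Promise<R>;
}

// Build a short label such as "edit src/auth/token.ts" from the call's params.
function toolLabel(name: string, params: Record<string, unknown>): string {
  const target = params.path ?? params.command ?? params.pattern; // assumed keys
  return typeof target === "string" ? `${name} ${target}` : name;
}

function instrumentToolDefinition<P extends Record<string, unknown>, R>(
  tool: ToolDefinition<P, R>,
  observe: (label: string) => void,
): ToolDefinition<P, R> {
  return {
    ...tool,
    execute: async (params: P): Promise<R> => {
      try {
        observe(toolLabel(tool.name, params));
      } catch {
        // Observation is fail-safe: an observer error must not break the tool call.
      }
      return tool.execute(params); // delegate with the same args, unchanged
    },
  };
}
```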

Caveat: the customTools/noTools runtime wiring isn't covered by tests (they
stub createSession, bypassing buildSessionOptions) — needs a real cook run to
confirm the agent receives the instrumented tools and they emit live.

check + pi-actions tests green.

Co-Authored-By: Claude <noreply@anthropic.com>
Brownfield cook provisioned every slice's git worktree eagerly in
wireHandlers — N `git worktree add` + N recursive node_modules CoW copies
paid synchronously at startup before any slice fired.

- Move slice-worktree creation into resolveSliceCwd via idempotent
  ensureSliceWorktree, so a slice's worktree is materialized on first fire.
  A run touching 2 of 8 slices pays for 2 worktrees, not 8. Synchronous
  provisioning serializes concurrent fires on the JS thread, so parallel-policy
  worktree adds never overlap.
- Symlink each slice's node_modules to the parent worktree's single copy
  instead of CoW-copying per slice (SHAREABLE_TOP_LEVEL_ENTRIES). walkFiles
  already skips symlinks, so the shared tree is never re-walked during dep
  seeding, merge, or promotion. Other gitignored dirs still copy per slice.

Correctness-neutral: same worktrees/branches, just lazy; deps resolve through
the symlink. npm run verify green; adds symlink + idempotency unit tests.
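
A rough sketch of the lazy, idempotent provisioning plus the shared `node_modules` symlink, using only `node:fs` and `node:path`. `ensureSliceWorktreeSketch` and its `create` callback are hypothetical; the real `ensureSliceWorktree` would run `git worktree add` and is not reproduced here.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical sketch: `create` stands in for the real `git worktree add` step.
function ensureSliceWorktreeSketch(
  parentDir: string,
  sliceDir: string,
  create: (dir: string) => void,
): string {
  // Idempotent: repeat fires and rework find the directory and skip creation.
  if (!fs.existsSync(sliceDir)) create(sliceDir);

  const shared = path.join(parentDir, "node_modules");
  const link = path.join(sliceDir, "node_modules");
  // lstat (not stat) so an existing symlink is detected without following it.
  const existing = fs.lstatSync(link, { throwIfNoEntry: false });
  if (fs.existsSync(shared) && existing === undefined) {
    fs.symlinkSync(shared, link, "dir"); // one shared copy instead of a per-slice copy
  }
  return sliceDir;
}
```

Because every call is synchronous, two fires for the same slice cannot interleave between the existence check and the creation step, which matches the serialization argument in the commit message.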

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Replaces the coarse single-phase view with a live grid that reflects cook's
actual shape — the highest-value TUI improvement from the review, and it kills
the brittle string-matching.

- events: run-shape (seeds the grid from the plan, all slices queued) + slice
  (typed status running|passed|failed + step), emitted from cook-cli + the
  pi-actions handlers (write-tests/code/evaluate-done) — not string-matched logs.
- run-store: a slices grid grouped by epic; slice-keyed activity heartbeat
  (aligned via runPi activityId = slice id) attaches to the slice's row, so
  "what the agent is doing" shows inline; non-slice waits stay in the pending
  footer.
- ink: SliceGrid renders epic groups with per-slice status icons + the running
  slice's step/detail + spinner; replaces the flat pending list for slices.
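
The event-to-grid flow in the bullets above could be modeled along these lines. The `GridEvent` and `SliceRow` shapes and the `reduceGrid` name are assumptions for illustration; the actual run-store types are not shown in this PR conversation.

```typescript
// Hypothetical types; only the statuses named in the commit message are used.
type SliceStatus = "queued" | "running" | "passed" | "failed";

interface SliceRow {
  id: string;
  epicId: string;
  status: SliceStatus;
  step?: string;
}

type GridEvent =
  | { kind: "run-shape"; epics: { id: string; sliceIds: string[] }[] }
  | { kind: "slice"; sliceId: string; status: "running" | "passed" | "failed"; step?: string };

function reduceGrid(rows: SliceRow[], event: GridEvent): SliceRow[] {
  if (event.kind === "run-shape") {
    // Seed the grid from the plan: every slice starts queued, grouped by epic.
    const seeded: SliceRow[] = [];
    for (const epic of event.epics) {
      for (const id of epic.sliceIds) {
        seeded.push({ id, epicId: epic.id, status: "queued" });
      }
    }
    return seeded;
  }
  const { sliceId, status, step } = event;
  // Typed status update; latest event wins, so a re-running slice shows running again.
  return rows.map((row) =>
    row.id === sliceId ? { ...row, status, step: step ?? row.step } : row,
  );
}
```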

Retry counts deferred (a re-running slice just shows running again; latest
wins). Live wiring (run-shape/slice from a real cook) is manual-verify, like
the heartbeat. check + presenter/pi-actions/cook tests green (126).

Co-Authored-By: Claude <noreply@anthropic.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
@kostandinang kostandinang force-pushed the ka/fe-878-brunch-serve branch from 05b471a to 89e7850 June 16, 2026 23:55
@kostandinang kostandinang force-pushed the ka/fe-864-pi-timeout-600s branch from ac4e47c to 3e02bfb June 16, 2026 23:55

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

const resolveSliceCwd = (slice: Slice): string =>
  sliceLayout === 'shared'
    ? input.sandboxDir
    : seedSliceSandboxFromDeps(input.sandboxDir, plan, slice, { preserveExisting: true });

Grid stale during run-tests

Medium Severity

After write-code, the slice grid keeps the code step while the net’s deferred run-tests transition runs verification. Slice progress events were added only in pi-actions, not where mechanical runVerification runs, so the TUI misstates what the slice is doing until evaluate-done fires.

    // clear the live heartbeat once the slice stops running
    ...(running ? {} : { detail: undefined }),
  }),
});

Passed slices keep step text

Low Severity

When a slice moves to passed or failed, RunStore clears detail but leaves step set. The Ink grid still appends the old sub-action (e.g. verify) next to a checkmark, implying work is in progress after the slice finished.
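
One possible shape for a fix, sketched under assumptions: `SliceRowState` and `settleSlice` are hypothetical names, and the real RunStore update is only partially visible in the snippet above. The idea is to clear `step` together with `detail` whenever the slice is no longer running.

```typescript
// Hypothetical row shape, assumed for illustration.
interface SliceRowState {
  status: "queued" | "running" | "passed" | "failed";
  step?: string;
  detail?: string;
}

function settleSlice(
  row: SliceRowState,
  status: SliceRowState["status"],
  step?: string,
): SliceRowState {
  const running = status === "running";
  return {
    ...row,
    status,
    // Keep the sub-action and heartbeat only while the slice is actually running.
    step: running ? (step ?? row.step) : undefined,
    detail: running ? row.detail : undefined,
  };
}
```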
