FE-864: Orchestrator improvements umbrella — brownfield feature delivery from spec#224
kostandinang wants to merge 10 commits into
PR Summary
Medium Risk
Overview
The live cook presenter gains an upfront epic→slice grid (… Cook CLI rejects unknown flags, raises agent timeout to 600s, and throws (instead of … Default LLM moves to …
Reviewed by Cursor Bugbot for commit 3e02bfb.
90bb5ef to 40a9d88
5b145b2 to 7d39fe7
a5bfc10 to ac4e47c
40a9d88 to 05b471a
Each cook agent action (write-tests, write-code, verify-epic) runs under a per-action wall-clock budget enforced in pi-actions.ts. Raise the default from 300s to 600s so Sonnet agents have headroom on larger slices and on brownfield repos where setup/discovery eats into the turn.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
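A minimal sketch of a per-action wall-clock budget, for readers unfamiliar with the pattern. Only the 600s default comes from the commit message; `withActionTimeout`, `ActionTimeoutError`, and `DEFAULT_ACTION_TIMEOUT_MS` are hypothetical names, and the real enforcement in pi-actions.ts may look quite different.

```typescript
// Hypothetical names; only the 300s -> 600s default is taken from the commit message.
const DEFAULT_ACTION_TIMEOUT_MS = 600 * 1000; // previously 300 * 1000

class ActionTimeoutError extends Error {
  constructor(action: string, ms: number) {
    super(`${action} exceeded its ${ms / 1000}s wall-clock budget`);
    this.name = "ActionTimeoutError";
  }
}

// Races the action against a deadline and always clears the timer afterwards.
function withActionTimeout<T>(
  action: string,
  work: Promise<T>,
  ms: number = DEFAULT_ACTION_TIMEOUT_MS,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new ActionTimeoutError(action, ms)), ms);
  });
  return Promise.race([work, deadline]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

Note that a race like this only stops waiting for the work; it does not cancel it. A real implementation would presumably also abort the agent session when the deadline fires.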
…eam, clean failures

Iterating on the live TUI from real-terminal feedback:

- One global run timer in the footer instead of a per-item clock on every pending row (and whole-second, no jittery decimals).
- "brunch" wordmark is now a big lowercase figlet (Slant) in a warm orange gradient, replacing the egg.
- Activity log + wordmark stream through Ink <Static> so the full run lands in scrollback instead of collapsing in a redrawn bounded box; line cap removed.
- Brigade tracker no longer lights "taste" mid-cook — per-slice verify actions fire during cooking, so taste stays unlit until a real end-of-cook signal.
- Failures throw instead of process.exit, so withCookBus disposes (unmounts Ink) before the error prints — no more frozen "prep ◐" hang. cook validates args before mounting the TUI and rejects unknown flags (e.g. --spec-id).

check + presenter/cook/pi-actions tests green; full build deferred (active graphite stack navigation).

Co-Authored-By: Claude <noreply@anthropic.com>
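The last bullet combines two ideas: validate arguments before anything is mounted, and throw rather than exit so cleanup still runs. A rough sketch under stated assumptions: `KNOWN_FLAGS`, `validateCookArgs`, and `runWithPresenter` are hypothetical, and the flag names are placeholders rather than the real cook CLI options.

```typescript
// Placeholder flag set; the real cook CLI options are not shown in this PR.
const KNOWN_FLAGS = new Set(["--spec", "--policy"]);

// Throws on any unrecognised --flag so the caller can fail before mounting a TUI.
function validateCookArgs(argv: readonly string[]): void {
  const unknown = argv.filter((arg) => {
    if (!arg.startsWith("--")) return false;
    const name = arg.split("=", 1)[0] ?? arg;
    return !KNOWN_FLAGS.has(name);
  });
  if (unknown.length > 0) {
    throw new Error(`unknown flag(s): ${unknown.join(", ")}`);
  }
}

// Because failures throw instead of exiting the process, `finally` gets a
// chance to dispose the presenter before the error reaches the top level.
function runWithPresenter<T>(mount: () => { dispose(): void }, body: () => T): T {
  const presenter = mount();
  try {
    return body();
  } finally {
    presenter.dispose();
  }
}
```

With a hard exit inside `body`, the `finally` block would never run, which matches the frozen-screen symptom the commit describes.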
Wires the two remaining kitchen-brigade phases faithfully to the orchestrator-arcs mapping (verify→taste, ship→serve):

- taste lights on the epic-verification verdict (action `epic <id> → …`), not on per-slice `verify <target>` lines — those fire mid-cook and previously lit taste while still cooking.
- serve lights on a new `cook-done` event emitted at the end of runCook (after promotion); a halted run never ships, so it never lights serve.

phase.test covers both signals + the full prep→cook→taste→plate→serve walk; check + presenter/cook tests green.

Co-Authored-By: Claude <noreply@anthropic.com>
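The two signals above could be mapped to phases roughly as follows. This is an illustrative sketch only: the `BusEvent` shape and `phaseSignal` are assumptions, while the `epic <id> → …` label format and the `cook-done` event name come from the commit message.

```typescript
type Phase = "prep" | "cook" | "taste" | "plate" | "serve";

// Hypothetical event shape; the real presenter bus types are not shown here.
type BusEvent = { kind: "action"; label: string } | { kind: "cook-done" };

// Returns the phase an event should light, or undefined if it lights nothing.
function phaseSignal(event: BusEvent): Phase | undefined {
  if (event.kind === "cook-done") return "serve";
  // Only the epic-level verdict counts; per-slice `verify <target>` lines do not.
  if (/^epic \S+ → /.test(event.label)) return "taste";
  return undefined;
}
```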
Each cook pi session was a black box in the pending panel — just a KB count.
runPi already subscribes to the session's text stream; instead of bytes,
surface the agent's latest non-empty line (tail-truncated, throttled every
2 KB) as the activity-progress detail, so a wait reads as live work ("agent
writing tests · …adds the RefreshToken guard") rather than "still going".
Kept headless createAgentSession — no pi InteractiveMode, no new pi API: pi's
tool-call events come via an extension hook (on('tool_call')), not the
subscribe stream, so a richer "editing <file> / running <tool>" heartbeat is a
separate follow-up that needs the extension-registration path verified.
check + pi-actions/presenter tests green.
Co-Authored-By: Claude <noreply@anthropic.com>
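A small sketch of the heartbeat described in this commit: keep the tail of the text stream, and every 2 KB surface the latest non-empty line with its head cut off. `createHeartbeat`, `latestNonEmptyLine`, and `tailTruncate` are hypothetical names, the 60-character cap is an assumption, and character counts stand in for bytes to keep the example short.

```typescript
// Keeps the end of a line, marking the cut with a leading ellipsis.
function tailTruncate(line: string, maxChars: number): string {
  if (line.length <= maxChars) return line;
  return "…" + line.slice(line.length - (maxChars - 1));
}

// Last line that still has content after trimming, if any.
function latestNonEmptyLine(text: string): string | undefined {
  const lines = text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  return lines[lines.length - 1];
}

// Returns a push function for stream chunks; emits at most once per `everyChars`.
function createHeartbeat(
  emit: (detail: string) => void,
  everyChars = 2048,
  maxChars = 60, // assumed display width, not from the source
): (chunk: string) => void {
  let tail = "";
  let sinceEmit = 0;
  return (chunk: string): void => {
    tail = (tail + chunk).slice(-4096); // only the recent tail is needed
    sinceEmit += chunk.length;
    if (sinceEmit < everyChars) return;
    sinceEmit = 0;
    const line = latestNonEmptyLine(tail);
    if (line !== undefined) emit(tailTruncate(line, maxChars));
  };
}
```

Throttling by volume rather than by time means a quiet agent produces no updates at all, so a presenter would likely still want its own spinner or timer.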
Richer "what the agent is doing" in the pending panel (the spike's Option A, full tier): instead of only the agent's latest line, show the tool calls — "edit src/auth/token.ts", "bash bun test", "grep RefreshToken".

pi exposes no tool-call hook on session.subscribe (text/lifecycle only), so buildSessionOptions now supplies the built-in tools itself via customTools + noTools:'builtin': each createXToolDefinition(cwd) is wrapped to emit a label from its params, then delegates unchanged. The builders bake in the real config (withFileMutationQueue, truncation defaults), so behavior is preserved — confirmed in pi's edit.js. Observation is fail-safe (emit in try/catch).

toolLabel + instrumentToolDefinition are pure/unit-tested (label mapping; wrap delegates same args + result; observer error can't break a tool call).

Caveat: the customTools/noTools runtime wiring isn't covered by tests (they stub createSession, bypassing buildSessionOptions) — needs a real cook run to confirm the agent receives the instrumented tools and they emit live.

check + pi-actions tests green.

Co-Authored-By: Claude <noreply@anthropic.com>
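The wrap-and-delegate idea can be sketched as below. The function names `toolLabel` and `instrumentToolDefinition` appear in the commit message, but their signatures here are guesses; `ToolDefinition` is a simplified stand-in for pi's real tool type, and the `path` / `command` / `pattern` parameter keys are assumptions.

```typescript
// Simplified stand-in; pi's actual tool definition type is not shown in this PR.
interface ToolDefinition<P, R> {
  name: string;
  execute(params: P): R;
}

// Builds a short label such as "edit src/auth/token.ts" from the call's params.
function toolLabel(name: string, params: Record<string, unknown>): string {
  const target = params["path"] ?? params["command"] ?? params["pattern"];
  return typeof target === "string" ? `${name} ${target}` : name;
}

// Wraps a tool so each call is reported, then delegates with the same params.
function instrumentToolDefinition<P extends Record<string, unknown>, R>(
  tool: ToolDefinition<P, R>,
  observe: (label: string) => void,
): ToolDefinition<P, R> {
  return {
    ...tool,
    execute(params: P): R {
      try {
        observe(toolLabel(tool.name, params));
      } catch {
        // Fail-safe: an observer error must never break the tool call.
      }
      return tool.execute(params);
    },
  };
}
```

Keeping the wrapper this thin is what lets the original builder's configuration pass through untouched: the wrapped tool is the same object with one method intercepted.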
Brownfield cook provisioned every slice's git worktree eagerly in wireHandlers — N `git worktree add` + N recursive node_modules CoW copies paid synchronously at startup before any slice fired.

- Move slice-worktree creation into resolveSliceCwd via idempotent ensureSliceWorktree, so a slice's worktree is materialized on first fire. A run touching 2 of 8 slices pays for 2 worktrees, not 8. Synchronous provisioning serializes concurrent fires on the JS thread, so parallel-policy worktree adds never overlap.
- Symlink each slice's node_modules to the parent worktree's single copy instead of CoW-copying per slice (SHAREABLE_TOP_LEVEL_ENTRIES). walkFiles already skips symlinks, so the shared tree is never re-walked during dep seeding, merge, or promotion. Other gitignored dirs still copy per slice.

Correctness-neutral: same worktrees/branches, just lazy; deps resolve through the symlink. npm run verify green; adds symlink + idempotency unit tests.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
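The lazy, idempotent provisioning in the first bullet amounts to memoizing a synchronous setup step per slice. A sketch under that assumption: `createLazyWorktrees` and the `Worktree` shape are hypothetical, and `provision` stands in for the real `git worktree add` plus symlink step, which is not reproduced here.

```typescript
// Hypothetical result shape for a provisioned slice worktree.
interface Worktree {
  sliceId: string;
  dir: string;
}

// Returns an ensureSliceWorktree function that provisions each slice at most once.
function createLazyWorktrees(
  provision: (sliceId: string) => Worktree,
): (sliceId: string) => Worktree {
  const ready = new Map<string, Worktree>();
  return function ensureSliceWorktree(sliceId: string): Worktree {
    const existing = ready.get(sliceId);
    if (existing !== undefined) return existing;
    // Synchronous on purpose: with no await between the lookup and the set,
    // two fires for the same slice cannot both reach the provision step.
    const created = provision(sliceId);
    ready.set(sliceId, created);
    return created;
  };
}
```

If provisioning ever became asynchronous, the map would need to hold the in-flight promise rather than the finished worktree to keep the same guarantee.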
Co-authored-by: Cursor <cursoragent@cursor.com>
Replaces the coarse single-phase view with a live grid that reflects cook's actual shape — the highest-value TUI improvement from the review, and it kills the brittle string-matching.

- events: run-shape (seeds the grid from the plan, all slices queued) + slice (typed status running|passed|failed + step), emitted from cook-cli + the pi-actions handlers (write-tests/code/evaluate-done) — not string-matched logs.
- run-store: a slices grid grouped by epic; slice-keyed activity heartbeat (aligned via runPi activityId = slice id) attaches to the slice's row, so "what the agent is doing" shows inline; non-slice waits stay in the pending footer.
- ink: SliceGrid renders epic groups with per-slice status icons + the running slice's step/detail + spinner; replaces the flat pending list for slices.

Retry counts deferred (a re-running slice just shows running again; latest wins). Live wiring (run-shape/slice from a real cook) is manual-verify, like the heartbeat.

check + presenter/pi-actions/cook tests green (126).

Co-Authored-By: Claude <noreply@anthropic.com>
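One way to picture the typed events and the epic grouping is a small reducer. The event names `run-shape` and `slice` and the status values come from the commit message; the field names, `applyGridEvent`, and `groupByEpic` are assumptions made for illustration and are not the actual RunStore code.

```typescript
type SliceStatus = "queued" | "running" | "passed" | "failed";

interface SliceRow {
  id: string;
  epicId: string;
  status: SliceStatus;
  step: string | undefined;
}

// Hypothetical payloads for the two events named in the commit.
type GridEvent =
  | { type: "run-shape"; slices: { id: string; epicId: string }[] }
  | { type: "slice"; id: string; status: "running" | "passed" | "failed"; step?: string };

// run-shape seeds every slice as queued; slice updates one row (latest wins).
function applyGridEvent(rows: SliceRow[], event: GridEvent): SliceRow[] {
  if (event.type === "run-shape") {
    return event.slices.map((s) => ({ ...s, status: "queued" as const, step: undefined }));
  }
  const { id, status, step } = event;
  return rows.map((row) => (row.id === id ? { ...row, status, step } : row));
}

// Groups rows by epic, preserving plan order within each group.
function groupByEpic(rows: SliceRow[]): Map<string, SliceRow[]> {
  const groups = new Map<string, SliceRow[]>();
  for (const row of rows) {
    const group = groups.get(row.epicId) ?? [];
    group.push(row);
    groups.set(row.epicId, group);
  }
  return groups;
}
```

Because each update carries a typed status, the presenter no longer has to infer state from log text, which is the brittleness the commit set out to remove.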
Co-authored-by: Cursor <cursoragent@cursor.com>
05b471a to 89e7850
ac4e47c to 3e02bfb
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit 3e02bfb. Configure here.
const resolveSliceCwd = (slice: Slice): string =>
  sliceLayout === 'shared'
    ? input.sandboxDir
    : seedSliceSandboxFromDeps(input.sandboxDir, plan, slice, { preserveExisting: true });
Grid stale during run-tests
Medium Severity
After write-code, the slice grid keeps the code step while the net’s deferred run-tests transition runs verification. Slice progress events were added only in pi-actions, not where mechanical runVerification runs, so the TUI misstates what the slice is doing until evaluate-done fires.
// clear the live heartbeat once the slice stops running
...(running ? {} : { detail: undefined }),
}),
});
Passed slices keep step text
Low Severity
When a slice moves to passed or failed, RunStore clears detail but leaves step set. The Ink grid still appends the old sub-action (e.g. verify) next to a checkmark, implying work is in progress after the slice finished.
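One possible shape for a fix, offered as a sketch rather than the change the author made: when a slice leaves the running state, clear `step` together with `detail`. `Row` and `settleRow` are hypothetical and only mirror the fields named in the finding.

```typescript
type Status = "queued" | "running" | "passed" | "failed";

// Hypothetical row shape with just the fields the finding mentions.
interface Row {
  status: Status;
  step: string | undefined;
  detail: string | undefined;
}

// Applies a status change; step and detail survive only while still running.
function settleRow(row: Row, status: Status, step?: string): Row {
  const running = status === "running";
  return {
    ...row,
    status,
    step: running ? (step ?? row.step) : undefined,
    detail: running ? row.detail : undefined,
  };
}
```

An alternative would be to leave the store alone and have the grid ignore `step` for non-running rows. Which side owns that rule is a design choice this sketch does not settle.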



Stack Context
Stacks on FE-878 and stays under the FE-864 brownfield orchestration umbrella. This PR is no longer just the timeout tweak; it collects the operational improvements needed to make `brunch serve` usable on real brownfield runs.

What?
- … `node_modules` instead of copying it into every slice.
- … `claude-opus-4-6` so plan/cook/chat paths do not fall back on retired or weaker defaults.

Why?
The spec 23 run exposed this as a broader reliability issue, not a single timeout problem: eager worktree seeding copied nearly a gigabyte of `node_modules` per slice before execution could start, and the architect model default caused a fallback plan that amplified the slice count. This PR makes the FE-864 branch an umbrella for those concrete orchestration improvements.