fix(agent): follow-up review fixes for tinyagents parity (#4451–#4469)#4504
Conversation
…i#4451-tinyhumansai#4469) Addresses the 5 review-bot findings left open when tinyhumansai#4495 merged: - observability: preserve progress-event ordering under backpressure via a single ordered overflow forwarder (was per-event spawned send() that could land a ToolCallCompleted before its Started, leaving a tool row stuck "running") (tinyhumansai#4466). - observability: ModelCallCompleted now emits the carried cache-creation / reasoning token breakdown instead of the crate usage (0 for some providers) so per-call/Langfuse telemetry matches the cost tracker (tinyhumansai#4467). - routes/mod: fallback route ProviderModels share the bridge's provider usage carry, so a successful fallback call reports charged-USD/context/cache/ reasoning instead of a catalogue estimate (tinyhumansai#4467). - update_memory_md: add an fs2 cross-process advisory flock around the read-modify-write so cron subprocesses can't clobber MEMORY.md (the in-process mutex only serialized one process) (tinyhumansai#4458). - frontend: plumb the structured `failure` from ChatSubagentToolResultEvent into the subagent tool-call row so a failed child keeps its why/next copy live instead of losing it until a snapshot reload (tinyhumansai#4459). Claude-Session: https://claude.ai/code/session_019j5TLsRLHsM3kqAYFyH4hR
|
Warning Review limit reachedYou’ve reached a temporary PR review limit under our Fair Usage Limits Policy. Next review available in: 3 minutes Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available. How can I continue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews. How do review limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window. Please refer docs for additional details. Review details⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (9)
Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 812a507b3c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…4459) Asserts a failed subagent_tool_result stores the parsed structured failure (class/category/causePlain/nextAction) on the child tool-call row. Claude-Session: https://claude.ai/code/session_019j5TLsRLHsM3kqAYFyH4hR
…tinyhumansai#4459) The live path renders subagent.transcript (not the fallback toolCalls list) once a child has emitted a tool call, so resolveSubagentTranscriptTool must also carry the parsed failure — otherwise a failed child tool lost its why/next until a snapshot reload. Adds failure to the runtime + persisted transcript tool item, the resolver payload/reducer, and hydration; extends the test to assert the transcript path. Claude-Session: https://claude.ai/code/session_019j5TLsRLHsM3kqAYFyH4hR
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 718032968a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (result != null) item.result = result; | ||
| // Carry the structured why/next onto the rendered transcript item; a | ||
| // successful result clears any stale failure (#4459). | ||
| item.failure = success ? undefined : failure; |
There was a problem hiding this comment.
Render subagent failures instead of only storing them
When a sub-agent child tool fails in the normal transcript-present path, this only stores failure on the transcript item, but the renderers for those items (ToolTimelineBlock inline rows and SubagentDrawer rows) still display only the tool name/status/detail/timing and never read the new field. In that scenario the UI still shows just “Failed” and the why/next copy remains invisible, so this plumbing needs to be connected to the subagent row renderers as well.
Useful? React with 👍 / 👎.
| let file = std::fs::OpenOptions::new() | ||
| .create(true) | ||
| .write(true) | ||
| .truncate(false) | ||
| .open(&lock_path) |
There was a problem hiding this comment.
Reject symlinked workspace lock files
If a workspace already contains .memory-write.lock as a symlink (including a dangling symlink), this new OpenOptions call follows it with create(true).write(true) before any containment check. That lets a project-controlled path make update_memory_md create/open/lock a file outside the workspace, bypassing the symlink hardening applied to MEMORY.md/SKILL.md; the lock file should be opened with no-follow semantics or prevalidated as a regular file inside the canonical workspace.
Useful? React with 👍 / 👎.
Follow-up to #4495 — addresses the 5 review-bot findings that were left unresolved when it merged. All seam-side fixes on top of current
main.Fixes
send()task per event, so a latertry_sendcould land aToolCallCompletedbefore itsToolCallStartedand leave a tool row stuck "running". Now a single ordered overflow forwarder drains spilled events FIFO; once anything spills, everything queues until drained.ModelCallCompletednow emits the carried cache-creation/reasoning token breakdown (the crateusage.*is 0 for some providers) so per-call + Langfuse telemetry matches the cost tracker.build_route_modelsnow shares the bridge'sProviderUsageCarry, so a successful fallback-route call reports backend-charged USD / context / cache / reasoning instead of a catalogue estimate.fs2exclusiveflockaround the read-modify-write; the in-process mutex only serialized one process, so cron subprocesses could still clobber the file.failurefromChatSubagentToolResultEventinto the child tool-call row so a failed sub-agent tool keeps its why/next copy live instead of losing it until a snapshot reload.Testing
cargobuilds. Tests for the changed lines are being added in follow-up commits on this branch.Submission Checklist
Refs #4451–#4469 (tracker #4470).
https://claude.ai/code/session_019j5TLsRLHsM3kqAYFyH4hR