feat(chat): render tool-call failure explanations in View-processing timeline (#4254, frontend)#4413
Conversation
📝 WalkthroughWalkthroughThis PR adds tool-failure classification and transport plumbing in Rust, carries the failure payload into the chat runtime, and renders localized “why” and “next” explanations in the conversation UI across all supported locales. ChangesTool failure classification feature
Estimated code review effort: 4 (Complex) | ~60 minutes Possibly related issues
Sequence Diagram(s)sequenceDiagram
participant AgentToolExec
participant tool_status
participant progress_bridge
participant ChatRuntimeProvider
participant chatRuntimeSlice
participant ProcessingTranscriptView
AgentToolExec->>tool_status: classify(error_text, timed_out)
tool_status-->>AgentToolExec: ClassifiedFailure
AgentToolExec->>progress_bridge: ToolExecutionCompleted{failure}
progress_bridge->>ChatRuntimeProvider: tool_result{failure JSON}
ChatRuntimeProvider->>chatRuntimeSlice: parseToolFailure(raw)
chatRuntimeSlice-->>ProcessingTranscriptView: ToolTimelineEntry{failure}
ProcessingTranscriptView->>ProcessingTranscriptView: render why/next
Suggested labels: Suggested reviewers: Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e5d2af2821
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| subagent: entry.subagent ? subagentActivityFromPersisted(entry.subagent) : undefined, | ||
| // Carry a persisted failure explanation across the round-trip (#4254). The | ||
| // shared parser tolerates both camelCase (persisted) and snake_case (wire). | ||
| failure: parseToolFailure(entry.failure), |
There was a problem hiding this comment.
Persist tool failures in turn-state snapshots
When a failed tool row is rehydrated from openhuman.threads_turn_state_get after a reload or thread switch, this parser never receives a value because the Rust snapshot type still has no failure field on src/openhuman/threads/turn_state/types.rs::ToolTimelineEntry, and the mirror handles AgentProgress::ToolCallCompleted { call_id, success, .. } without storing the new failure payload. The live socket path shows the explanation, but settled/reloaded turns drop the new “why / what to do next” text; add the failure field to the Rust turn-state row and set/clear it in the mirror on completion.
Useful? React with 👍 / 👎.
|
Fixed the two red lanes: Rust Quality (fmt, clippy) — the failure was Frontend Checks — the failure was Verified locally, all green: |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/openhuman/channels/providers/web/progress_bridge.rs (1)
411-416: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueDuplicate failure-serialization one-liner.
The
failure.as_ref().and_then(|f| serde_json::to_value(f).ok())pattern is duplicated verbatim inagent_tool_exec.rs. Consider extracting a small shared helper (e.g.ClassifiedFailure::to_json(&self) -> serde_json::Valueor a free function) to avoid drift between the two call sites, and optionally log a warning on serialization failure instead of silently dropping it.♻️ Example consolidation
-let failure_json = failure.as_ref().and_then(|f| serde_json::to_value(f).ok()); +let failure_json = failure.as_ref().and_then(|f| { + serde_json::to_value(f) + .map_err(|e| log::warn!("failed to serialize ClassifiedFailure: {e}")) + .ok() +});🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/channels/providers/web/progress_bridge.rs` around lines 411 - 416, The failure-to-JSON conversion logic is duplicated in `progress_bridge` and `agent_tool_exec`, so extract a shared helper on `ClassifiedFailure` or a small free function to centralize `failure.as_ref().and_then(|f| serde_json::to_value(f).ok())`. Update both call sites to use that helper so the serialization behavior stays consistent and doesn’t drift. If serialization still fails, consider logging a warning at the helper boundary instead of silently discarding the error.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@app/src/lib/i18n/zh-CN.ts`:
- Around line 20-21: The blocked-by-policy message uses a mismatched settings
label in the zh-CN i18n entries, so update the
`conversations.toolFailure.blockedByPolicy.next` translation to reference the
existing `settings.agentAccess.title` wording used elsewhere in this file. Keep
the rest of the sentence intact and replace `代理访问` with the same translated
label users will see in the UI (`智能体系统访问`) so the navigation hint matches the
actual settings title.
---
Nitpick comments:
In `@src/openhuman/channels/providers/web/progress_bridge.rs`:
- Around line 411-416: The failure-to-JSON conversion logic is duplicated in
`progress_bridge` and `agent_tool_exec`, so extract a shared helper on
`ClassifiedFailure` or a small free function to centralize
`failure.as_ref().and_then(|f| serde_json::to_value(f).ok())`. Update both call
sites to use that helper so the serialization behavior stays consistent and
doesn’t drift. If serialization still fails, consider logging a warning at the
helper boundary instead of silently discarding the error.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e18ca717-4eb6-4a28-a432-c0c443a4804d
📒 Files selected for processing (41)
app/src/lib/i18n/ar.tsapp/src/lib/i18n/bn.tsapp/src/lib/i18n/de.tsapp/src/lib/i18n/en.tsapp/src/lib/i18n/es.tsapp/src/lib/i18n/fr.tsapp/src/lib/i18n/hi.tsapp/src/lib/i18n/id.tsapp/src/lib/i18n/it.tsapp/src/lib/i18n/ko.tsapp/src/lib/i18n/pl.tsapp/src/lib/i18n/pt.tsapp/src/lib/i18n/ru.tsapp/src/lib/i18n/zh-CN.tsapp/src/pages/conversations/components/ProcessingTranscriptView.test.tsxapp/src/pages/conversations/components/ProcessingTranscriptView.tsxapp/src/providers/ChatRuntimeProvider.tsxapp/src/services/chatService.tsapp/src/store/chatRuntimeSlice.toolFailure.test.tsapp/src/store/chatRuntimeSlice.tsapp/src/types/turnState.tssrc/bin/harness_subagent_audit.rssrc/core/event_bus/events.rssrc/core/event_bus/events_tests.rssrc/core/socketio.rssrc/openhuman/agent/dispatcher.rssrc/openhuman/agent/dispatcher_tests.rssrc/openhuman/agent/harness/engine/progress.rssrc/openhuman/agent/harness/session/agent_tool_exec.rssrc/openhuman/agent/harness/tool_result_artifacts/mod.rssrc/openhuman/agent/progress.rssrc/openhuman/agent/progress_tracing/tests.rssrc/openhuman/agent/tests.rssrc/openhuman/channels/providers/web/progress_bridge.rssrc/openhuman/mod.rssrc/openhuman/runtime_node/ops.rssrc/openhuman/threads/turn_state/mirror_tests.rssrc/openhuman/tinyagents/observability.rssrc/openhuman/tool_status/mod.rssrc/openhuman/tool_status/ops.rssrc/openhuman/tool_status/types.rs
| 'conversations.toolFailure.blockedByPolicy.cause': '此操作已被你的安全设置阻止。', | ||
| 'conversations.toolFailure.blockedByPolicy.next': '如果希望执行,请在“设置 → 代理访问”中允许它。', |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Use the existing settings label here.
settings.agentAccess.title is translated elsewhere in this file as 智能体系统访问, so 代理访问 will send users to a label they cannot match in the UI.
♻️ Proposed fix
- 'conversations.toolFailure.blockedByPolicy.next': '如果希望执行,请在“设置 → 代理访问”中允许它。',
+ 'conversations.toolFailure.blockedByPolicy.next': '如果希望执行,请在“设置 → 智能体系统访问”中允许它。',📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| 'conversations.toolFailure.blockedByPolicy.cause': '此操作已被你的安全设置阻止。', | |
| 'conversations.toolFailure.blockedByPolicy.next': '如果希望执行,请在“设置 → 代理访问”中允许它。', | |
| 'conversations.toolFailure.blockedByPolicy.cause': '此操作已被你的安全设置阻止。', | |
| 'conversations.toolFailure.blockedByPolicy.next': '如果希望执行,请在“设置 → 智能体系统访问”中允许它。', |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@app/src/lib/i18n/zh-CN.ts` around lines 20 - 21, The blocked-by-policy
message uses a mismatched settings label in the zh-CN i18n entries, so update
the `conversations.toolFailure.blockedByPolicy.next` translation to reference
the existing `settings.agentAccess.title` wording used elsewhere in this file.
Keep the rest of the sentence intact and replace `代理访问` with the same translated
label users will see in the UI (`智能体系统访问`) so the navigation hint matches the
actual settings title.
|
Follow-up: the Rust Quality (clippy) lane was still red after the fmt push — a real compile error the PR introduced, not covered by my formatting commit:
The #4254 change added the required
These are non-tool-failure channel events, so the correct value is Verified locally green: (My earlier local clippy pass was masked by stale incremental build artifacts; caught it from the CI log and reproduced with a clean rebuild.) |
|
Also checked the two remaining PR Submission Checklist boxes — both were already written as honest |
|
Heads-up: this PR can't be cleanly rebased onto post-#4399 |
|
Split by #4399 (harness migration). This PR's backend — the tool-failure classification hook — patched The frontend + i18n half of this PR does NOT conflict with #4399 and rebases cleanly onto current |
…ent (tinyhumansai#4254, backend) Rework of tinyhumansai#4413's backend onto post-tinyhumansai#4399 code. tinyhumansai#4399 deleted the original classification hook (session/agent_tool_exec.rs) and the ProgressReporter trait; the tool loop now runs on the tinyagents crate, so the hook is re-homed on the adapter seam. - New `src/openhuman/tool_status/` module: `classify(error_text, timed_out)` and `describe(class)` map a tool outcome to a `ClassifiedFailure` (class + category + plain-language cause + next action). Pure, unit-tested (16 tests: all classes, timeout/unknown/precedence, category mapping, recoverability). - `AgentProgress::ToolCallCompleted` + `WebChannelEvent` gain a `failure` field carrying that classification to the chat "View processing" timeline. - Core hook + latent-bug fix: the crate's `AgentEvent::ToolCompleted` carries only call_id + tool_name, so the bridge previously hardcoded `success: true` (a failed tool never surfaced). `ToolOutcomeCaptureMiddleware::after_tool` now classifies each result and writes a `call_id → (success, failure)` side-channel (mirroring the existing `ToolNameMap`); the bridge reads it to emit real `success` + `failure`. Absent entry falls back to `(true, None)`. - `progress_bridge` serializes `failure` onto the `tool_result` socket event and the run ledger. Backend only. The frontend consumer (chat timeline "why / next" render) and i18n keys are a companion — they rebase cleanly onto current `main` from tinyhumansai#4413 (only the Rust hook conflicted with tinyhumansai#4399), so they are not re-included here to avoid reverting unrelated locale drift. NOTE: the middleware-before-event ordering that populates the side-channel is crate-internal; the fallback keeps behavior no worse than before if the event is ever projected first. Needs runtime validation once tinyhumansai#4442 lands (drive a deliberately-failing tool and confirm `success:false` + `failure` on the wire).
…ed in tinyhumansai#4445 tinyhumansai#4413 originally shipped both the tool-failure classification backend and the chat "View processing" render. The backend conflicted with tinyhumansai#4399 (harness migration) and has been reworked separately in tinyhumansai#4445 (tool_status module + ToolOutcomeCaptureMiddleware + WebChannelEvent.failure). This reduces tinyhumansai#4413 to just the frontend that consumes tinyhumansai#4445's `failure` field off the tool_result socket event, rebased cleanly onto current main. Kept (21 files): - chat timeline render: ProcessingTranscriptView (+test), ChatRuntimeProvider, chatService, chatRuntimeSlice (+toolFailure test), turnState. - i18n: conversations.toolFailure.* keys across all 14 locales (3-way unioned onto current main so no existing translations are reverted). Dropped (now owned by tinyhumansai#4445): tool_status/, agent progress/observability, middleware, event_bus, socketio failure plumbing, and the deleted-file conflict on agent_tool_exec.rs. Verified: pnpm typecheck clean, i18n:check parity 0 missing/0 extra, 13 frontend tests pass (chatRuntimeSlice.toolFailure + ProcessingTranscriptView). Stacked on tinyhumansai#4445 (merge that first — it emits the field this renders).
d2ef245 to
8e807f5
Compare
… failure in snapshot (tinyhumansai#4459) Repairs tinyhumansai#4413 tool-failure classification. A/B/E: the classifier now short-circuits POLICY_BLOCKED_MARKER and POLICY_DENIED_MARKER *before* the `timed out` sniff, so a user deny → `Denied` and an approval-TTL expiry → `ApprovalExpired` (both `FailureCategory::UserDeclined`, non-retryable, honest copy) instead of a recoverable Unknown "try again" or a Timeout "will retry". Middleware now sniffs both `error` and `content`, and the dead `"policy denied"` needle is removed. C: `ToolTimelineEntry` gains `failure: Option<PersistedToolFailure>` (camelCase, matching the TS `PersistedToolFailure` contract) threaded through `mirror.rs`, so a failed tool's explanation survives thread switch / restart. D: `SubagentToolCallCompleted` gains a `failure` field populated from the already-computed classification (observability.rs) and forwarded on the web event + persisted `SubagentToolCall`, so failed sub-agent rows carry an explanation. Frontend: TS types (`PersistedSubagentToolCall.failure`, `SubagentToolCallEntry.failure`, slice mapping), the localized-failure-class set gains `denied`/`approvalExpired`, and i18n keys land in en.ts + all locales. Tests deferred to the final pass. Claude-Session: https://claude.ai/code/session_019j5TLsRLHsM3kqAYFyH4hR
Summary
The chat "View processing" timeline render for tool-call failures — the frontend half of #4254. It reads the structured
failure(class + plain-language cause + next action) off thetool_resultsocket event and shows a "why / what to do next" pair instead of a raw error string.Scope change (was full-stack → now frontend-only)
This PR originally shipped the backend classifier too, but that conflicted with #4399 (the TinyAgents harness migration) and has been reworked separately in #4445 (
tool_statusmodule +ToolOutcomeCaptureMiddleware+WebChannelEvent.failure). This PR is now reduced to just the frontend consumer, rebased cleanly onto currentmain.What's here (21 files)
ProcessingTranscriptView(+test),ChatRuntimeProvider,chatService,chatRuntimeSlice(+toolFailuretest),turnState.conversations.toolFailure.*keys across all 14 locales (real translations), 3-way unioned onto currentmainso no existing translations are reverted.Validation
pnpm typecheckclean.pnpm i18n:check— 0 missing / 0 extra across all locales (parity holds).chatRuntimeSlice.toolFailure+ProcessingTranscriptView).Merge order
Stacked on #4445 — merge that first; it emits the
WebChannelEvent.failurethis PR renders. (Both currently also inherit the pre-existingmain"Rust E2E (mock backend)" red, which is unrelated to either PR.)