fix: keep binary tool I/O out of the LLM context with FileArtifact#6047
fix: keep binary tool I/O out of the LLM context with FileArtifact#6047mattatcha wants to merge 9 commits into
Conversation
Binary file data (PPTX, PDF, images) returned by one tool and echoed by the model as a base64 argument to another tool drifts by a few characters, invalidating the base64 and corrupting the resulting file. Tools can now return a FileArtifact instead of a base64 string. The agent executor stores the bytes out-of-band (execution-scoped, TTL-backed) and shows the model a short, namespaced `crewai+file://` handle, expanding it back to exact base64 just before the consuming tool runs. Wired into the native (default AgentExecutor + legacy) and ReAct execution paths with per-run cleanup.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds FileArtifact-based out-of-band binary passing: in-memory scoped store with TTL, model-facing placeholders, resolver/store helpers, wiring into native and structured tool execution (including cached results), crew-scoped cleanup, and tests validating end-to-end behavior. ChangesBinary Artifact Out-of-Band Passing for Tools
Sequence Diagram(s)sequenceDiagram
participant ProducerTool
participant Executor
participant ArtifactStore
participant Model
participant ConsumerTool
ProducerTool->>Executor: return FileArtifact(bytes, filename, mime)
Executor->>ArtifactStore: store_artifact(artifact, scope_id)
ArtifactStore-->>Executor: placeholder (crewai+file://uuid + metadata)
Executor->>Model: include placeholder string in model context
Model-->>Executor: model-triggered consumer tool call with crewei+file://uuid
Executor->>ArtifactStore: resolve_artifact_handles(arguments)
ArtifactStore-->>Executor: arguments with base64 payload
Executor->>ConsumerTool: invoke consumer tool with resolved bytes
Note over Executor: At crew end
Executor->>ArtifactStore: clear_artifact_scope(crew_id)
ArtifactStore->>ArtifactStore: delete crew-scoped artifacts
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
The handle regex matches hex case-insensitively, but store keys are lowercase uuid4 strings. Normalize the captured uuid to lowercase before lookup so a handle echoed by the model with uppercase hex still resolves to its bytes instead of leaking the raw token to the downstream tool.
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
lib/crewai/src/crewai/crew.py (1)
1-1:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winFix formatting issues before merge.
The pipeline is failing because the code needs to be formatted with ruff. Run the following command to fix:
uv run ruff format lib/🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/src/crewai/crew.py` at line 1, The file lib/crewai/src/crewai/crew.py (starts with "from __future__ import annotations") has formatting issues flagged by ruff; run the formatter command (uv run ruff format lib/) to auto-fix formatting across lib/, verify crew.py and any changed files are properly formatted, then stage and commit the formatted changes before merging.Source: Pipeline failures
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/crewai/src/crewai/tools/file_artifact.py`:
- Around line 23-240: The file lib/crewai/src/crewai/tools/file_artifact.py
fails ruff formatting; run the project formatter (e.g., ruff format) on this
module and commit the resulting changes so CI passes. Focus on formatting the
whole file (classes FileArtifact, _ArtifactStore, functions store_artifact,
resolve_artifact_handles, store_if_artifact, clear_artifact_scope,
artifact_scope_id, and _human_size) and then re-run ruff --check to verify no
further edits are required.
- Around line 95-152: _Entry currently only records stored_at so TTL from the
most recent store() call is used for pruning and resolve() never enforces
expiry; add a ttl: int field to _Entry, set it in _ArtifactStore.store (store
the passed ttl per entry), update _prune_locked to compare each entry.stored_at
+ entry.ttl against the cutoff (or check per-entry expiry) rather than using the
incoming ttl, and make _ArtifactStore.resolve check the entry's ttl/stored_at
and return None (and optionally delete the entry) when it is expired; leave
clear_scope behavior unchanged.
In `@lib/crewai/src/crewai/tools/tool_usage.py`:
- Around line 331-337: The file has formatting issues causing ruff format
--check to fail; run ruff (or your project's formatter) to reformat this module,
focusing on the sections around resolve_artifact_handles, calling.arguments, and
tool.ainvoke usages (the artifact-handling blocks and the other indicated
regions) so the code meets ruff style rules and CI passes; after formatting,
re-run ruff format --check to verify the module is clean.
In `@lib/crewai/tests/tools/test_file_artifact.py`:
- Around line 217-223: Run the Ruff formatter over the repository (or at least
this file) to fix lint/formatting errors causing CI to fail; e.g., run `uv run
ruff format lib/crewai/tests/tools/test_file_artifact.py` (or `uv run ruff
format lib/`) and commit the changes so the block that constructs the
SimpleNamespace tool_call and the call to
executor._execute_single_native_tool_call is reformatted to satisfy Ruff. Ensure
the SimpleNamespace instantiation (tool_call, function, arguments) adheres to
the project's formatting rules after running Ruff.
---
Outside diff comments:
In `@lib/crewai/src/crewai/crew.py`:
- Line 1: The file lib/crewai/src/crewai/crew.py (starts with "from __future__
import annotations") has formatting issues flagged by ruff; run the formatter
command (uv run ruff format lib/) to auto-fix formatting across lib/, verify
crew.py and any changed files are properly formatted, then stage and commit the
formatted changes before merging.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: d08f95f3-8469-4030-845e-9570b02a20cb
📒 Files selected for processing (7)
lib/crewai/src/crewai/agents/crew_agent_executor.pylib/crewai/src/crewai/crew.pylib/crewai/src/crewai/experimental/agent_executor.pylib/crewai/src/crewai/tools/__init__.pylib/crewai/src/crewai/tools/file_artifact.pylib/crewai/src/crewai/tools/tool_usage.pylib/crewai/tests/tools/test_file_artifact.py
| tool_call = SimpleNamespace( | ||
| id="c", | ||
| function=SimpleNamespace( | ||
| name=func_name, arguments=args if isinstance(args, str) else json.dumps(args) | ||
| ), | ||
| ) | ||
| return executor._execute_single_native_tool_call(tool_call) |
There was a problem hiding this comment.
Run Ruff formatting to unblock CI.
lint-run is currently failing on formatting; this block (notably Line 220) is a likely reformat target. Please run uv run ruff format lib/ (or at least this file) before merge.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@lib/crewai/tests/tools/test_file_artifact.py` around lines 217 - 223, Run the
Ruff formatter over the repository (or at least this file) to fix
lint/formatting errors causing CI to fail; e.g., run `uv run ruff format
lib/crewai/tests/tools/test_file_artifact.py` (or `uv run ruff format lib/`) and
commit the changes so the block that constructs the SimpleNamespace tool_call
and the call to executor._execute_single_native_tool_call is reformatted to
satisfy Ruff. Ensure the SimpleNamespace instantiation (tool_call, function,
arguments) adheres to the project's formatting rules after running Ruff.
Source: Pipeline failures
- Store each artifact's own expiry (expires_at) instead of a shared stored_at; prune by per-entry expiry on store and enforce it on resolve. Previously a short-TTL store could evict long-TTL entries and expired handles stayed resolvable until the next write. - Derive the artifact scope from the executor's crew or the agent's crew so the native and ReAct paths agree with the crew-id Crew passes to clear_artifact_scope, preventing crew-bound artifacts from lingering under a task scope until TTL.
- Wire resolve_artifact_handles/store_if_artifact into agent_utils.execute_single_native_tool_call, the native tool path used by StepExecutor (Plan-and-Execute mode); previously a FileArtifact produced or consumed there bypassed the bypass and leaked raw bytes. - Fold the crew/agent.crew fallback into artifact_scope_id(crew, task, agent) so all four execution paths derive the cleanup scope identically instead of repeating 'self.crew or getattr(self.agent, ...)'. - Sanitize ']' and newlines (not just quotes) in the placeholder so an odd filename can't break the bracketed attribute line; extend _human_size with TB/PB.
The tool-result cache stores the raw FileArtifact and hands back the same instance on every cache hit, where store_if_artifact ran again and minted a fresh handle plus another byte copy. Repeated identical cached calls therefore stacked duplicate copies of the same payload until scope/TTL cleanup. The store now reuses an existing handle when the same FileArtifact instance is stored again under the same scope, keying off object identity (the cache keeps the object alive, so the id is stable within a run).
store_if_artifact ran before the after_tool_call hooks, so a hook that replaced the result with a FileArtifact put raw bytes / a dataclass repr into the tool message and events. Re-run store_if_artifact on the final result after the hook loop in all three native tool paths (no-op for the normal string case).
Keying the object-identity dedup map on id(artifact) alone could orphan a mapping if the same instance were ever stored under two scopes (the second store overwrote the first's entry, and per-handle cleanup then skipped it). Key on (id(artifact), scope) so each scope keeps its own handle and cleanup is exact and unconditional. Also clear _handle_by_obj in the test fixture so stale id->handle mappings don't accumulate across the session.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit ad798af. Configure here.

Summary
LLMs can't reproduce long opaque strings byte-perfect, so a binary file routed through a tool-call argument — e.g. a generated PPTX/PDF passed as base64
contentto an upload connector — drifts a few characters, invalidates the base64, and corrupts the uploaded file.This adds
FileArtifact: a tool returns it instead of a base64 string, and the agent executor keeps the bytes out of the model context behind a short, namespacedcrewai+file://handle, expanding it back to exact base64 just before the consuming tool runs. The bytes never enter the model, so they can't be corrupted.Key Changes
crewai.tools.FileArtifact+ an execution-scoped, TTL-backed store with namespaced handles (resolution only ever fires on tokens we mint, never on user data).AgentExecutorand the legacyCrewAgentExecutor) and the ReAct path: store-on-output, resolve-handle-on-input, including cache hits. Original args are never mutated, so events/cache/logs keep the short handle.Usage
The consuming connector tool needs no change — the executor resolves the handle into its
contentargument generically.Note
Medium Risk
Touches all tool execution paths and in-memory artifact lifecycle; incorrect handle resolution or scope cleanup could break producer→consumer tool chains or leak memory, but changes are additive and well-tested.
Overview
Adds
FileArtifactso tools can pass binary data without routing base64 through the LLM. When a tool returns aFileArtifact, executors store bytes in a process-local, crew-scoped store and surface a shortcrewai+file://handle to the model;resolve_artifact_handlesexpands those handles to base64 only at tool invoke time, while events, cache keys, and logged args keep the handle unchanged.Wiring spans legacy
CrewAgentExecutor, experimentalAgentExecutor, ReAct (agent_utils), andToolUsage(sync/async invoke +_format_result). Crew kickoff (kickoff/akickoff) callsclear_artifact_scopealongside existing file cleanup.FileArtifactis exported fromcrewai.tools. Newtest_file_artifact.pycovers store/resolve/scoping/TTL and both native executors end-to-end.Reviewed by Cursor Bugbot for commit 0fba816. Bugbot is set up for automated code reviews on this repo. Configure here.
Summary by CodeRabbit