Upgrade Firecrawl tools to firecrawl-py v2 SDK + add Firecrawl Interact tool#6051
rakshith48 wants to merge 7 commits into
Migrate the three Firecrawl tools (scrape, crawl, search) from the legacy v1 `FirecrawlApp` client to the v2 `Firecrawl` client in firecrawl-py >= 4.
- Import `Firecrawl` (the v2 client) instead of the legacy `FirecrawlApp` class. The package name stays `firecrawl-py`; the import root stays `firecrawl`.
- Bump the optional dependency pin from `firecrawl-py>=1.8.0` to `firecrawl-py>=4.0.0,<5`.
- Crawl tool: replace the deprecated v1 `ignore_sitemap` option with the v2 `sitemap` enum (`ignore_sitemap=True` -> `sitemap="skip"`).
- Update the RuntimeError message to refer to the Firecrawl client.
The tools already used the v2 method names (`.scrape`/`.crawl`/`.search`) and snake_case config keys; this change aligns the client class, pin, and crawl sitemap option with the v2 SDK. Public tool args, `_run` signatures, and the typed return shapes (Document / CrawlJob / SearchData) are preserved, so agents using these tools are unaffected.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
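The `ignore_sitemap` to `sitemap` change can be sketched as a small config translation. This is an illustration only, not code from the PR: `migrate_crawl_config` is a hypothetical helper, and mapping `ignore_sitemap=False` to `sitemap="include"` is an assumption (the commit message only states that `True` corresponds to `"skip"`).

```python
from __future__ import annotations

from typing import Any


def migrate_crawl_config(config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a v1-style crawl config that uses the v2 `sitemap` key.

    Hypothetical helper for illustration; not part of crewai-tools or firecrawl-py.
    """
    migrated = dict(config)  # shallow copy so the caller's dict is left untouched
    if "ignore_sitemap" in migrated:
        ignore = migrated.pop("ignore_sitemap")
        # True -> "skip" is stated in the PR; False -> "include" is an assumption.
        # setdefault keeps an explicit `sitemap` value if one is already present.
        migrated.setdefault("sitemap", "skip" if ignore else "include")
    return migrated


legacy = {"limit": 10, "ignore_sitemap": True}
print(migrate_crawl_config(legacy))
```

Copying the dict first keeps the translation free of side effects, which matters if the same config object is shared between tool instances.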
Wraps the v2 SDK `agent()` endpoint as `FirecrawlInteractTool`: the CrewAI agent passes a natural-language task (and optional start URLs), and Firecrawl's autonomous browser agent navigates and interacts with pages, then returns the result.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
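The wrapper pattern described in this commit can be sketched roughly as follows. Everything here is a stand-in written for illustration: `FakeFirecrawl`, `check_url`, and `run_interact` are hypothetical names, the URL check is a simplified placeholder for the tool's real validation, and only the `agent(prompt=..., urls=...)` call shape is taken from the PR text.

```python
from __future__ import annotations

from typing import Any
from urllib.parse import urlparse


class FakeFirecrawl:
    """Stand-in client; only mimics the shape of an agent(prompt=..., urls=...) call."""

    def agent(self, prompt: str, urls: list[str] | None = None, **kwargs: Any) -> dict[str, Any]:
        # Echo the inputs back so the example runs without network access.
        return {"prompt": prompt, "urls": urls, "options": kwargs}


def check_url(url: str) -> str:
    """Simplified placeholder for URL validation (http/https with a host)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Unsupported URL: {url!r}")
    return url


def run_interact(
    client: Any,
    prompt: str,
    urls: list[str] | None = None,
    config: dict[str, Any] | None = None,
) -> Any:
    """Validate any start URLs, then forward the task to the client's agent()."""
    if client is None:
        raise RuntimeError("Firecrawl client not properly initialized")
    checked = [check_url(u) for u in urls] if urls else None
    return client.agent(prompt=prompt, urls=checked, **(config or {}))


result = run_interact(
    FakeFirecrawl(),
    "Find the pricing page",
    ["https://example.com"],
    {"max_credits": 5},
)
print(result)
```

Validating URLs before the call means a malformed start URL fails locally instead of being sent to the remote agent.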
No actionable comments were generated in the recent review. 🎉
📒 Files selected for processing (5)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (4)
📝 Walkthrough
This PR migrates the CrewAI Firecrawl integration from v1 (FirecrawlApp) to v2 (Firecrawl), updates three existing tools with the new client and API, introduces a new autonomous browser agent tool, and bumps the SDK dependency to v4.x.
Changes
Firecrawl v2 Integration Update
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
lib/crewai-tools/src/crewai_tools/tools/firecrawl_search_tool/firecrawl_search_tool.py (1)
49-63: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Same `config=None` issue as the crawl tool.
The `config` field allows `None`, but `**self.config` in `_run()` will fail if `None` is passed explicitly.
🛡️ Proposed fix
     return self._firecrawl.search(
         query=query,
    -    **self.config,
    +    **(self.config or {}),
     )
Also applies to: 113-116
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai-tools/src/crewai_tools/tools/firecrawl_search_tool/firecrawl_search_tool.py` around lines 49 - 63, The config Field in FirecrawlSearchTool (declared as config: dict[str, Any] | None = Field(...)) can be None which will cause the unpacking **self.config in the _run() method to raise; change the runtime to ensure _run() treats a None config safely by using a non-None default (e.g., local_config = self.config or {}) before unpacking or by coercing self.config to the default dict structure; update usages in _run() (and the similar block around lines 113-116) to reference local_config (or merge with the default dict) so **local_config never receives None.
lib/crewai-tools/src/crewai_tools/tools/firecrawl_crawl_website_tool/firecrawl_crawl_website_tool.py (1)
50-64: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Potential `TypeError` if `config=None` is explicitly passed.
The `config` field is typed as `dict[str, Any] | None`, but `**self.config` in `_run()` will raise a `TypeError` if config is `None`. While the default_factory provides a dict, an explicit `config=None` argument would bypass it.
🛡️ Proposed fix
     def _run(self, url: str) -> Any:
         if not self._firecrawl:
             raise RuntimeError("Firecrawl client not properly initialized")
         url = validate_url(url)
    -    return self._firecrawl.crawl(url=url, poll_interval=2, **self.config)
    +    return self._firecrawl.crawl(url=url, poll_interval=2, **(self.config or {}))
Also applies to: 112-112
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai-tools/src/crewai_tools/tools/firecrawl_crawl_website_tool/firecrawl_crawl_website_tool.py` around lines 50 - 64, The config field is declared as dict[str, Any] | None but _run() uses **self.config which will TypeError if config is explicitly None; update the code so before expanding self.config in _run() you coerce None to the default dict (the same structure used in the Field default_factory) or to an empty dict, e.g. compute local_config = self.config or { ...default... } and use **local_config, or validate/normalize self.config in the model initializer; refer to the config Field and the _run method / self.config usage to locate where to add this coercion/validation.
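The failure mode behind both comments is plain Python behaviour and can be reproduced without the SDK. The sketch below uses a stand-in `crawl` function (not the real client method) to show that unpacking `None` with `**` raises `TypeError`, while `**(config or {})` falls back to no extra options.

```python
from typing import Any


def crawl(url: str, poll_interval: int = 2, **options: Any) -> dict:
    """Stand-in for a client method that accepts keyword options."""
    return {"url": url, "poll_interval": poll_interval, **options}


config = None  # what the tool would hold if a caller passed config=None explicitly

try:
    crawl(url="https://example.com", **config)
    unsafe_failed = False
except TypeError:
    unsafe_failed = True  # ** requires a mapping, and None is not one

# The proposed fix: coerce None to an empty dict before unpacking.
safe_result = crawl(url="https://example.com", **(config or {}))
print(unsafe_failed, safe_result)
```

Note that `config or {}` also replaces an explicitly empty dict with a new empty dict, which is harmless here because both unpack to nothing.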
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@lib/crewai-tools/src/crewai_tools/tools/firecrawl_interact_tool/firecrawl_interact_tool.py`:
- Around line 91-98: The subprocess call that installs firecrawl-py uses an
unpinned package name (subprocess.run(["uv", "add", "firecrawl-py"], ...)) which
can pull a mismatched major release; update this invocation to pin the major
range declared in pyproject.toml (e.g. "firecrawl-py>=4.0.0,<5") so the runtime
install matches the declared dependency, and apply the same change to the
analogous subprocess.run calls in firecrawl_scrape_website_tool.py,
firecrawl_search_tool.py, and firecrawl_crawl_website_tool.py.
In `@lib/crewai-tools/src/crewai_tools/tools/firecrawl_interact_tool/README.md`:
- Around line 12-14: The fenced code block containing the pip command is missing
a language tag; update the triple-backtick fence around "pip install
firecrawl-py 'crewai[tools]'" to include a shell/bash language specifier (e.g.,
change ``` to ```shell) so the block is properly tagged for syntax/linter
(MD040) in the README for the Firecrawl interact tool.
---
Outside diff comments:
In
`@lib/crewai-tools/src/crewai_tools/tools/firecrawl_crawl_website_tool/firecrawl_crawl_website_tool.py`:
- Around line 50-64: The config field is declared as dict[str, Any] | None but
_run() uses **self.config which will TypeError if config is explicitly None;
update the code so before expanding self.config in _run() you coerce None to the
default dict (the same structure used in the Field default_factory) or to an
empty dict, e.g. compute local_config = self.config or { ...default... } and use
**local_config, or validate/normalize self.config in the model initializer;
refer to the config Field and the _run method / self.config usage to locate
where to add this coercion/validation.
In
`@lib/crewai-tools/src/crewai_tools/tools/firecrawl_search_tool/firecrawl_search_tool.py`:
- Around line 49-63: The config Field in FirecrawlSearchTool (declared as
config: dict[str, Any] | None = Field(...)) can be None which will cause the
unpacking **self.config in the _run() method to raise; change the runtime to
ensure _run() treats a None config safely by using a non-None default (e.g.,
local_config = self.config or {}) before unpacking or by coercing self.config to
the default dict structure; update usages in _run() (and the similar block
around lines 113-116) to reference local_config (or merge with the default dict)
so **local_config never receives None.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 06e1f53e-75e4-4d94-abfb-2ceb67b71300
📒 Files selected for processing (13)
- docs/en/tools/web-scraping/firecrawlcrawlwebsitetool.mdx
- docs/en/tools/web-scraping/firecrawlinteracttool.mdx
- docs/en/tools/web-scraping/firecrawlscrapewebsitetool.mdx
- docs/en/tools/web-scraping/firecrawlsearchtool.mdx
- lib/crewai-tools/pyproject.toml
- lib/crewai-tools/src/crewai_tools/__init__.py
- lib/crewai-tools/src/crewai_tools/tools/__init__.py
- lib/crewai-tools/src/crewai_tools/tools/firecrawl_crawl_website_tool/firecrawl_crawl_website_tool.py
- lib/crewai-tools/src/crewai_tools/tools/firecrawl_interact_tool/README.md
- lib/crewai-tools/src/crewai_tools/tools/firecrawl_interact_tool/__init__.py
- lib/crewai-tools/src/crewai_tools/tools/firecrawl_interact_tool/firecrawl_interact_tool.py
- lib/crewai-tools/src/crewai_tools/tools/firecrawl_scrape_website_tool/firecrawl_scrape_website_tool.py
- lib/crewai-tools/src/crewai_tools/tools/firecrawl_search_tool/firecrawl_search_tool.py
Addresses CodeRabbit review on crewAIInc#6051:
- Pin the on-demand `uv add firecrawl-py` (+ error hint) to >=4.0.0,<5 in all four tools, matching pyproject and the v2-only `Firecrawl` client.
- Add docstrings to the schema classes, __init__, _run, and _initialize_firecrawl (docstring-coverage check).
- Tag the README install code fence as shell (MD040).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
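A pinned on-demand install of this kind might look like the sketch below. The helper names and the `dry_run` flag are hypothetical and exist only so the example can run without invoking `uv`; the version range is the one quoted in the commit message.

```python
from __future__ import annotations

import subprocess

# Version range taken from the pin quoted in the PR.
FIRECRAWL_REQUIREMENT = "firecrawl-py>=4.0.0,<5"


def build_install_command(requirement: str = FIRECRAWL_REQUIREMENT) -> list[str]:
    """Build the argument list for an on-demand `uv add` of a pinned requirement."""
    return ["uv", "add", requirement]


def install_firecrawl(dry_run: bool = True) -> list[str]:
    """Return the install command; execute it only when dry_run is False."""
    command = build_install_command()
    if not dry_run:
        # Passing a list (no shell) keeps '<' and '>' from being read as redirects.
        subprocess.run(command, check=True)
    return command


print(install_firecrawl())
```

Keeping the requirement string in one constant makes it easier to keep the runtime install and the error hint in step with the declared dependency.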
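Thanks @coderabbitai, addressed all three:
- ✅ Pin the on-demand `uv add firecrawl-py` install to `>=4.0.0,<5`.
- ✅ README code-fence language: tagged the install block as `shell`.
- ✅ Docstring coverage: added docstrings to the schema classes, `__init__`, `_run`, and `_initialize_firecrawl`.
All four tool modules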
Summary
Two related changes to the CrewAI Firecrawl tools (`lib/crewai-tools/`):
- Upgrade the existing tools from the legacy v1 `firecrawl-py` SDK to the current v2 SDK (firecrawl-py v4).
- Add `FirecrawlInteractTool`, wrapping Firecrawl's agentic-browser `agent()` endpoint.
1. v1 → v2 upgrade (scrape / crawl / search)
- `from firecrawl import FirecrawlApp` → `from firecrawl import Firecrawl` (the v2 client). Package + import root unchanged (`firecrawl-py` / `firecrawl`).
- `firecrawl-py>=1.8.0` → `firecrawl-py>=4.0.0,<5` (`lib/crewai-tools/pyproject.toml`).
- Crawl tool: `ignore_sitemap` → v2 `sitemap` enum (`ignore_sitemap=True` ≡ `sitemap="skip"`, verified in the SDK).
These tools already used the v2 method names (`.scrape()` / `.crawl()` / `.search()`), snake_case config, and typed returns (`Document` / `CrawlJob` / `SearchData`); this aligns the remaining v1 surface. Public args, `_run` signatures, and output shapes are unchanged, so existing agents are unaffected.
2. New `FirecrawlInteractTool`
A 4th tool wrapping Firecrawl's autonomous browser agent (`Firecrawl().agent(prompt=..., urls=...)`): it navigates and interacts with pages to complete a natural-language task and returns the result. It follows the same pattern as the existing tools (config dict + `FIRECRAWL_API_KEY` env + typed `args_schema`), validates input URLs via `crewai_tools.security.safe_path.validate_url`, and is registered in both `crewai_tools/__init__.py` and `crewai_tools/tools/__init__.py`. Docs page added at `docs/en/tools/web-scraping/firecrawlinteracttool.mdx`.
Verification
- `py_compile` on all tool files.
- `config` binds cleanly to the real v2 method signatures (`inspect.signature(...).bind(...)`) against `firecrawl-py==4.28.2`; `Firecrawl().agent` confirmed to accept `prompt` / `urls` / `model` / `max_credits` / etc.
- `/v2/...`).
🤖 Generated with Claude Code
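The signature-binding check mentioned under Verification can be approximated with the standard library alone. The sketch below is not the PR's verification script: `crawl` is a stand-in with a few made-up keyword parameters rather than the real SDK signature, and `config_binds` is a hypothetical helper.

```python
from __future__ import annotations

import inspect
from typing import Any, Callable


def crawl(
    url: str,
    *,
    limit: int | None = None,
    sitemap: str | None = None,
    poll_interval: int = 2,
) -> None:
    """Stand-in with a few keyword parameters; not the real SDK signature."""


def config_binds(func: Callable[..., Any], config: dict[str, Any], **fixed: Any) -> bool:
    """Return True if `config` plus `fixed` can be bound to func's signature."""
    try:
        # bind() raises TypeError for unknown or missing arguments without calling func.
        inspect.signature(func).bind(**fixed, **config)
    except TypeError:
        return False
    return True


v2_config = {"limit": 10, "sitemap": "skip"}
v1_config = {"limit": 10, "ignore_sitemap": True}
print(config_binds(crawl, v2_config, url="https://example.com"))
print(config_binds(crawl, v1_config, url="https://example.com"))
```

Because `bind()` never calls the function, a check like this can run without an API key or network access, which makes it suitable for a quick pre-merge sanity test.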