Upgrade Firecrawl tools to firecrawl-py v2 SDK + add Firecrawl Interact tool #6051

Open
rakshith48 wants to merge 7 commits into crewAIInc:main from rakshith48:firecrawl-v2-upgrade

rakshith48 commented Jun 5, 2026

Summary

Two related changes to the CrewAI Firecrawl tools (lib/crewai-tools/):

  1. Upgrade the existing scrape / crawl / search tools from the legacy v1 firecrawl-py SDK to the current v2 SDK (firecrawl-py v4).
  2. Add a new FirecrawlInteractTool wrapping Firecrawl's agentic-browser agent() endpoint.

1. v1 → v2 upgrade (scrape / crawl / search)

  • Client class: from firecrawl import FirecrawlApp → from firecrawl import Firecrawl (the v2 client). Package + import root unchanged (firecrawl-py / firecrawl).
  • Dependency pin: firecrawl-py>=1.8.0 → firecrawl-py>=4.0.0,<5 (lib/crewai-tools/pyproject.toml).
  • Crawl ignore_sitemap → v2 sitemap enum (ignore_sitemap=True → sitemap="skip", verified in the SDK).

These tools already used the v2 method names (.scrape()/.crawl()/.search()), snake_case config, and typed returns (Document/CrawlJob/SearchData); this PR aligns the remaining v1 surface: the client class, the dependency pin, and the crawl sitemap option. Public args, _run signatures, and output shapes are unchanged, so existing agents are unaffected.
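
For readers carrying a v1-style crawl config, the sitemap change can be sketched as a small standalone helper. The name `migrate_crawl_config` is hypothetical and is not part of this PR or the SDK; only the `ignore_sitemap=True` → `sitemap="skip"` mapping comes from the description above. Dropping a falsy `ignore_sitemap` so that the SDK default applies is an assumption of this sketch.

```python
from typing import Any


def migrate_crawl_config(config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a v1-style crawl config that uses the v2 `sitemap` option.

    Hypothetical helper for illustration only; not part of this PR or the SDK.
    """
    migrated = dict(config)  # work on a copy, never mutate the caller's dict
    ignore_sitemap = migrated.pop("ignore_sitemap", None)
    if ignore_sitemap:
        # Mapping stated in the PR: ignore_sitemap=True -> sitemap="skip".
        # setdefault keeps an explicit `sitemap` value if one is already set.
        migrated.setdefault("sitemap", "skip")
    # Assumption: a falsy or absent ignore_sitemap is simply dropped,
    # which leaves the SDK's own default in effect.
    return migrated


legacy = {"ignore_sitemap": True, "limit": 10}
print(migrate_crawl_config(legacy))
```

The helper copies its input so that a shared default config is not modified in place.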

2. New FirecrawlInteractTool

A 4th tool wrapping Firecrawl's autonomous browser agent (Firecrawl().agent(prompt=..., urls=...)) — it navigates/interacts with pages to complete a natural-language task and returns the result. Follows the exact pattern of the existing tools (config dict + FIRECRAWL_API_KEY env + typed args_schema), validates input URLs via crewai_tools.security.safe_path.validate_url, and is registered in both crewai_tools/__init__.py and crewai_tools/tools/__init__.py. Docs page added at docs/en/tools/web-scraping/firecrawlinteracttool.mdx.
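
The call flow described above (validate the start URLs, then pass the prompt, URLs, and config to `agent()`) can be sketched without the SDK installed. Everything here is a stand-in: `FakeFirecrawl`, `check_url`, and `run_interact` are hypothetical names, `check_url` is a simplified substitute for the project's `validate_url`, and the real tool's structure may differ. Only the `agent(prompt=..., urls=...)` call shape and the `spark-1-mini` model name are taken from this PR.

```python
from typing import Any, Optional
from urllib.parse import urlparse


class FakeFirecrawl:
    """Stand-in for the real client so the sketch runs without an API key."""

    def agent(self, prompt: str, urls: Optional[list[str]] = None, **options: Any) -> dict[str, Any]:
        # Echo the arguments back instead of driving a browser.
        return {"prompt": prompt, "urls": urls, "options": options}


def check_url(url: str) -> str:
    """Simplified stand-in for the project's validate_url helper."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Unsupported URL: {url!r}")
    return url


def run_interact(
    client: Any,
    prompt: str,
    urls: Optional[list[str]] = None,
    config: Optional[dict[str, Any]] = None,
) -> Any:
    """Validate optional start URLs, then hand the task to client.agent()."""
    checked = [check_url(u) for u in urls] if urls else None
    return client.agent(prompt=prompt, urls=checked, **(config or {}))


result = run_interact(
    FakeFirecrawl(),
    prompt="Find the pricing page and list the plan names",
    urls=["https://example.com"],
    config={"model": "spark-1-mini"},
)
print(result)
```

Validating before the call means a rejected URL never reaches the remote agent.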

Verification

  • py_compile on all tool files.
  • Every option in each tool's default config binds cleanly to the real v2 method signatures (inspect.signature(...).bind(...)) against firecrawl-py==4.28.2; Firecrawl().agent confirmed to accept prompt/urls/model/max_credits/etc.
  • The three existing Firecrawl VCR tests pass against the committed v2 cassettes (/v2/...).
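
The signature-binding check mentioned above can be reproduced with the standard library alone. This is a minimal sketch: `config_binds` and `fake_crawl` are hypothetical names, and `fake_crawl` is only a stand-in with made-up parameters. The PR's own check ran against the real SDK methods in firecrawl-py==4.28.2.

```python
import inspect
from typing import Any, Callable


def config_binds(func: Callable[..., Any], config: dict[str, Any], **required: Any) -> bool:
    """Return True if `required` plus `config` form a valid call to `func`."""
    try:
        # bind() checks the arguments against the signature without calling func.
        inspect.signature(func).bind(**required, **config)
    except TypeError:
        return False
    return True


def fake_crawl(url: str, *, limit: int = 10, sitemap: str = "include", poll_interval: int = 2) -> None:
    """Hypothetical stand-in; the real check targets the SDK's methods."""


print(config_binds(fake_crawl, {"sitemap": "skip"}, url="https://example.com"))         # True
print(config_binds(fake_crawl, {"ignore_sitemap": True}, url="https://example.com"))    # False
```

Because `bind()` never invokes the function, the check needs no network access or API key.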

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added FirecrawlInteractTool for autonomous browser interactions and web automation; made it publicly available.
  • Documentation

    • Updated Firecrawl tools docs to point to the new SDK source and reflect Firecrawl v2 "config" options.
    • Added full docs and examples for the new FirecrawlInteractTool.
  • Chores

    • Bumped Firecrawl SDK dependency requirement to firecrawl-py >=4.0.0,<5.

rakshith48 and others added 5 commits June 4, 2026 14:57
Migrate the three Firecrawl tools (scrape, crawl, search) from the legacy
v1 FirecrawlApp client to the v2 Firecrawl client in firecrawl-py >= 4.

- Import `Firecrawl` instead of the legacy `FirecrawlApp` class (the v2
  client). The package name stays `firecrawl-py`; the import root stays
  `firecrawl`.
- Bump the optional dependency pin from `firecrawl-py>=1.8.0` to
  `firecrawl-py>=4.0.0,<5`.
- Crawl tool: replace the deprecated v1 `ignore_sitemap` option with the
  v2 `sitemap` enum (`ignore_sitemap=True` -> `sitemap="skip"`).
- Update the RuntimeError message to refer to the Firecrawl client.

The tools already used the v2 method names (`.scrape`/`.crawl`/`.search`)
and snake_case config keys; this change aligns the client class, pin, and
crawl sitemap option with the v2 SDK. Public tool args, `_run` signatures,
and the typed return shapes (Document / CrawlJob / SearchData) are
preserved, so agents using these tools are unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Wraps the v2 SDK agent() endpoint as FirecrawlInteractTool: the CrewAI agent
passes a natural-language task (and optional start urls) and Firecrawl's
autonomous browser agent navigates/interacts and returns the result.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
coderabbitai Bot commented Jun 5, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 0899c14f-027e-4f83-8042-9033d977c917

📥 Commits

Reviewing files that changed from the base of the PR and between 490cb56 and 4750ab1.

📒 Files selected for processing (5)
  • lib/crewai-tools/src/crewai_tools/tools/firecrawl_crawl_website_tool/firecrawl_crawl_website_tool.py
  • lib/crewai-tools/src/crewai_tools/tools/firecrawl_interact_tool/README.md
  • lib/crewai-tools/src/crewai_tools/tools/firecrawl_interact_tool/firecrawl_interact_tool.py
  • lib/crewai-tools/src/crewai_tools/tools/firecrawl_scrape_website_tool/firecrawl_scrape_website_tool.py
  • lib/crewai-tools/src/crewai_tools/tools/firecrawl_search_tool/firecrawl_search_tool.py
✅ Files skipped from review due to trivial changes (1)
  • lib/crewai-tools/src/crewai_tools/tools/firecrawl_interact_tool/README.md
🚧 Files skipped from review as they are similar to previous changes (4)
  • lib/crewai-tools/src/crewai_tools/tools/firecrawl_search_tool/firecrawl_search_tool.py
  • lib/crewai-tools/src/crewai_tools/tools/firecrawl_crawl_website_tool/firecrawl_crawl_website_tool.py
  • lib/crewai-tools/src/crewai_tools/tools/firecrawl_scrape_website_tool/firecrawl_scrape_website_tool.py
  • lib/crewai-tools/src/crewai_tools/tools/firecrawl_interact_tool/firecrawl_interact_tool.py

📝 Walkthrough

This PR migrates the CrewAI Firecrawl integration from v1 (FirecrawlApp) to v2 (Firecrawl), updates three existing tools with the new client and API, introduces a new autonomous browser agent tool, and bumps the SDK dependency to v4.x.

Changes

Firecrawl v2 Integration Update

| Layer / File(s) | Summary |
| --- | --- |
| Dependency version constraint: lib/crewai-tools/pyproject.toml | Firecrawl SDK dependency updated from >=1.8.0 to >=4.0.0,<5. |
| Documentation updates for existing tools: docs/en/tools/web-scraping/firecrawl*.mdx | Installation instructions and Arguments sections for crawl, scrape, and search tools updated to reference firecrawl/firecrawl repository and describe v2 API shape (url, config dict, api_key defaults). |
| Crawl website tool v2 migration: lib/crewai-tools/src/crewai_tools/tools/firecrawl_crawl_website_tool/firecrawl_crawl_website_tool.py | Replaced FirecrawlApp with Firecrawl client import, changed default config from ignore_sitemap to sitemap: "skip", updated initialization in both normal and post-install paths, and refined _run error handling and model_rebuild import. |
| Scrape website tool v2 migration: lib/crewai-tools/src/crewai_tools/tools/firecrawl_scrape_website_tool/firecrawl_scrape_website_tool.py | Replaced FirecrawlApp with Firecrawl in availability detection, dynamic imports, and client instantiation; updated on-demand install constraint and _run to validate URL before calling scrape. |
| Search tool v2 migration: lib/crewai-tools/src/crewai_tools/tools/firecrawl_search_tool/firecrawl_search_tool.py | Replaced FirecrawlApp with Firecrawl in availability checks, initialization, post-install path, and model_rebuild trigger; updated on-demand install constraint and _run error message. |
| FirecrawlInteractTool implementation and documentation: lib/crewai-tools/src/crewai_tools/tools/firecrawl_interact_tool/firecrawl_interact_tool.py, lib/crewai-tools/src/crewai_tools/tools/firecrawl_interact_tool/README.md, docs/en/tools/web-scraping/firecrawlinteracttool.mdx | New autonomous browser agent tool with Pydantic schema (prompt, optional urls), configurable agent parameters (model spark-1-mini, poll_interval=2), dependency auto-install via click and uv, URL validation, agent(...) invocation, and complete documentation (README and MDX). |
| Public module exports for new tool: lib/crewai-tools/src/crewai_tools/__init__.py, lib/crewai-tools/src/crewai_tools/tools/__init__.py | FirecrawlInteractTool imported and added to __all__ in both top-level and tools subpackage for public API exposure. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 I hopped through docs and client code too,

Upgraded the SDK and added a new view,
Tools chatter with Firecrawl in line,
Configs set, exports align,
A cheerful rabbit cheers: "All tests pass — fine!"

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the two main changes: upgrading Firecrawl tools to the v2 SDK and adding the new Firecrawl Interact tool. It is concise, clear, and directly reflects the primary objectives of the changeset. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
lib/crewai-tools/src/crewai_tools/tools/firecrawl_search_tool/firecrawl_search_tool.py (1)

49-63: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Same config=None issue as the crawl tool.

The config field allows None, but **self.config in _run() will fail if None is passed explicitly.

🛡️ Proposed fix
         return self._firecrawl.search(
             query=query,
-            **self.config,
+            **(self.config or {}),
         )

Also applies to: 113-116

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@lib/crewai-tools/src/crewai_tools/tools/firecrawl_search_tool/firecrawl_search_tool.py`
around lines 49 - 63, The config Field in FirecrawlSearchTool (declared as
config: dict[str, Any] | None = Field(...)) can be None which will cause the
unpacking **self.config in the _run() method to raise; change the runtime to
ensure _run() treats a None config safely by using a non-None default (e.g.,
local_config = self.config or {}) before unpacking or by coercing self.config to
the default dict structure; update usages in _run() (and the similar block
around lines 113-116) to reference local_config (or merge with the default dict)
so **local_config never receives None.
lib/crewai-tools/src/crewai_tools/tools/firecrawl_crawl_website_tool/firecrawl_crawl_website_tool.py (1)

50-64: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Potential TypeError if config=None is explicitly passed.

The config field is typed as dict[str, Any] | None, but **self.config in _run() will raise a TypeError if config is None. While the default_factory provides a dict, an explicit config=None argument would bypass it.

🛡️ Proposed fix
     def _run(self, url: str) -> Any:
         if not self._firecrawl:
             raise RuntimeError("Firecrawl client not properly initialized")

         url = validate_url(url)
-        return self._firecrawl.crawl(url=url, poll_interval=2, **self.config)
+        return self._firecrawl.crawl(url=url, poll_interval=2, **(self.config or {}))

Also applies to: 112-112
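
The failure mode flagged in both comments can be reproduced in isolation. In this minimal sketch, `crawl_stub`, `run_unsafe`, and `run_safe` are hypothetical stand-ins rather than the tool's real methods; the point is only that unpacking `None` with `**` raises `TypeError`, while `**(config or {})` does not.

```python
from typing import Any, Optional


def crawl_stub(url: str, poll_interval: int = 2, **options: Any) -> dict[str, Any]:
    """Hypothetical stand-in for the client's crawl method."""
    return {"url": url, "poll_interval": poll_interval, **options}


def run_unsafe(url: str, config: Optional[dict[str, Any]]) -> dict[str, Any]:
    # Mirrors the flagged pattern: fails when config is None.
    return crawl_stub(url=url, poll_interval=2, **config)


def run_safe(url: str, config: Optional[dict[str, Any]]) -> dict[str, Any]:
    # Mirrors the proposed fix: None is treated as an empty config.
    return crawl_stub(url=url, poll_interval=2, **(config or {}))


try:
    run_unsafe("https://example.com", None)
except TypeError as exc:
    print(f"unsafe call failed: {exc}")

print(run_safe("https://example.com", None))
```

Note that `config or {}` falls back to an empty dict, not to the field's default config, so an explicit `None` would run with no options at all.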

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@lib/crewai-tools/src/crewai_tools/tools/firecrawl_crawl_website_tool/firecrawl_crawl_website_tool.py`
around lines 50 - 64, The config field is declared as dict[str, Any] | None but
_run() uses **self.config which will TypeError if config is explicitly None;
update the code so before expanding self.config in _run() you coerce None to the
default dict (the same structure used in the Field default_factory) or to an
empty dict, e.g. compute local_config = self.config or { ...default... } and use
**local_config, or validate/normalize self.config in the model initializer;
refer to the config Field and the _run method / self.config usage to locate
where to add this coercion/validation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@lib/crewai-tools/src/crewai_tools/tools/firecrawl_interact_tool/firecrawl_interact_tool.py`:
- Around line 91-98: The subprocess call that installs firecrawl-py uses an
unpinned package name (subprocess.run(["uv", "add", "firecrawl-py"], ...)) which
can pull a mismatched major release; update this invocation to pin the major
range declared in pyproject.toml (e.g. "firecrawl-py>=4.0.0,<5") so the runtime
install matches the declared dependency, and apply the same change to the
analogous subprocess.run calls in firecrawl_scrape_website_tool.py,
firecrawl_search_tool.py, and firecrawl_crawl_website_tool.py.

In `@lib/crewai-tools/src/crewai_tools/tools/firecrawl_interact_tool/README.md`:
- Around line 12-14: The fenced code block containing the pip command is missing
a language tag; update the triple-backtick fence around "pip install
firecrawl-py 'crewai[tools]'" to include a shell/bash language specifier (e.g.,
change ``` to ```shell) so the block is properly tagged for syntax/linter
(MD040) in the README for the Firecrawl interact tool.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 06e1f53e-75e4-4d94-abfb-2ceb67b71300

📥 Commits

Reviewing files that changed from the base of the PR and between 906cd97 and 490cb56.

📒 Files selected for processing (13)
  • docs/en/tools/web-scraping/firecrawlcrawlwebsitetool.mdx
  • docs/en/tools/web-scraping/firecrawlinteracttool.mdx
  • docs/en/tools/web-scraping/firecrawlscrapewebsitetool.mdx
  • docs/en/tools/web-scraping/firecrawlsearchtool.mdx
  • lib/crewai-tools/pyproject.toml
  • lib/crewai-tools/src/crewai_tools/__init__.py
  • lib/crewai-tools/src/crewai_tools/tools/__init__.py
  • lib/crewai-tools/src/crewai_tools/tools/firecrawl_crawl_website_tool/firecrawl_crawl_website_tool.py
  • lib/crewai-tools/src/crewai_tools/tools/firecrawl_interact_tool/README.md
  • lib/crewai-tools/src/crewai_tools/tools/firecrawl_interact_tool/__init__.py
  • lib/crewai-tools/src/crewai_tools/tools/firecrawl_interact_tool/firecrawl_interact_tool.py
  • lib/crewai-tools/src/crewai_tools/tools/firecrawl_scrape_website_tool/firecrawl_scrape_website_tool.py
  • lib/crewai-tools/src/crewai_tools/tools/firecrawl_search_tool/firecrawl_search_tool.py

Comment thread lib/crewai-tools/src/crewai_tools/tools/firecrawl_interact_tool/README.md Outdated
Addresses CodeRabbit review on crewAIInc#6051:
- Pin the on-demand `uv add firecrawl-py` (+ error hint) to >=4.0.0,<5 in all
  four tools, matching pyproject and the v2-only `Firecrawl` client.
- Add docstrings to the schema classes, __init__, _run, and
  _initialize_firecrawl (docstring-coverage check).
- Tag the README install code fence as shell (MD040).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
rakshith48 (Author) commented

Thanks @coderabbitai — addressed all three:

Pin the on-demand uv add — agreed. The auto-install fallback (and its error hint) now pin firecrawl-py>=4.0.0,<5 in all four tools, matching pyproject.toml and the v2-only Firecrawl client (an unpinned install could otherwise pull a future v5 or a v1 that lacks the Firecrawl class).

README code-fence language — tagged the install block as ```shell (MD040).

Docstring coverage — added docstrings to the schema classes, __init__, _run, and _initialize_firecrawl across the four tools.

All four tool modules py_compile cleanly.
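
The effect of the pin described above can be sketched as follows. `build_install_command` and `major_is_supported` are hypothetical helpers, not code from this PR; the only details taken from the thread are the `uv add` invocation and the `firecrawl-py>=4.0.0,<5` specifier. The version check is deliberately rough and compares the major number only.

```python
import re

FIRECRAWL_SPEC = "firecrawl-py>=4.0.0,<5"  # pin quoted in this PR


def build_install_command(spec: str = FIRECRAWL_SPEC) -> list[str]:
    """Build the argv list for the on-demand install.

    Passing a list (no shell) means the '<' and '>' characters are not
    interpreted as redirections.
    """
    return ["uv", "add", spec]


def major_is_supported(version: str, lower: int = 4, upper: int = 5) -> bool:
    """Rough check that a version's major number falls in [lower, upper)."""
    match = re.match(r"(\d+)", version)
    return bool(match) and lower <= int(match.group(1)) < upper


print(build_install_command())
for candidate in ("1.8.0", "4.28.2", "5.0.0"):
    print(candidate, major_is_supported(candidate))
```

The resulting list could be handed to `subprocess.run(..., check=True)`; that step is left out so the sketch has no side effects.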

coderabbitai Bot commented Jun 5, 2026

@rakshith48 Thanks for the detailed update — all three look solid:

  • Version-pinned auto-install (firecrawl-py>=4.0.0,<5) across all four tools keeps the on-demand path consistent with pyproject.toml and avoids surprises from a future v5 or a stale v1.
  • ```shell fence on the README install block takes care of MD040.
  • Docstring coverage on the schema, __init__, _run, and _initialize_firecrawl methods rounds out the documentation nicely.

Glad it all py_compiles cleanly. 🎉

(๑˃ᴗ˂)ﻌ ♡ 🐇

