feat: add Tavily Extract as configurable alternative to Jina Reader #4

Open
tavily-integrations wants to merge 1 commit into MiniMax-AI:main from Tavily-FDE:feat/tavily-migration/jina-browse-additive

@tavily-integrations

Summary

Adds Tavily Extract as a configurable alternative to Jina Reader for URL content extraction in the browse workflow. This is an additive change — existing Jina Reader functionality is fully preserved.

What changed

  • minimax_search_browse.py: Added read_tavily(url) function using TavilyClient.extract() to fetch page content. Updated get_browse_results() with provider-selection logic that checks BROWSE_PROVIDER env var and TAVILY_API_KEY presence.
  • pyproject.toml: Added tavily-python>=0.3.0 dependency.
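
A minimal sketch of what read_tavily(url) might look like, not the code from this PR. It assumes the installed tavily-python version provides TavilyClient.extract() and that the response is a dict whose "results" list holds one entry per URL with a "raw_content" field. The optional client parameter and the empty-string fallback on failure are hypothetical additions for illustration and testing.

```python
import os


def read_tavily(url, client=None):
    """Fetch page content for `url` through Tavily Extract.

    `client` is a hypothetical injection point (not in the PR description)
    so the function can be exercised without network access.
    """
    if client is None:
        # Imported lazily so this module loads even if tavily-python is absent.
        from tavily import TavilyClient

        client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

    response = client.extract(urls=[url])

    # Assumption: successful extractions are listed under "results".
    results = response.get("results", [])
    if not results:
        # Assumption: return an empty string when nothing was extracted.
        return ""
    return results[0].get("raw_content") or ""
```

Returning a plain string keeps the contract described for get_browse_answer(), which expects the same text shape regardless of provider.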

Provider selection logic

  1. If BROWSE_PROVIDER=tavily → use Tavily
  2. If BROWSE_PROVIDER=jina → use Jina
  3. If BROWSE_PROVIDER is unset but TAVILY_API_KEY is present → use Tavily
  4. Otherwise → use Jina (default)
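
The four rules above can be sketched as a small pure function. This is an illustration, not the code in get_browse_results(): the function name select_browse_provider and the env parameter are hypothetical, and the sketch assumes that values are compared case-insensitively and that an unrecognized BROWSE_PROVIDER value is treated as unset.

```python
import os


def select_browse_provider(env=None):
    """Return "tavily" or "jina" following the documented precedence."""
    env = os.environ if env is None else env

    # Rules 1 and 2: an explicit BROWSE_PROVIDER wins.
    # Assumption: whitespace and case are normalized before comparison.
    provider = env.get("BROWSE_PROVIDER", "").strip().lower()
    if provider in ("tavily", "jina"):
        return provider

    # Rule 3: no explicit choice, but a Tavily key is available.
    if env.get("TAVILY_API_KEY"):
        return "tavily"

    # Rule 4: fall back to the existing default.
    return "jina"
```

Because an explicit BROWSE_PROVIDER=jina takes precedence over the key check, a deployment can keep TAVILY_API_KEY set for other uses and still stay on Jina Reader.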

Dependency changes

  • Added tavily-python>=0.3.0 to pyproject.toml

Environment variable changes

  • TAVILY_API_KEY (new, optional) — activates Tavily Extract path
  • BROWSE_PROVIDER (new, optional) — set to tavily or jina to explicitly choose provider
  • JINA_API_KEY (existing, unchanged)

Notes for reviewers

  • Tavily Extract returns raw_content (markdown/text) per URL, matching the format Jina Reader returns
  • Downstream get_browse_answer() requires no changes as it receives the same plain-text string
  • Tavily Extract may have different rate limits and content length caps compared to Jina Reader

Automated Review

  • Passed after 1 attempt(s)
  • Final review: The implementation correctly adds Tavily Extract as an additive, configurable alternative to Jina Reader for URL content extraction. The provider-selection logic is sound, the Tavily SDK is used correctly, existing Jina functionality is preserved, and only the expected files are modified. Two minor issues noted: a potential duplicate tavily-python dependency entry when merged with the prerequisite unit, and repeated TavilyClient instantiation on every call.
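
One possible way to address the repeated TavilyClient instantiation noted in the review is to cache a single client per process. This is a sketch of one option, not a change included in this PR; the helper name get_tavily_client is hypothetical.

```python
import os
from functools import lru_cache


@lru_cache(maxsize=1)
def get_tavily_client():
    """Build the Tavily client once and reuse it on later calls."""
    # Imported lazily so Jina-only deployments do not need tavily-python
    # at import time.
    from tavily import TavilyClient

    return TavilyClient(api_key=os.environ["TAVILY_API_KEY"])
```

A caveat with this approach: the API key is read only on the first call, so a key changed later in the process is picked up only after get_tavily_client.cache_clear().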
