fix(opensearch): set pool_maxsize so the shared client keeps a real connection pool by ef-rintaro · Pull Request #15682 · infiniflow/ragflow

ef-rintaro · 2026-06-05T01:52:28Z

What problem does this PR solve?

On the OpenSearch backend, OSConnection (a process-wide @singleton) builds its
client without pool_maxsize. In opensearch-py 2.7.1 the underlying urllib3
HTTPConnectionPool then falls back to maxsize=1
(opensearchpy/connection/http_urllib3.py: maxsize is only set when
pool_maxsize is a truthy int). Because the one client is shared by every
concurrent consumer -- sync Quart views run in a thread pool, the task executor's
asyncio.gather fan-out, and the cluster.health() probe -- requests serialize
on a single HTTP connection and urllib3 logs:

Connection pool is full, discarding connection: <endpoint>. Connection pool size: 1

Each discarded connection has to be replaced later with a new one, including a
fresh TLS handshake, which lowers throughput and raises latency.
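The discard behavior described above can be illustrated with a toy model. This is a simplified stand-in written for this description, not urllib3's code: `ToyPool` and `simulate` are hypothetical names, and the model assumes a non-blocking pool (which the "discarding connection" warning suggests) where extra connections are opened on demand and dropped on return when the pool is already full.

```python
import queue


class ToyPool:
    """Simplified stand-in for a non-blocking connection pool (not urllib3 itself)."""

    def __init__(self, maxsize):
        self._idle = queue.LifoQueue(maxsize)  # holds at most `maxsize` idle connections
        self.created = 0
        self.discarded = 0

    def acquire(self):
        try:
            return self._idle.get_nowait()  # reuse an idle connection if one exists
        except queue.Empty:
            self.created += 1  # otherwise open a new one (a new handshake in real life)
            return object()

    def release(self, conn):
        try:
            self._idle.put_nowait(conn)  # keep the connection for later reuse
        except queue.Full:
            self.discarded += 1  # pool already full: the connection is dropped


def simulate(maxsize, concurrent, rounds):
    """Run `rounds` bursts of `concurrent` overlapping requests; return (created, discarded)."""
    pool = ToyPool(maxsize)
    for _ in range(rounds):
        in_flight = [pool.acquire() for _ in range(concurrent)]
        for conn in in_flight:
            pool.release(conn)
    return pool.created, pool.discarded
```

With four overlapping requests over three bursts, `simulate(1, 4, 3)` returns `(10, 9)`, while `simulate(10, 4, 3)` returns `(4, 0)`: the larger pool opens its connections once and reuses them.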

The Elasticsearch backend does not hit this: elastic-transport defaults to
connections_per_node=10 (elastic_transport/_models.py), so its shared client
already keeps a real pool. This is purely an OpenSearch-vs-Elasticsearch default
asymmetry.

Fix

Pass pool_maxsize to the OpenSearch client so the shared singleton keeps a real
connection pool, matching the Elasticsearch backend. Constructor-only change; no
behavior change for single-threaded callers.
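A minimal sketch of the constructor change, under stated assumptions: `build_client_kwargs` is a hypothetical helper that is not part of this PR, and the host entry in the usage note is a placeholder. It only assembles keyword arguments, using the `pool_maxsize=10` and `timeout=600` values reported in the review walkthrough, and rejects values that would silently fall back to the single-connection default.

```python
def build_client_kwargs(hosts, pool_maxsize=10, timeout=600):
    """Assemble keyword arguments for the shared OpenSearch client (hypothetical helper)."""
    # A falsy or non-int pool_maxsize would be ignored by the client and leave
    # the pool at its single-connection default, so fail loudly instead.
    if not isinstance(pool_maxsize, int) or pool_maxsize < 1:
        raise ValueError("pool_maxsize must be a positive integer")
    return {
        "hosts": hosts,
        "timeout": timeout,
        "pool_maxsize": pool_maxsize,
    }
```

The result would then be passed as `OpenSearch(**build_client_kwargs([{"host": "localhost", "port": 9200}]))`, assuming opensearch-py is installed and the host matches the deployment.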

Type of change

Bug Fix (non-breaking change which fixes an issue)

Affected backends

OpenSearch only.

coderabbitai · 2026-06-05T01:52:46Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 897baec0-1b40-44af-bd41-80bbf79b66e4

📥 Commits

Reviewing files that changed from the base of the PR and between a36f64a and 50a7311.

📒 Files selected for processing (1)

rag/utils/opensearch_conn.py

📝 Walkthrough

OpenSearch client initialization is updated to add pool_maxsize=10 to the OpenSearch(...) constructor while keeping timeout=600, documenting urllib3's default effective maxsize=1 and reducing pool contention for the shared singleton client.

Changes

OpenSearch Connection Pool Configuration

| Layer / File(s) | Summary |
|---|---|
| OpenSearch connection pool sizing `rag/utils/opensearch_conn.py` | OpenSearch client initialization adds `pool_maxsize=10` to expand the HTTP connection pool for concurrent thread access, preserving the existing `timeout=600` setting. |

🎯 2 (Simple) | ⏱️ ~5 minutes

"I'm a rabbit by the server stream,
I widened pools so threads may dream,
Ten lanes now run, no single-file queue,
OpenSearch hums — concurrency anew! 🐇"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title directly and clearly describes the main change: setting pool_maxsize for the OpenSearch shared client to maintain a real connection pool. |
| Description check | ✅ Passed | The PR description comprehensively covers the problem, root cause, fix, and change type; all required template sections are present and well-documented. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



coderabbitai

🧹 Nitpick comments (1)

rag/utils/opensearch_conn.py (1)
76-85: Confirm pool_maxsize support; keep 32 but justify/make configurable

pool_maxsize is a supported OpenSearch() constructor parameter in opensearch-py (it feeds the underlying HTTP/urllib3 connection pool sizing); default is typically around 10 when unset.

Repo search shows only this OpenSearch( instantiation in rag/utils/opensearch_conn.py, so the change is localized.

Since 32 is materially above the typical default, ensure the PR rationale (and/or load/concurrency assumptions) is sufficient, or consider making the value configurable rather than hard-coded.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rag/utils/opensearch_conn.py` around lines 76 - 85, The OpenSearch
constructor is being passed pool_maxsize=32 (in the OpenSearch(...) call) which
is supported but higher than typical defaults; change this to a configurable
value instead of a hard-coded 32 by reading a config or environment variable
(e.g., OPENSEARCH_POOL_MAXSIZE) with a sensible default of 32, validate/coerce
it to an int, and pass that variable into the OpenSearch(..., pool_maxsize=...)
parameter; update any docstring or comment in rag/utils/opensearch_conn.py to
justify the default and note that it can be tuned for load/concurrency.
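The configurable value suggested in the prompt above could be read roughly as follows. This is a sketch, not code from the PR: `read_pool_maxsize` is a hypothetical helper, `OPENSEARCH_POOL_MAXSIZE` is the example variable name from the review comment, and falling back to the default on invalid input is an assumed policy.

```python
import os

DEFAULT_POOL_MAXSIZE = 32  # default named in the review comment


def read_pool_maxsize(env=None, name="OPENSEARCH_POOL_MAXSIZE",
                      default=DEFAULT_POOL_MAXSIZE):
    """Return a validated pool size from the environment, or the default."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None or not raw.strip():
        return default  # variable unset or blank
    try:
        value = int(raw)
    except ValueError:
        return default  # not an integer: assumed policy is to fall back
    # Zero or negative values would not give a usable pool, so fall back too.
    return value if value >= 1 else default
```

The returned integer would be passed as `pool_maxsize` when constructing the client. Raising an error on invalid input instead of falling back is an equally reasonable choice if misconfiguration should be visible at startup.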


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 35a17e44-1b7c-4218-baf8-c76f93c4b9a3

📥 Commits

Reviewing files that changed from the base of the PR and between 794c1f4 and a36f64a.

📒 Files selected for processing (1)

rag/utils/opensearch_conn.py

dosubot Bot added the labels size:S (This PR changes 10-29 lines, ignoring generated files.) and 🐞 bug (Something isn't working, pull request that fix bug.) on Jun 5, 2026

fix(opensearch): set pool_maxsize so the shared client keeps a real c…

50a7311

…onnection pool

ef-rintaro force-pushed the fix/opensearch-pool-maxsize branch from a36f64a to 50a7311 Compare June 5, 2026 01:54

coderabbitai Bot reviewed Jun 5, 2026

ef-rintaro wants to merge 1 commit into infiniflow:main from ef-rintaro:fix/opensearch-pool-maxsize
