feat(oss): opt-in parallel entity-boost in Memory.search() — 2-4x recall speedup by DmitryPogodaev · Pull Request #5046 · mem0ai/mem0

DmitryPogodaev · 2026-05-02T12:33:36Z

feat(oss): opt-in parallel entity-boost in Memory.search() — 2-4x recall speedup with remote embedders

Problem

Memory.search() runs the entity-boost computation as a sequential for...of loop that awaits each embed call before starting the next:

for (const entity of deduped) {
  const entityEmbedding = await this.embedder.embed(entity.text);  // SEQUENTIAL
  const matches = await entityStore.search(entityEmbedding, 500, ...);
  // accumulate boosts...
}

With remote embedders this becomes the dominant latency. We measured a single
recall for an entity-rich query (9 embed calls) taking ~6.5 seconds, with
~95% of that time spent waiting on serial embed RTTs.

The block has up to 8 iterations (.slice(0, 8)) plus the initial query embed,
so worst case is 9 sequential embedder.embed() calls per search.

This regressed real production latency for us when v3.0.0 added the multi-signal
hybrid retrieval (entity boost). Before v3.0.0 a single Memory.search() did
exactly one embed call.

Fix

Add an opt-in config flag parallelEntityBoost (default: false, preserves
upstream behavior). When true, the entity-boost loop runs via
Promise.all(deduped.map(...)).

Safety: each iteration writes to entityBoosts[memId] = Math.max(prev, boost).
This is order-independent and safe under JS's single-threaded event loop —
interleaved Promise resolutions cannot race because each Math.max + assign
is one synchronous block per microtask.
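
The order-independence argument can be checked with a small sketch. The names below (`fakeEmbed`, `fakeSearch`, `accumulate`, `boostSequential`, `boostParallel`) are hypothetical stand-ins, not mem0's embedder or entity-store API; the per-entity delays only exist to make the parallel calls resolve out of order.

```typescript
type Match = { memId: string; boost: number };
type Boosts = Record<string, number>;

// Hypothetical stand-in for a remote embedder: longer texts resolve sooner,
// so parallel calls complete in a different order than they were started.
const fakeEmbed = (text: string): Promise<number> =>
  new Promise((resolve) =>
    setTimeout(() => resolve(text.length), 30 - text.length * 5),
  );

// Hypothetical stand-in for the entity store: every entity matches the same
// two memories, with boosts derived from the embedding value.
const fakeSearch = async (embedding: number): Promise<Match[]> => [
  { memId: "m1", boost: embedding / 10 },
  { memId: "m2", boost: 1 / embedding },
];

// Synchronous accumulation: there is no await between the read and the write,
// so no other callback can run in the middle of an update.
function accumulate(boosts: Boosts, matches: Match[]): void {
  for (const { memId, boost } of matches) {
    boosts[memId] = Math.max(boosts[memId] ?? 0, boost);
  }
}

// Baseline: each entity waits for the previous one to finish.
async function boostSequential(entities: string[]): Promise<Boosts> {
  const boosts: Boosts = {};
  for (const entity of entities) {
    accumulate(boosts, await fakeSearch(await fakeEmbed(entity)));
  }
  return boosts;
}

// Opt-in variant: all entities are in flight at once.
async function boostParallel(entities: string[]): Promise<Boosts> {
  const boosts: Boosts = {};
  await Promise.all(
    entities.map(async (entity) => {
      accumulate(boosts, await fakeSearch(await fakeEmbed(entity)));
    }),
  );
  return boosts;
}
```

Both functions should produce the same boosts for the same entities, because the maximum of a set does not depend on the order in which its members arrive. The guarantee would no longer hold if an await were placed between reading the previous boost and writing the new one. One behavioral difference remains: Promise.all rejects as soon as any input promise rejects, so error handling may differ from the sequential loop.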

Default kept at false to:

- preserve back-compat for users with rate-limited embedders (e.g. tight
  OpenAI plans, single-slot ollama)
- avoid surprise behavior change in 1-1 patch upgrades

Users with parallel-friendly embedders (managed services, multi-slot ollama,
batched embedder backends) opt in via:

const memory = new Memory({
  ...,
  parallelEntityBoost: true,
});

Measurements

Reproduced on production setup: ollama embed:latest (qwen3-embedding 4B Q4) on
RTX 5090 with OLLAMA_NUM_PARALLEL=2, accessed via WireGuard tunnel
(~218ms RTT), through a multi-threaded HTTP proxy.

Same prompts, same Qdrant collection (~15k memories), same gateway process —
only difference is parallelEntityBoost flag flipped:

prompt (chars)	embed_calls	sequential ms	parallel ms	speedup
320 (entity-rich)	7	5852	2089	2.80x
475 (entity-rich)	9	6561	1595	4.11x
821 (long+entity)	9	6595	2557	2.58x
633 (entity-rich, agent dev)	9	6669	2540	2.63x

Sequential embed_ms (sum of all per-call durations) ≈ wallclock total_ms
in baseline (each call blocks the next). Parallel embed_ms (sum) is
2-3x larger than wallclock — direct evidence the calls overlap.
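
The sum-versus-wallclock comparison can be expressed as a single ratio. This is a sketch under assumptions: `CallSpan` and `overlapFactor` are hypothetical names, and the timings in the example are made up for illustration, not taken from the measurements above.

```typescript
// One embed call as recorded by hypothetical instrumentation (ms timestamps).
interface CallSpan {
  start: number;
  end: number;
}

// Ratio of summed per-call durations to wallclock time.
// A value near 1 means the calls ran back to back; above 1 means they overlapped.
function overlapFactor(spans: CallSpan[]): number {
  if (spans.length === 0) return 0;
  const summed = spans.reduce((acc, s) => acc + (s.end - s.start), 0);
  const wallclock =
    Math.max(...spans.map((s) => s.end)) -
    Math.min(...spans.map((s) => s.start));
  return wallclock === 0 ? 0 : summed / wallclock;
}

// Made-up timings: three calls back to back, then three calls started together.
const sequentialSpans: CallSpan[] = [
  { start: 0, end: 200 },
  { start: 200, end: 400 },
  { start: 400, end: 600 },
];
const parallelSpans: CallSpan[] = [
  { start: 0, end: 200 },
  { start: 0, end: 300 },
  { start: 0, end: 300 },
];

console.log(overlapFactor(sequentialSpans)); // 600 / 600
console.log(overlapFactor(parallelSpans)); // 800 / 300
```

A ratio of 2 to 3 in the parallel runs is consistent with OLLAMA_NUM_PARALLEL=2 allowing only partial overlap on the embedder side.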

For prompts that extract no entities (≤1 embed call, common for short
conversational queries), behavior is identical — no regression possible
since the patched branch is only entered when deduped.length > 0.

Files changed

- mem0-ts/src/oss/src/types/index.ts: add parallelEntityBoost to
  MemoryConfigSchema
- mem0-ts/src/oss/src/config/manager.ts: propagate the flag through
  ConfigManager.mergeConfig (default false)
- mem0-ts/src/oss/src/memory/index.ts: gate the entity-boost loop on
  this.config.parallelEntityBoost
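
How an optional flag with a false default might be resolved is sketched below. `SketchConfig` and `resolveConfig` are hypothetical and deliberately minimal; they are not the actual MemoryConfigSchema or ConfigManager.mergeConfig code, which handle many more fields.

```typescript
// Hypothetical, trimmed-down config shape with only the new flag.
interface SketchConfig {
  parallelEntityBoost?: boolean;
}

// Resolve the flag so that omitting it keeps the sequential behavior.
function resolveConfig(user: SketchConfig = {}): Required<SketchConfig> {
  return { parallelEntityBoost: user.parallelEntityBoost ?? false };
}

// Existing configs that never mention the flag resolve to false.
const legacy = resolveConfig({});
// Users opt in explicitly.
const optedIn = resolveConfig({ parallelEntityBoost: true });

console.log(legacy.parallelEntityBoost, optedIn.parallelEntityBoost);
```

Using `??` rather than `||` makes the intent explicit: only a missing value falls back to the default.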

Backward compatibility

- Default is false → identical to current behavior
- Flag is optional in schema → existing configs validate unchanged
- No public API surface change for users who don't opt in
- No dependency changes

… speedup)

Adds Memory config flag `parallelEntityBoost` (default false). When true, the
entity-boost embed+search loop in Memory.search() runs concurrently via
Promise.all instead of sequentially. With remote embedders this turns N+1
sequential RTTs into ~1 RTT.

Measured on production setup (ollama embed:latest, 9-entity query):
- sequential: 6595ms
- parallel: 2089ms (3.16x speedup)

Safety: per-iteration writes go to entityBoosts[memId] = Math.max(prev, boost)
which is order-independent under interleaved single-threaded JS writes.

Default kept at false to preserve back-compat for users with rate-limited or
single-slot embedder backends.

CLAassistant · 2026-05-02T12:33:42Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

Dmitry Pogodaev seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

markymark2001 · 2026-05-21T11:01:16Z

please merge this 🙏 🥺

kartik-mem0 · 2026-06-05T04:37:21Z

Closing — superseded by #5377, which covers both Python and TypeScript SDKs with a unified fix (always-on parallelism with concurrency cap, no opt-in flag needed). Thank you for the contribution!

markymark2001 mentioned this pull request May 21, 2026

Parallelize or batch entity boost searches in AsyncMemory.search #5214

Closed

markymark2001 mentioned this pull request May 22, 2026

feat(oss): add opt-in parallel entity boost for Python #5227

Closed

kartik-mem0 mentioned this pull request Jun 4, 2026

fix(oss): parallelize entity boost searches in Memory.search #5377

Merged

14 tasks

kartik-mem0 closed this Jun 5, 2026

DmitryPogodaev wants to merge 1 commit into mem0ai:main from DmitryPogodaev:feat/parallel-entity-boost

4 participants
