fix(memory): parallelize entity boost searches in AsyncMemory.search #5298

Closed
Bartok9 wants to merge 1 commit into mem0ai:main from Bartok9:fix/5214-parallelize-entity-boost

Conversation

@Bartok9
Contributor

@Bartok9 Bartok9 commented May 29, 2026

Summary

  • Run the per-entity embedding + entity-store searches in AsyncMemory._compute_entity_boosts_async concurrently instead of sequentially.
  • Cuts entity-boost latency from roughly the sum of all per-entity calls to roughly the slowest single call, with no change to scoring behaviour.

Motivation

Closes #5214.

AsyncMemory.search() can become slow when entity boost runs for multiple extracted entities. _compute_entity_boosts_async() processed up to 8 deduped entities in a for loop, awaiting each entity's embedding call and entity-store search before starting the next. With a remote embedding provider, wall time is roughly the sum of every per-entity call, which can push otherwise-normal searches into multi-second territory or request timeouts — even though the per-entity work is fully independent.

Fix

results = await asyncio.gather(
    *(_search_entity(entity_text) for _, entity_text in deduped),
    return_exceptions=True,
)
  • Each entity's embed + search is wrapped in a small _search_entity coroutine and dispatched via asyncio.gather, so the independent remote calls overlap.
  • return_exceptions=True keeps a single failing entity (provider timeout, throttle, transient 5xx) from aborting the others — the failure is logged at WARNING and that entity is skipped, matching the previous best-effort behaviour.
  • Scoring is unchanged: the same similarity * ENTITY_BOOST_WEIGHT * memory_count_weight math and max() aggregation run over the gathered results. Because max() is order-independent, concurrent completion produces identical boosts to the old sequential order.
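
The pattern described in the bullets above can be sketched in isolation. This is a minimal illustration, not the actual mem0 code: `fake_entity_search`, `compute_boosts`, the sample data, and the `ENTITY_BOOST_WEIGHT` value are hypothetical stand-ins, and the `memory_count_weight` factor is omitted for brevity.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

ENTITY_BOOST_WEIGHT = 0.5  # assumed value, for illustration only


async def fake_entity_search(entity_text):
    """Hypothetical stand-in for one entity's embed + entity-store search.

    Returns (memory_id, similarity) pairs and raises for the entity "bad"
    to simulate a provider timeout.
    """
    await asyncio.sleep(0.01)
    if entity_text == "bad":
        raise TimeoutError("simulated provider timeout")
    data = {
        "alice": [("m1", 0.9), ("m2", 0.4)],
        "paris": [("m1", 0.6)],
    }
    return data.get(entity_text, [])


async def compute_boosts(entities):
    # Dispatch all per-entity searches at once; exceptions are returned
    # in place of results instead of cancelling the whole batch.
    results = await asyncio.gather(
        *(fake_entity_search(e) for e in entities),
        return_exceptions=True,
    )
    boosts = {}
    for entity, result in zip(entities, results):
        if isinstance(result, Exception):
            # Best-effort: log the failure and skip this entity.
            logger.warning("entity search failed for %r: %s", entity, result)
            continue
        for memory_id, similarity in result:
            boost = similarity * ENTITY_BOOST_WEIGHT
            # max() is order-independent, so completion order does not matter.
            boosts[memory_id] = max(boosts.get(memory_id, 0.0), boost)
    return boosts


boosts = asyncio.run(compute_boosts(["alice", "bad", "paris"]))
```

In this sketch the memory `m1` is linked by two entities and keeps the larger of its two boosts, while the failing entity contributes nothing.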

Verification

python3 -m pytest tests/memory/test_main.py — 37 passed (3 new + 34 existing).

New tests in tests/memory/test_main.py::TestAsyncEntityBoostParallelism:

  • test_boosts_preserve_scoring — output matches the reference scoring math, including the max() aggregation when one memory is linked by two entities.
  • test_one_entity_failure_does_not_abort_others — a raising entity search is logged and skipped while the surviving entity still contributes a boost.
  • test_searches_run_concurrently — 4 sleeping searches finish in < 0.6 s (vs ~0.8 s sequential) and observed concurrency peak ≥ 2, proving the calls overlap.
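
For readers who want to see how such a concurrency check can work, here is a rough sketch of the idea behind the third test. It is not the test from this PR: `run_and_measure` and `sleeping_search` are hypothetical names, and the 0.2 s delay is an assumption inferred from the "~0.8 s sequential" figure for 4 searches.

```python
import asyncio
import time


async def run_and_measure(n_searches=4, delay=0.2):
    in_flight = 0  # searches currently awaiting
    peak = 0       # highest number of simultaneous searches observed

    async def sleeping_search(_):
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        try:
            await asyncio.sleep(delay)  # stands in for a remote call
        finally:
            in_flight -= 1

    start = time.perf_counter()
    await asyncio.gather(*(sleeping_search(i) for i in range(n_searches)))
    elapsed = time.perf_counter() - start
    return elapsed, peak


elapsed, peak = asyncio.run(run_and_measure())
```

If the searches overlap, `elapsed` stays close to a single delay rather than the sum of all delays, and `peak` exceeds 1.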

Diff is scoped to 2 files (mem0/memory/main.py, tests/memory/test_main.py); the sync _compute_entity_boosts path is intentionally left untouched.

Closes mem0ai#5214

Problem: AsyncMemory._compute_entity_boosts_async processed up to 8 deduped
entities sequentially, awaiting each entity's embedding call and entity-store
search before starting the next. With a remote embedding provider the latency
is roughly the sum of all per-entity calls, making normal searches take several
seconds or hit request timeouts.

Fix: gather the per-entity embed+search work with asyncio.gather so the
independent calls run concurrently. Wall time drops from ~sum of all calls to
~slowest single call. return_exceptions=True keeps one failed entity from
aborting the others; failures are logged and skipped. Scoring is unchanged:
the max-aggregated boost math runs over gathered results, and max() is
order-independent so concurrency preserves identical output.

Verification: 3 new tests in tests/memory/test_main.py -
preserve-scoring (matches reference math), one-entity-failure-resilience,
and a concurrency proof (4 sleeping searches finish in <0.6s, not ~0.8s).
@kartik-mem0
Contributor

Closing — superseded by #5377, which covers both Python and TypeScript SDKs with a unified fix (always-on parallelism with concurrency cap, no opt-in flag needed). Thank you for the contribution!
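
The implementation in #5377 is not shown in this thread. Purely as a generic illustration of what "parallelism with a concurrency cap" can look like in asyncio, the sketch below bounds the number of in-flight calls with a semaphore. `gather_with_cap`, `demo`, and the limit of 3 are hypothetical and are not taken from either PR.

```python
import asyncio


async def gather_with_cap(coro_factories, limit):
    """Run all coroutines concurrently, but at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def run(factory):
        async with semaphore:
            return await factory()

    return await asyncio.gather(
        *(run(f) for f in coro_factories),
        return_exceptions=True,
    )


async def demo():
    in_flight = 0
    peak = 0

    async def search(i):
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stands in for a remote call
        in_flight -= 1
        return i * 2

    # Factories delay coroutine creation until a semaphore slot is free.
    factories = [lambda i=i: search(i) for i in range(8)]
    results = await gather_with_cap(factories, limit=3)
    return results, peak


results, peak = asyncio.run(demo())
```

A cap of this kind keeps a burst of entity searches from exceeding a provider's rate limits while still overlapping the independent calls.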

@kartik-mem0 kartik-mem0 closed this Jun 5, 2026

Development

Successfully merging this pull request may close these issues.

Parallelize or batch entity boost searches in AsyncMemory.search

2 participants