feat: add KGSearchRetrieval for full KG pipeline (N-hop, scoring, query_rewrite, community) by xugangqiang · Pull Request #15690 · infiniflow/ragflow

xugangqiang · 2026-06-05T05:16:25Z

Summary

KGSearchRetrieval composes entity search, type search, relation search, N-hop analysis, score fusion, LLM-based query_rewrite, and community reports into a single synthetic chunk for KG-enhanced retrieval.

Components

Component	Source	Status
Entity/relation/community search	Direct `DocEngine.Search` calls	✅
N-hop analysis + score fusion	`common.AnalyzeNHopPaths` / `DoubleHitBoost` / `FuseRelationScores`	✅ #15666
Query rewrite prompt + parser	`common.BuildQueryRewritePrompt` / `ParseQueryRewriteResponse`	✅ #15669
Token budget	`common.BuildKGContent` + `NumTokensFromString`	✅ #15666
LLM query rewrite integration	`queryRewrite` function with fallback	✅

Testing

11 tests (pure function + mock engine):

=== RUN   TestKgEntityFromChunk_Basic          --- PASS
=== RUN   TestKgEntityFromChunk_ScoreFallback  --- PASS
=== RUN   TestKgEntityFromChunk_MissingFields  --- PASS
=== RUN   TestKgRelationFromChunk_Basic        --- PASS
=== RUN   TestKgRelationFromChunk_MissingFrom  --- PASS
=== RUN   TestSearchKGTypeSamples_Success      --- PASS
=== RUN   TestSearchKGTypeSamples_Empty        --- PASS
=== RUN   TestKGSearchRetrieval_Basic          --- PASS
=== RUN   TestKGSearchRetrieval_NoEntities     --- PASS
=== RUN   TestQueryRewrite_Fallback            --- PASS
=== RUN   TestQueryRewrite_EmptyQuestion       --- PASS

coderabbitai · 2026-06-05T05:16:33Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds KG data types, chunk parsers, hybrid search expression builders, an LLM-backed query-rewrite, a configurable KGSearchPipeline orchestration, CSV/score formatting updates, expanded unit tests, and tenant nil-DAO guards.

Changes

Knowledge Graph Retrieval & Scoring

Layer / File(s)	Summary
KG types and CSV/score formatting `internal/service/kg_types.go`, `internal/service/kg_scoring_funcs.go`, `internal/service/kg_scoring_funcs_test.go`	Adds exported KG structs (`KGEntity`, `NhopEntity`, `KGRelation`, `Edge`, `EdgeScore`, `ScoredEntity`, `ScoredRelation`), `formatCSVLine`, `FilterChunksByScore`, JSON-based `extractDescription`, and CSV/formatting/token-budget test coverage.
Chunk parsers and search helpers `internal/service/kg_retrieval.go`	Adds `kgEntityFromChunk`, `kgRelationFromChunk`, `indexName` helper, `searchKGTypeSamples`, `searchKGCommunityContent`, hybrid expression builders (`buildMatchDenseExpr`, `buildFusionExpr`, `buildSearchExprs`), and `queryRewrite` LLM-backed rewrite with fallback.
KGSearchPipeline orchestration `internal/service/kg_pipeline.go`	New `KGSearchPipeline` type, options, `NewKGSearchPipeline`, and `Retrieval` which performs optional rewrite, concurrent entity/type and relation searches, N‑hop analysis, fusion/boosting, sorting/trimming, token‑budgeted content assembly, and returns composed chunk map.
Unit tests and test helpers `internal/service/kg_retrieval_test.go`, `internal/service/kg_scoring_funcs_test.go`	Adds mocks and spies (`mockRetrievalEngine`, `searchCaptureEngine`, `spyEmbedDriver`), many tests covering chunk parsing, relation parsing, type-sample decoding, buildSearchExprs with/without embedding, queryRewrite fallback, KGSearchRetrieval flows (including chat model path), community-content formatting, CSV quoting behavior, and FilterChunksByScore edge cases.
Tenant service guards and tests `internal/service/tenant.go`, `internal/service/tenant_test.go`	Adds nil-DAO checks in `RemoveMember` and `AcceptInvite` and updates tests to expect server error codes when the DAO is uninitialized.

Sequence Diagram

sequenceDiagram
  participant Client
  participant KGSearchPipeline
  participant LLM as QueryRewriteLLM
  participant DocEngine
  participant Scoring as KGScoring
  participant Community as CommunitySearch

  Client->>KGSearchPipeline: Retrieval(question, kbIDs, tenantIDs)
  KGSearchPipeline->>LLM: queryRewrite(question, type samples)
  LLM-->>KGSearchPipeline: typeKeywords, entities
  KGSearchPipeline->>DocEngine: Search(entities/types) concurrently
  DocEngine-->>KGSearchPipeline: entity chunks, relation chunks, type samples
  KGSearchPipeline->>Scoring: N-hop analysis & score fusion
  Scoring-->>KGSearchPipeline: scored entities & relations
  KGSearchPipeline->>Community: searchKGCommunityContent(entities, remainingTokens)
  Community-->>KGSearchPipeline: community report chunks
  KGSearchPipeline-->>Client: composed result map (content_with_weight, metadata)

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly Related PRs

infiniflow/ragflow#15689: Overlaps on KG chunk parsing and hybrid search expression builders.
infiniflow/ragflow#15669: Related to queryRewrite prompt/response parsing used by this PR.
infiniflow/ragflow#15666: Related KG scoring/formatting utilities consumed by the pipeline.

Suggested Reviewers

yingfeng
yuzhichang
JinHai-CN
qinling0210

Poem

🐰 I hop through nodes and edges bright,
I nudge a question into soft light,
I gather types, relations, and lore,
Stitch CSV lines and community score,
Your KG blooms — a rabbit’s delight.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 54.29% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main feature being added: KGSearchRetrieval implementation composing a full KG pipeline with N-hop, scoring, query rewrite, and community components.
Description check	✅ Passed	The PR description provides a clear summary of the change, lists implemented components with status checks, and includes test results; however, it does not explicitly address the required template sections (problem statement and type of change).
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (3)

internal/service/kg_retrieval_test.go (1)

36-38: 💤 Low value

Unused mock code: entity_kwd filter branch never matches.

The mock checks for entity_kwd in req.Filter to build composite keys, but the real KGSearchRetrieval implementation places entities in MatchExprs rather than the Filter map (see context snippet 1, lines 95-102). None of the test cases use this branch, so it's effectively dead code.

♻️ Simplify the mock by removing the unused branch

 func (m *mockRetrievalEngine) Search(ctx context.Context, req *types.SearchRequest) (*types.SearchResult, error) {
 	kgType, _ := req.Filter["knowledge_graph_kwd"].(string)
-	key := kgType
-	if ents, ok := req.Filter["entity_kwd"].([]interface{}); ok && len(ents) > 0 {
-		key = kgType + ":" + ents[0].(string)
-	}
+	key := kgType
 	if r, ok := m.results[key]; ok {
 		return r, nil
 	}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/service/kg_retrieval_test.go` around lines 36 - 38, Remove the dead
branch that checks req.Filter["entity_kwd"] in the mock used by the tests: the
real KGSearchRetrieval places entities in MatchExprs, so the mock's entity_kwd
branch never matches; update the mock (in kg_retrieval_test.go) to build
composite keys solely from req.MatchExprs (or remove the entity_kwd check and
unused variable key assembly) and simplify the mocked behavior in the function
that currently inspects req.Filter to reflect the real implementation's use of
MatchExprs.

internal/service/kg_retrieval.go (2)

136-144: 💤 Low value

Consider logging errors from supplementary searches.

Type search and relation search errors are silently ignored for graceful degradation, but this could make debugging difficult. Consider logging these errors at debug/warn level.

Example for type search

 	typesResult, err := docEngine.Search(ctx, typesReq)
+	if err != nil {
+		// Log at debug/warn level - continue with empty type results
+	}
 	entsFromTypes := make(map[string]struct{})
 	if err == nil {

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/service/kg_retrieval.go` around lines 136 - 144, The code currently
swallows errors from the supplementary type search (the call to docEngine.Search
with typesReq) making debugging hard; modify the block around typesResult, err
:= docEngine.Search(ctx, typesReq) to log the error (using the package logger or
processLogger/ambient logger used elsewhere) at debug or warn level when err !=
nil, but keep the existing graceful-degradation behavior (i.e., do not return
the error). Ensure the log includes context like the operation name ("type
search") and the request details (typesReq or a minimal identifying field) and
reference the entsFromTypes population logic so the flow remains unchanged when
the search fails.

31-31: 💤 Low value

Unused name parameter.

The name parameter is never used within the function body. At the call site (line 118), the entity name is extracted from chunk["entity_kwd"] and used as the map key. Consider removing this parameter to clarify the API.

♻️ Proposed fix

-func kgEntityFromChunk(name string, chunk map[string]interface{}) common.KGEntity {
+func kgEntityFromChunk(chunk map[string]interface{}) common.KGEntity {

And update the call site at line 118:

-		e := kgEntityFromChunk(name, chunk)
+		e := kgEntityFromChunk(chunk)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/service/kg_retrieval.go` at line 31, Remove the unused name
parameter from kgEntityFromChunk and update its signature to func
kgEntityFromChunk(chunk map[string]interface{}) common.KGEntity; then update all
call sites that currently pass a name (notably the call that extracts entity
from chunk["entity_kwd"]) to pass only the chunk. Ensure any references inside
kgEntityFromChunk that expected name instead use chunk["entity_kwd"] (or the
existing extraction logic) and adjust unit tests or callers accordingly.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/service/kg_retrieval.go`:
- Line 184: Update the token limit variable maxToken from 8196 to the intended
8192 (the common 2^13 token boundary) in internal/service/kg_retrieval.go; if
8196 was intentional, instead add a brief inline comment next to maxToken
explaining the reason for the non-standard value so readers aren’t confused.
Ensure you modify the declaration of maxToken and include the explanatory
comment if you keep the 8196 value.
- Around line 99-106: KGSearchRetrieval currently only uses entities[0] for both
entity and relation matching; change entsReq.MatchExprs and relsReq.MatchExprs
to not ignore additional rewritten entities by replacing MatchingText:
entities[0] with a joined string of all entities (e.g., strings.Join(entities,
", ")) for the entity search (update where entsReq.MatchExprs is assembled) and
align the relation search behavior with Python by using the original raw
question text (the incoming query variable) for relation MatchingText in
relsReq.MatchExprs (or, if you prefer parity via entities, also use the joined
entities there); update the code in KGSearchRetrieval where entsReq.MatchExprs
and relsReq.MatchExprs are constructed to use these new values.

---

Nitpick comments:
In `@internal/service/kg_retrieval_test.go`:
- Around line 36-38: Remove the dead branch that checks req.Filter["entity_kwd"]
in the mock used by the tests: the real KGSearchRetrieval places entities in
MatchExprs, so the mock's entity_kwd branch never matches; update the mock (in
kg_retrieval_test.go) to build composite keys solely from req.MatchExprs (or
remove the entity_kwd check and unused variable key assembly) and simplify the
mocked behavior in the function that currently inspects req.Filter to reflect
the real implementation's use of MatchExprs.

In `@internal/service/kg_retrieval.go`:
- Around line 136-144: The code currently swallows errors from the supplementary
type search (the call to docEngine.Search with typesReq) making debugging hard;
modify the block around typesResult, err := docEngine.Search(ctx, typesReq) to
log the error (using the package logger or processLogger/ambient logger used
elsewhere) at debug or warn level when err != nil, but keep the existing
graceful-degradation behavior (i.e., do not return the error). Ensure the log
includes context like the operation name ("type search") and the request details
(typesReq or a minimal identifying field) and reference the entsFromTypes
population logic so the flow remains unchanged when the search fails.
- Line 31: Remove the unused name parameter from kgEntityFromChunk and update
its signature to func kgEntityFromChunk(chunk map[string]interface{})
common.KGEntity; then update all call sites that currently pass a name (notably
the call that extracts entity from chunk["entity_kwd"]) to pass only the chunk.
Ensure any references inside kgEntityFromChunk that expected name instead use
chunk["entity_kwd"] (or the existing extraction logic) and adjust unit tests or
callers accordingly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2bebc9d9-dee3-4c0b-bf51-8a8e017947d1

📥 Commits

Reviewing files that changed from the base of the PR and between f6ff862 and 5d95d41.

📒 Files selected for processing (2)

internal/service/kg_retrieval.go
internal/service/kg_retrieval_test.go

coderabbitai · 2026-06-05T05:22:08Z

+	scoredRels := common.SortAndTrimRelations(relsFromText, 6)
+
+	// 9. Build KG content (entities + relations) with token budget, matching Python order
+	maxToken := 8196


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Likely typo: 8196 should be 8192.

The token limit 8196 appears to be a typo for 8192 (which is 2^13, a common token limit). If this is intentional, consider adding a comment explaining the choice.

🔧 Proposed fix

- maxToken := 8196 + maxToken := 8192

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

maxToken := 8196

maxToken := 8192

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/service/kg_retrieval.go` at line 184, Update the token limit variable maxToken from 8196 to the intended 8192 (the common 2^13 token boundary) in internal/service/kg_retrieval.go; if 8196 was intentional, instead add a brief inline comment next to maxToken explaining the reason for the non-standard value so readers aren’t confused. Ensure you modify the declaration of maxToken and include the explanatory comment if you keep the 8196 value.

coderabbitai

🧹 Nitpick comments (1)

internal/service/kg_retrieval.go (1)
138-146: 💤 Low value

Consider logging search errors for types and relations instead of silently ignoring.

Errors from docEngine.Search for types (line 138) and relations (line 164) are silently swallowed. While the fallback to empty maps is safe for correctness, this could hide transient failures or misconfigurations in production.

Adding debug-level logging would aid troubleshooting without changing behavior.

Also applies to: 164-174
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/service/kg_retrieval.go` around lines 138 - 146, The
docEngine.Search error for types (call site producing typesResult and
entsFromTypes) and for relations (call site producing relationsResult and
entsFromRelations) is currently ignored; update both search error branches to
emit debug (or warn) logs including the error and contextual info (e.g., the
request object or which search failed) using the existing logger before falling
back to the empty map so behavior doesn't change—look for the calls to
docEngine.Search and add a processLogger.Debug/Warn (or the package logger) that
logs the error and a short message identifying "types search" or "relations
search".

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@internal/service/kg_retrieval.go`:
- Around line 138-146: The docEngine.Search error for types (call site producing
typesResult and entsFromTypes) and for relations (call site producing
relationsResult and entsFromRelations) is currently ignored; update both search
error branches to emit debug (or warn) logs including the error and contextual
info (e.g., the request object or which search failed) using the existing logger
before falling back to the empty map so behavior doesn't change—look for the
calls to docEngine.Search and add a processLogger.Debug/Warn (or the package
logger) that logs the error and a short message identifying "types search" or
"relations search".

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0f77c0bb-53e4-4857-8430-872fafe71df2

📥 Commits

Reviewing files that changed from the base of the PR and between 5d95d41 and 2856731.

📒 Files selected for processing (2)

internal/service/kg_retrieval.go
internal/service/kg_retrieval_test.go

🚧 Files skipped from review as they are similar to previous changes (1)

internal/service/kg_retrieval_test.go

codecov · 2026-06-05T05:54:23Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.16%. Comparing base (f78ef32) to head (e49a1a5).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main   #15690   +/-   ##
=======================================
  Coverage   93.16%   93.16%           
=======================================
  Files          10       10           
  Lines         717      717           
  Branches      118      118           
=======================================
  Hits          668      668           
  Misses         29       29           
  Partials       20       20

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

This change eliminates the duplicate filtering implementation between service.ApplyMetaFilter and common.MetaFilter by removing the applySingleCondition adapter layer and consolidating all operator handling into common.MetaFilter. Key changes: - Remove applySingleCondition (65 lines of inline switch logic) - ApplyMetaFilter now converts MetaFilterCondition to MetaCondition and delegates directly to common.MetaFilter once per filter set - Add filterSet: O(n+m) hash-map fast path for in/not in operators (replaces O(n*m) linear scan in matchValue) - Export NormalizeOperator from common for operator alias mapping (includes ==, !=, >=, <=, is, not is aliases) Bug fixes: - contains/not contains: restored case-insensitive matching (ToLower) - not in: restored case-insensitive matching (EqualFold) - != with date filter: non-date metadata values now correctly match the != operator (they are not equal to the date) - Added == to operatorMapping so both ParseAndConvert and convertToMetaCondition normalize it to = Performance: - in/not in: O(n*m) linear scan -> O(n+m) hash-map lookup via filterSet - = operator delegates to common.MetaFilter (O(n) iteration for correctness/case-insensitivity, matching Python behavior) Cleanup: - Removed 18 lines of dead code (matchValue's in/not in branches bypassed by filterOut delegation) - Fixed orphaned godoc comment for convertOperator (reordered with NormalizeOperator) - Fixed filterSet doc comment: removed incorrect 'matching EqualFold' claim - Completed convertToMetaCondition operator normalization comment Tests: 60 tests (24 service + 36 common), all passing. New tests cover ==, !=, >=, <=, >, <, empty, not empty operators through ApplyMetaFilter; <, <=, !=, not-in-empty-list through MetaFilter/filterSet. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

This file was accidentally included in the previous commit via git add internal/. The change was pre-existing in the working tree but unrelated to the metadata filter refactoring. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

When f.Value is "A,,B" or "A, ,B", strings.Split produces empty or whitespace-only entries. These are now filtered out after TrimSpace to prevent unintended matches against empty-string metadata values. Add 5 test cases: - InEmptyParts: A,,B produces [A B] - InOnlyWhitespaceParts: A, , ,B produces [A B] - InAllEmptyParts: ,,, produces [] - NotInEmptyParts: A,,B produces [A B] - InMixedSpaces: spaces around values preserve trimmed values Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Combines entity/relation/community search with N-hop analysis, score fusion, and token-budgeted content assembly. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…community second)

- Add queryRewrite function that calls BuildQueryRewritePrompt + ChatWithMessages + ParseQueryRewriteResponse - Fall back to raw question when chatModel is nil (backward compatible) - Remove extractTypeKeywords and extractEntities stubs - Update tests for fallback and empty question scenarios Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…eter

Use strings.Join(entities, " ") so multiple entity names are all passed to the ES search query, matching Python's behavior. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

- Extract KGSearchPipeline struct from monolithic KGSearchRetrieval - Parallelize three independent searches (entity/types/relations) with sync.WaitGroup - Move scoring/formatting/types from internal/common/ to internal/service/ - Fix nil ModelDriver panic in buildSearchExprs - Fix searchKGTypeSamples error discarded (add common.Warn) - Add FilterChunksByScore for post-search _score filtering (Python alignment) - Fix Embed input to use matchText.MatchingText instead of question - Add Python alignment constants (simThreshold=0.3, denseTopK=1024) - Fix community report format (JSON parse, Evidences section, numbering) - Add contract validation to mocks (IndexNames, KbIDs) - Convert fakeEmbedDriver stub to spyEmbedDriver with captured inputs - Fix relation test chunks missing _score - Fix bare type assertion in TestBuildSearchExprs_NoEmbModel - Add searchCaptureEngine IndexNames validation - Fix nil userTenantDAO panic in RemoveMember/AcceptInvite - Recover accidentally truncated task_test.go - 80+ tests covering all paths Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

coderabbitai

♻️ Duplicate comments (3)

internal/service/kg_retrieval.go (1)
112-112: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Verify maxToken: 8196 — likely typo for 8192.

A past review comment flagged this as a probable typo. 8192 (2^13) is a standard token limit, while 8196 is unusual. If 8196 is intentional, add a comment explaining the +4 offset; otherwise, correct to 8192.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/service/kg_retrieval.go` at line 112, The maxToken value set to 8196
on the KG retrieval configuration (symbol: maxToken) appears to be a typo;
change it to 8192 (the standard 2^13 token limit) unless the +4 is
intentional—if intentional, add an inline comment next to the maxToken
assignment explaining why the extra 4 tokens are required so future readers
understand the offset.
internal/service/kg_pipeline.go (1)
101-101: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Verify maxToken: 8196 — likely typo for 8192.

Same issue as in kg_retrieval.go:112. If 8196 is intentional, document the rationale; otherwise, correct to 8192 (2^13).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/service/kg_pipeline.go` at line 101, The maxToken value is set to
8196 (symbol: maxToken) which appears to be a typo—change it to 8192 (2^13) or,
if 8196 was intentional, add a comment/docstring next to the maxToken assignment
explaining the rationale; also mirror the same correction or documentation as
done for kg_retrieval.go:112 so both places are consistent.
internal/service/kg_types.go (1)
34-38: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Critical: KGRelation redeclared — compilation blocked.

The pipeline reports KGRelation redeclared in this block; other declaration in internal/service/kg_search.go:39:6. This is the same root cause as the KGEntity redeclaration above.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/service/kg_types.go` around lines 34 - 38, KGRelation (and the
earlier KGEntity) are defined twice which causes a redeclaration compile error;
remove the duplicate type definition from this file and reuse the single
canonical type instead (or merge any differing fields into one shared
definition). Concretely: delete the KGRelation declaration in this diff (or
reconcile its fields with the existing KGRelation/KGEntity declaration) so only
one type named KGRelation exists; update any local references to use that single
definition and ensure imports/builds compile against the unified type.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@internal/service/kg_pipeline.go`:
- Line 101: The maxToken value is set to 8196 (symbol: maxToken) which appears
to be a typo—change it to 8192 (2^13) or, if 8196 was intentional, add a
comment/docstring next to the maxToken assignment explaining the rationale; also
mirror the same correction or documentation as done for kg_retrieval.go:112 so
both places are consistent.

In `@internal/service/kg_retrieval.go`:
- Line 112: The maxToken value set to 8196 on the KG retrieval configuration
(symbol: maxToken) appears to be a typo; change it to 8192 (the standard 2^13
token limit) unless the +4 is intentional—if intentional, add an inline comment
next to the maxToken assignment explaining why the extra 4 tokens are required
so future readers understand the offset.

In `@internal/service/kg_types.go`:
- Around line 34-38: KGRelation (and the earlier KGEntity) are defined twice
which causes a redeclaration compile error; remove the duplicate type definition
from this file and reuse the single canonical type instead (or merge any
differing fields into one shared definition). Concretely: delete the KGRelation
declaration in this diff (or reconcile its fields with the existing
KGRelation/KGEntity declaration) so only one type named KGRelation exists;
update any local references to use that single definition and ensure
imports/builds compile against the unified type.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f0cd04be-97f6-4266-b44e-22da25057aad

📥 Commits

Reviewing files that changed from the base of the PR and between 8ba64b2 and e3ddf7d.

📒 Files selected for processing (8)

internal/service/kg_pipeline.go
internal/service/kg_retrieval.go
internal/service/kg_retrieval_test.go
internal/service/kg_scoring_funcs.go
internal/service/kg_scoring_funcs_test.go
internal/service/kg_types.go
internal/service/tenant.go
internal/service/tenant_test.go

coderabbitai

🧹 Nitpick comments (1)

internal/service/tenant.go (1)
793-795: ⚡ Quick win

Inconsistent nil-guard coverage across the service.

These nil guards for userTenantDAO are only present in RemoveMember and AcceptInvite, but at least 10 other methods in this service (GetTenantList, ListMembers, AddMember, etc.) call DAOs without similar guards. This creates an inconsistent defensive-programming pattern.

Consider one of these approaches:

Preferred (lower effort): Remove these guards and ensure TenantService is always properly constructed via NewTenantService() in both production and test code. Use proper mocking/initialization in tests rather than relying on nil checks.

Alternative: Apply nil guards consistently to all DAO/engine calls across the service if uninitialized service instances are a supported use case.

Partial nil-guard coverage creates maintenance confusion about which methods are "safe" to call on an uninitialized service.

Also applies to: 804-806
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/service/tenant.go` around lines 793 - 795, The two one-off nil
guards around userTenantDAO in RemoveMember and AcceptInvite create inconsistent
behavior; remove those guards and ensure TenantService is always constructed
with initialized DAOs via NewTenantService (set userTenantDAO and any other
DAOs/engines there) and update tests to use NewTenantService or explicitly
inject mocks (affects RemoveMember, AcceptInvite, GetTenantList, ListMembers,
AddMember and other DAO-calling methods); alternatively, if uninitialized
services must be supported, add the same nil-check pattern consistently to every
method that uses DAOs instead of leaving only these two checks.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@internal/service/tenant.go`:
- Around line 793-795: The two one-off nil guards around userTenantDAO in
RemoveMember and AcceptInvite create inconsistent behavior; remove those guards
and ensure TenantService is always constructed with initialized DAOs via
NewTenantService (set userTenantDAO and any other DAOs/engines there) and update
tests to use NewTenantService or explicitly inject mocks (affects RemoveMember,
AcceptInvite, GetTenantList, ListMembers, AddMember and other DAO-calling
methods); alternatively, if uninitialized services must be supported, add the
same nil-check pattern consistently to every method that uses DAOs instead of
leaving only these two checks.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b7a151fb-1d3e-434a-aa22-1dca184e91d0

📥 Commits

Reviewing files that changed from the base of the PR and between e3ddf7d and a0d0fe6.

📒 Files selected for processing (8)

internal/service/kg_pipeline.go
internal/service/kg_retrieval.go
internal/service/kg_retrieval_test.go
internal/service/kg_scoring_funcs.go
internal/service/kg_scoring_funcs_test.go
internal/service/kg_types.go
internal/service/tenant.go
internal/service/tenant_test.go

🚧 Files skipped from review as they are similar to previous changes (7)

internal/service/kg_types.go
internal/service/tenant_test.go
internal/service/kg_scoring_funcs.go
internal/service/kg_pipeline.go
internal/service/kg_scoring_funcs_test.go
internal/service/kg_retrieval.go
internal/service/kg_retrieval_test.go

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

- Replace e.Sim with e.Similarity for KGEntity - Add NhopEntity, Edge, EdgeScore, ScoredEntity, ScoredRelation to kg_search.go - Add Sim, PageRank fields to KGRelation - Fix test field references Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…thods Replace three inline go func closures with named Pipeline methods. Reduces Retrieval() visual noise by ~80 lines while keeping the parallelism pattern explicit. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

xugangqiang added the ci Continue Integration label Jun 5, 2026

xugangqiang marked this pull request as ready for review June 5, 2026 05:16

dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. 💞 feature Feature request, pull request that fullfill a new feature. labels Jun 5, 2026

coderabbitai Bot reviewed Jun 5, 2026

View reviewed changes

dosubot Bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Jun 5, 2026

xugangqiang and others added 10 commits June 5, 2026 15:38

feat: add KGSearchRetrieval function for full KG pipeline

267a667

Combines entity/relation/community search with N-hop analysis, score fusion, and token-budgeted content assembly. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

fix: remove dead idxnms and typeSamples code

4cf01d1

fix: reorder token budget to match Python (entities/relations first, …

2798d28

…community second)

refactor: replace queryRewrite callback with direct ty2entsJSON param…

df924c4

…eter

fix: search all entities from query_rewrite, not just entities[0]

1c2449e

Use strings.Join(entities, " ") so multiple entity names are all passed to the ES search query, matching Python's behavior. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

coderabbitai Bot reviewed Jun 5, 2026

View reviewed changes

xugangqiang force-pushed the feat/go-kg-retrieval branch from e3ddf7d to a0d0fe6 Compare June 5, 2026 07:43

xugangqiang marked this pull request as draft June 5, 2026 07:46

xugangqiang marked this pull request as ready for review June 5, 2026 07:46

coderabbitai Bot reviewed Jun 5, 2026

View reviewed changes

xugangqiang and others added 3 commits June 5, 2026 15:56

fix: remove kg_types.go (types now in kg_search.go)

838165d

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

dosubot Bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels Jun 5, 2026

xugangqiang force-pushed the feat/go-kg-retrieval branch from 2b108fd to e49a1a5 Compare June 5, 2026 09:18

dosubot Bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels Jun 5, 2026

xugangqiang requested review from JinHai-CN, yingfeng and yuzhichang June 5, 2026 09:59

yingfeng merged commit ea79d65 into infiniflow:main Jun 5, 2026
2 of 3 checks passed

coderabbitai Bot mentioned this pull request Jun 5, 2026

feat: Dify-compatible retrieval API endpoint #15704

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add KGSearchRetrieval for full KG pipeline (N-hop, scoring, query_rewrite, community)#15690

feat: add KGSearchRetrieval for full KG pipeline (N-hop, scoring, query_rewrite, community)#15690
yingfeng merged 13 commits into
infiniflow:mainfrom
xugangqiang:feat/go-kg-retrieval

xugangqiang commented Jun 5, 2026

Uh oh!

coderabbitai Bot commented Jun 5, 2026 •

edited

Loading

Reviews paused

Walkthrough

Changes

Sequence Diagram

Estimated Code Review Effort

Possibly Related PRs

Suggested Reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 5, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

codecov Bot commented Jun 5, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

xugangqiang commented Jun 5, 2026

Summary

Components

Testing

Uh oh!

coderabbitai Bot commented Jun 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram

Estimated Code Review Effort

Possibly Related PRs

Suggested Reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented Jun 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coderabbitai Bot commented Jun 5, 2026 •

edited

Loading

codecov Bot commented Jun 5, 2026 •

edited

Loading