K-core based hierarchical community construction as an alternative to Leiden

### Do you need to file an issue?

- [x] I have searched the existing issues and this feature is not already filed.
- [x] My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
- [x] I believe this is a legitimate feature request, not just a question. If this is a question, please use the Discussions area.

### Is your feature request related to a problem? Please describe.

GraphRAG currently constructs its community hierarchy using Leiden clustering (graphrag/index/operations/cluster_graph.py). Leiden is stochastic (seed-dependent), and on large entity graphs the hierarchical community detection step can be a significant portion of indexing time.

In our research ("Core-based Hierarchies for Efficient GraphRAG"), we found that k-core decomposition can build the community hierarchy deterministically and more efficiently, while producing communities of comparable or better quality on the standard GraphRAG global-search benchmarks. There is currently no way to swap the community-detection strategy in GraphRAG without modifying core code.

### Describe the solution you'd like

Add k-core–based hierarchical community construction as an optional, pluggable alternative to Leiden, selectable via config and CLI. Leiden remains the default, so behavior is unchanged unless explicitly opted into.

Concretely:
- A new operation kcore_cluster_graph() that peels the graph by k-core number, splits each level into size-bounded communities, and produces the same kind of Communities structure that Leiden returns (so all downstream workflows are untouched).
- Three heuristic variants from the paper: RkH (residual-aware k-core hierarchy), M2hC, and MRC.
- A new community_algo field on ClusterGraphConfig (default "leiden") and a --community flag on graphrag index.
- A branch in create_communities that dispatches to Leiden or k-core based on that config.
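
To make the peel-then-split idea above concrete, here is a minimal pure-Python sketch. It is illustrative only: it is not the RkH, M2hC, or MRC heuristic from the paper, and every name in it (`build_adjacency`, `core_numbers`, `connected_components`, `kcore_hierarchy_sketch`, the `(level, nodes)` tuple shape) is hypothetical rather than GraphRAG API. The size-bounded split is deliberately naive (fixed-size chunks of a sorted component), so a chunk is not guaranteed to be connected.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs (hypothetical helper)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def core_numbers(adj):
    """Compute each node's k-core number by repeatedly removing a minimum-degree node."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Tie-break on node name so the peel order is fully deterministic.
        v = min(remaining, key=lambda n: (degree[n], n))
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def connected_components(nodes, adj):
    """Connected components of the subgraph induced by `nodes`, each as a sorted list."""
    seen, comps = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            v = stack.pop()
            comp.append(v)
            for u in adj[v]:
                if u in nodes and u not in seen:
                    seen.add(u)
                    stack.append(u)
        comps.append(sorted(comp))
    return comps


def kcore_hierarchy_sketch(adj, max_cluster_size=10):
    """Return (level, nodes) pairs: level k holds size-bounded pieces of the k-core."""
    core = core_numbers(adj)
    communities = []
    for k in range(1, max(core.values(), default=0) + 1):
        k_core_nodes = {v for v, c in core.items() if c >= k}
        for comp in connected_components(k_core_nodes, adj):
            # Naive split (an assumption, not the paper's method): fixed-size chunks.
            for i in range(0, len(comp), max_cluster_size):
                communities.append((k, comp[i:i + max_cluster_size]))
    return communities


# A triangle (a, b, c) with one pendant node d attached to a.
adj = build_adjacency([("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")])
hierarchy = kcore_hierarchy_sketch(adj)
```

Because the peel order and the split both use a fixed tie-break, repeated runs on the same graph return the same hierarchy, which is the determinism property the proposal relies on.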
    
I intend to implement this myself. The work is already prototyped as a fork of GraphRAG v2.7.0, available here: https://github.com/erdemUB/KDD26. I will port it onto the latest main and open a PR with tests and a semversioner change doc. Before I submit, I'd like maintainer input on the preferred extension point: a config-string switch as described above, or a more formal pluggable "clustering strategy" interface.
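
To make the extension-point question easier to discuss, here is a rough sketch of how a config string could select a registered strategy, which sits somewhere between the two options. Everything in it is an assumption for illustration: `ClusterGraphConfigSketch`, `register_clustering`, `create_communities_sketch`, the `"kcore"` value, and the stub strategies are hypothetical and do not mirror the actual GraphRAG code. Only the community_algo field name and its "leiden" default come from the proposal.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Edge = Tuple[str, str]
# Placeholder result shape (level, node list); the real Communities type differs.
Communities = List[Tuple[int, List[str]]]
ClusterFn = Callable[[List[Edge], int], Communities]

_REGISTRY: Dict[str, ClusterFn] = {}


def register_clustering(name: str) -> Callable[[ClusterFn], ClusterFn]:
    """Register a clustering strategy under a config-selectable name (hypothetical)."""
    def decorator(fn: ClusterFn) -> ClusterFn:
        _REGISTRY[name] = fn
        return fn
    return decorator


@dataclass
class ClusterGraphConfigSketch:
    """Stand-in for ClusterGraphConfig carrying the proposed community_algo field."""
    max_cluster_size: int = 10
    community_algo: str = "leiden"  # default keeps existing behavior


@register_clustering("leiden")
def _leiden_stub(edges: List[Edge], max_cluster_size: int) -> Communities:
    # Stub only: a real implementation would call the existing Leiden operation.
    return [(0, sorted({n for e in edges for n in e}))]


@register_clustering("kcore")
def _kcore_stub(edges: List[Edge], max_cluster_size: int) -> Communities:
    # Stub only: a real implementation would call the proposed kcore_cluster_graph().
    return [(1, sorted({n for e in edges for n in e}))]


def create_communities_sketch(
    edges: List[Edge], config: ClusterGraphConfigSketch
) -> Communities:
    """Dispatch to whichever strategy the config names; reject unknown names."""
    try:
        strategy = _REGISTRY[config.community_algo]
    except KeyError:
        known = ", ".join(sorted(_REGISTRY))
        raise ValueError(
            f"unknown community_algo {config.community_algo!r}; expected one of: {known}"
        ) from None
    return strategy(edges, config.max_cluster_size)


default_result = create_communities_sketch([("a", "b")], ClusterGraphConfigSketch())
kcore_result = create_communities_sketch(
    [("a", "b")], ClusterGraphConfigSketch(community_algo="kcore")
)
```

With a registry like this, a plain if/else on the config string and a formal strategy interface share the same call site, so the choice mostly affects how third parties would add further algorithms.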

### Additional context

- This work has been accepted at KDD'26. Paper: Core-based Hierarchies for Efficient GraphRAG — https://arxiv.org/pdf/2603.05207
- The change is small and localized: one new operation file plus a config field, a CLI flag, and a dispatch branch. The new path reuses the existing create_graph, stable_largest_connected_component, and the entire downstream community-report/summarization pipeline unchanged.
