9 changes: 5 additions & 4 deletions AGENTS.md
@@ -22,8 +22,8 @@
| Rust Edition | `2024` (requires rustc 1.95+) |
| Registry Schema | `v36` (`src/registry/migrate.rs`) |
| MCP Tools | **71** (`pub struct Devkit*Tool` in `src/mcp/tools/*.rs`) |
| Test functions | **605** (`cargo test --workspace -- --list`) |
| Ignored tests | 7 (`#[ignore]`: 6 in `src/` + 1 in `crates/devbase-embedding`) |
| Test functions | **616** (`cargo test --workspace -- --list`) |
| Ignored tests | 6 (`#[ignore]` in `src/`) |
| Workspace Crates | **12** (`crates/` directory) |
| `src/main.rs` line count | 833 lines (within the RF-4 limit of 1000 lines) |
| Clippy | `-D warnings` / CI `-W warnings` |
@@ -108,7 +108,8 @@ devbase/
│ └── cli.rs # 11 integration tests
├── benches/
│ ├── registry_bench.rs
│ └── semantic_index.rs
│ ├── semantic_index.rs
│ └── vault_bench.rs
├── skills/ # example Skills (embed-repo / knowledge-report / search-workspace)
├── scripts/
│ ├── install.ps1 / install.sh
@@ -173,7 +174,7 @@ scripts/invariant-checks/run-checks.ps1
- **Unit tests**: located in `src/**/tests.rs` and in `#[cfg(test)]` blocks.
- **Integration tests**: `tests/cli.rs`, using `assert_cmd` + `tempfile`, with the data directory isolated via `DEVBASE_DATA_DIR`.
- **Crate tests**: each `crates/*/src/*.rs` carries its own tests.
- **Bench**: `criterion`-driven `benches/registry_bench.rs` and `benches/semantic_index.rs`.
- **Bench**: `criterion`-driven `benches/registry_bench.rs`, `benches/semantic_index.rs`, and `benches/vault_bench.rs`
- **Test isolation**:
  - All IO tests use `TempDir` and `StorageBackend` injection; writing directly to `%LOCALAPPDATA%` is forbidden.
  - `.cargo/config.toml` defaults to `RUST_TEST_THREADS=1`; CI uses `--test-threads=4`.
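The isolation rule in the AGENTS.md hunk above (point the CLI at a throwaway data directory through `DEVBASE_DATA_DIR`) can be sketched with the standard library alone. This is an illustration under assumptions, not the project's test code: the real tests are said to use `assert_cmd` and `tempfile`, while `scratch_dir`, `isolated_command`, and `configured_data_dir` below are hypothetical helpers, and the `devbase` program name is a placeholder that is never spawned.

```rust
use std::env;
use std::ffi::OsStr;
use std::fs;
use std::path::{Path, PathBuf};
use std::process::{self, Command};

/// Create a scratch directory under the OS temp dir, unique per process and label.
/// Assumption: this stands in for `tempfile::TempDir`, which the real tests use.
fn scratch_dir(label: &str) -> PathBuf {
    let dir = env::temp_dir().join(format!("devbase-sketch-{}-{}", label, process::id()));
    fs::create_dir_all(&dir).expect("failed to create scratch dir");
    dir
}

/// Build a command whose child process is pointed at `data_dir`
/// through the `DEVBASE_DATA_DIR` environment variable.
fn isolated_command(program: &str, data_dir: &Path) -> Command {
    let mut cmd = Command::new(program);
    cmd.env("DEVBASE_DATA_DIR", data_dir);
    cmd
}

/// Read back the data directory configured on a command, if any.
fn configured_data_dir(cmd: &Command) -> Option<PathBuf> {
    cmd.get_envs()
        .find(|(key, _)| *key == OsStr::new("DEVBASE_DATA_DIR"))
        .and_then(|(_, value)| value.map(PathBuf::from))
}

fn main() {
    let dir = scratch_dir("cli");
    // "devbase" is only a placeholder program name; the command is never spawned here.
    let cmd = isolated_command("devbase", &dir);
    assert_eq!(configured_data_dir(&cmd), Some(dir.clone()));
    fs::remove_dir_all(&dir).expect("failed to clean up scratch dir");
}
```

Setting the variable on the child `Command` rather than on the test process keeps parallel tests from seeing each other's directories, which matters when CI runs with `--test-threads=4`.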
4 changes: 3 additions & 1 deletion CHANGELOG.md
@@ -15,7 +15,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- **Ontology import**: `devkit_ontology_import` MCP tool (Beta tier) and `devbase ontology` CLI (`--dry-run` preview); supports batch import of OpenClaw workspace `ontology/entities/*.json` + `ontology/relations/*.jsonl`
- MCP tool count: 69 → **71** (5 stable / 58 beta / 8 experimental)
- `devkit_document_convert`: Experimental tier MCP tool for PDF/PPTX → Markdown conversion (`pdftotext` / `python-pptx` pipeline), with frontmatter quality annotations
- Stable tool invocation tests completed: `devkit_query_repos`, `devkit_vault_search`, `devkit_vault_read`, `devkit_status`, `devkit_workflow_list`, `devkit_index`
- **Vault Tantivy full-text search**: `devkit_vault_search` prefers BM25 (`search_vault_at`) and falls back to an in-memory scan on empty results or failure; while indexing repositories, `run_index_with_progress` also runs `reindex_vault_with_writer`, writing into the same Tantivy segment
- **Criterion vault benchmark baseline**: added `benches/vault_bench.rs` (`reindex_vault` with 50/200 notes; `search_vault` with single- and multi-keyword queries over 50/200 notes), saved as baseline `v0.20.1`
- Stable tool invocation tests completed: `devkit_query_repos`, `devkit_vault_search` (covering the Tantivy path), `devkit_vault_read`, `devkit_status`, `devkit_workflow_list`, `devkit_index`
- `seed_repo()` lightweight test helper (inserts into the `entities` table only, no side effects)
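
The "prefer the index, fall back to a scan" behavior described in the changelog entry above can be sketched as a small control-flow helper. This is a minimal sketch, not the code in `src/mcp/tools/vault.rs`: `search_with_fallback` and the `NoteId` alias are hypothetical names, and the two closures stand in for the Tantivy path (`search_vault_at`) and the in-memory scan.

```rust
/// A search hit is just a note identifier in this sketch (an assumption).
type NoteId = String;

/// Try the indexed search first; fall back to the scan when the index
/// fails or returns no hits.
fn search_with_fallback<P, F>(query: &str, primary: P, fallback: F) -> Vec<NoteId>
where
    P: FnOnce(&str) -> Result<Vec<NoteId>, String>,
    F: FnOnce(&str) -> Vec<NoteId>,
{
    match primary(query) {
        Ok(hits) if !hits.is_empty() => hits,
        // Empty result or index error: use the slower scan instead.
        _ => fallback(query),
    }
}

fn main() {
    // Stand-in for the in-memory scan.
    let scan = |q: &str| vec![format!("scan:{}", q)];

    // The index answers: its hits are returned unchanged.
    let hits = search_with_fallback("rust", |q| Ok(vec![format!("index:{}", q)]), scan);
    assert_eq!(hits, vec!["index:rust".to_string()]);

    // The index is empty (for example, never built): the scan runs.
    let hits = search_with_fallback("rust", |_| Ok(Vec::new()), scan);
    assert_eq!(hits, vec!["scan:rust".to_string()]);

    // The index errors out: the scan runs as well.
    let hits = search_with_fallback("rust", |_| Err("index unavailable".to_string()), scan);
    assert_eq!(hits, vec!["scan:rust".to_string()]);
}
```

One consequence of treating an empty index result as a fallback trigger is that a query with genuinely no matches pays for both paths; the sketch keeps that trade-off visible rather than hiding it.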

### Fixed
4 changes: 4 additions & 0 deletions Cargo.toml
@@ -13,6 +13,10 @@ harness = false
name = "registry_bench"
harness = false

[[bench]]
name = "vault_bench"
harness = false


[features]
default = ["tui", "mcp", "lang-rust", "lang-python", "lang-js-ts", "lang-go"]
20 changes: 5 additions & 15 deletions KNOWN_ISSUES.md
@@ -62,28 +62,18 @@

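**Suggestion**: consider using a macro or derive to auto-generate `McpToolEnum` and `tier()`, reducing boilerplate.

### Vault note full-text search performance

**Current state**: `devkit_vault_search` performs an in-memory linear scan with string matching over all notes.

**Impact**: with more than 1000 Vault notes, search latency may exceed 1 s.

**Suggestion**: build a Tantivy index for Vault content (reusing the existing symbol_index infrastructure), or at least add a keyword index table.

---

## P3: Documentation and observability

### Missing performance benchmarks

**Current state**: Criterion is listed as a dev-dependency, but there is no actual benchmark suite.

**Suggestion**: build Criterion benchmarks for Index, Query, and VaultSearch, and record baselines as CI artifacts.
There is currently no active P3 debt.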

## Resolved (archive)

| Issue | Resolved in | Commit |
|------|----------|--------|
| Issue | Resolved in | Commit / Implementation |
|------|----------|---------------|
| Vault note full-text search performance | Unreleased | `devkit_vault_search` prefers Tantivy BM25 (`search_vault_at`) with an in-memory scan as fallback; `reindex_vault_with_writer` shares the writer with repository indexing |
| Missing performance benchmarks | Unreleased | Added `benches/vault_bench.rs`; saved Criterion baseline `v0.20.1` |
| `relations` table has zero production read paths | v0.20.1 | `devkit_relation_store/query/delete` + reads in `project_context` |
| Workflow engine has zero MCP exposure | v0.20.1 | `devkit_workflow_list/run/status` |
| `project_context` incomplete | v0.20.1 | Added `known_limits` + `skills` |
4 changes: 2 additions & 2 deletions README.md
@@ -7,7 +7,7 @@
One engine that unifies code context, knowledge memory, and agent reasoning.

[![Version](https://img.shields.io/badge/version-v0.20.1-blue)](https://github.com/juice094/devbase/releases)
[![Tests](https://img.shields.io/badge/tests-605%2B%20passed-brightgreen)](https://github.com/juice094/devbase/actions)
[![Tests](https://img.shields.io/badge/tests-616%2B%20passed-brightgreen)](https://github.com/juice094/devbase/actions)
[![Clippy](https://img.shields.io/badge/clippy-0%20warnings-green)](https://github.com/juice094/devbase/actions)
[![License](https://img.shields.io/badge/license-AGPL--3.0%20%2F%20Commercial-orange)](LICENSE)
[![Rust](https://img.shields.io/badge/rust-1.95%2B-9cf)](https://www.rust-lang.org)
@@ -36,7 +36,7 @@ devbase compiles codebases, notes, and workflows into AI-reasonable structured
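| 📊 **TUI dashboard** | ratatui terminal UI: cross-repository search, safe sync, Skill/Workflow discovery |
| 🔌 **71 MCP tools** | Local-process communication over stdio: repository management, code analysis, knowledge graph, agent memory |
| 🏠 **Local-first** | Zero data leaves the machine: SQLite + Tantivy + tree-sitter, no cloud required |
| 🔍 **Hybrid retrieval** | BM25 full-text + FTS5 skill search + pure-SQL vector search (`cosine_similarity` UDF), zero ML runtime dependencies |
| 🔍 **Hybrid retrieval** | BM25 full-text (repositories + Vault) + FTS5 skill search + pure-SQL vector search (`cosine_similarity` UDF), zero ML runtime dependencies |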

> [Full matrix of all 71 tools → docs/reference/mcp-tools.md](docs/reference/mcp-tools.md)

107 changes: 107 additions & 0 deletions benches/vault_bench.rs
@@ -0,0 +1,107 @@
use criterion::{BenchmarkId, Criterion, black_box, criterion_group, criterion_main};
use devbase::registry::VaultNote;
use devbase::search::{commit_writer, get_writer, init_index_at, search_vault_at};
use devbase::vault::indexer::reindex_vault_core;

fn generate_notes(root: &std::path::Path, count: usize) -> Vec<VaultNote> {
    let mut notes = Vec::with_capacity(count);
    for i in 0..count {
        let id = format!("note-{}.md", i);
        let path = root.join(&id);
        let title = format!("Vault Note {}", i);
        let tags = if i % 3 == 0 {
            vec!["rust".to_string(), "architecture".to_string()]
        } else if i % 3 == 1 {
            vec!["design".to_string()]
        } else {
            vec!["cli".to_string(), "rust".to_string()]
        };
        let body_tag = if i % 2 == 0 { "rust" } else { "architecture" };
        let content = format!(
            "---\ntitle: {}\ntags: [{}]\n---\n\n# {}\n\nThis is content for note {}. It discusses {} patterns and design decisions.\n",
            title,
            tags.join(", "),
            title,
            i,
            body_tag
        );
        std::fs::write(&path, content).unwrap();
        notes.push(VaultNote {
            id,
            path: path.to_string_lossy().to_string(),
            title: Some(title),
            content: String::new(),
            frontmatter: None,
            tags,
            outgoing_links: vec![],
            block_refs: vec![],
            linked_repo: None,
            created_at: chrono::Utc::now(),
            updated_at: chrono::Utc::now(),
        });
    }
    notes
}

fn bench_reindex_vault_core(c: &mut Criterion) {
    let mut group = c.benchmark_group("reindex_vault");
    group.sample_size(50);

    for count in [50usize, 200usize] {
        let notes_tmp = tempfile::tempdir().unwrap();
        let notes = generate_notes(notes_tmp.path(), count);

        let index_tmp = tempfile::tempdir().unwrap();
        let (index, _reader) = init_index_at(index_tmp.path()).unwrap();
        let mut writer = get_writer(&index).unwrap();
        let schema = index.schema();

        group.bench_with_input(BenchmarkId::from_parameter(count), &notes, |b, notes| {
            b.iter(|| {
                reindex_vault_core(notes, &mut writer, &schema).unwrap();
                commit_writer(&mut writer).unwrap();
                black_box(&writer);
            });
        });
    }

    group.finish();
}

fn bench_search_vault_at(c: &mut Criterion) {
    let mut group = c.benchmark_group("search_vault");
    group.sample_size(50);

    for count in [50usize, 200usize] {
        let notes_tmp = tempfile::tempdir().unwrap();
        let notes = generate_notes(notes_tmp.path(), count);

        let index_tmp = tempfile::tempdir().unwrap();
        let index_path = index_tmp.path().to_path_buf();
        let (index, _reader) = init_index_at(&index_path).unwrap();
        let mut writer = get_writer(&index).unwrap();
        let schema = index.schema();
        reindex_vault_core(&notes, &mut writer, &schema).unwrap();
        commit_writer(&mut writer).unwrap();
        drop(writer);
        drop(index);

        for (label, query) in [("single", "rust"), ("multi", "rust design")] {
            group.bench_with_input(
                BenchmarkId::new(format!("{}_docs", count), label),
                &(index_path.clone(), query),
                |b, (path, q)| {
                    b.iter(|| {
                        let results = search_vault_at(path, q, 10).unwrap();
                        black_box(results);
                    });
                },
            );
        }
    }

    group.finish();
}

criterion_group!(benches, bench_reindex_vault_core, bench_search_vault_at);
criterion_main!(benches);
4 changes: 2 additions & 2 deletions docs/README.md
@@ -1,6 +1,6 @@
# devbase documentation guide

> **Project status**: v0.20.1 · Schema v36 · 71 MCP tools · 605 tests · 12 workspace crates
> **Project status**: v0.20.1 · Schema v36 · 71 MCP tools · 616 tests · 12 workspace crates
> **Authoritative entry points**: [`AGENTS.md`](../AGENTS.md) (agent environment guide) · [`CHANGELOG.md`](../CHANGELOG.md) (version changes)
> **Last organized**: 2026-06-13

@@ -12,7 +12,7 @@
|------|------|------|
| Version | v0.20.1 | `Cargo.toml` |
| Rust Edition | 2024 | `Cargo.toml` |
| Tests | 605 passed / 0 failed / 7 ignored | `cargo test --workspace -- --list` |
| Tests | 616 passed / 0 failed / 6 ignored | `cargo test --workspace -- --list` |
| Clippy | `-D warnings` all green | CI |
| Schema | v36 | `src/registry/migrate.rs` |
| MCP Tools | **71** (5 Stable / 62 Beta / 4 Experimental) | `src/mcp/mod.rs` |
166 changes: 82 additions & 84 deletions docs/reference/stable-tools/vault_search.md
@@ -1,84 +1,82 @@
# devkit_vault_search

> **Tier**: Stable (frozen at v0.21.0)
> **Source**: `src/mcp/tools/vault.rs` — `DevkitVaultSearchTool`

Search the devbase Vault (Markdown notes) by keywords across note titles, tags, and full content.

## Purpose

- Find notes related to a topic, architecture decision, or project
- Discover linked concepts via tags or wikilinks
- Locate a note when you only remember fragments of its content
- Check if a topic has been documented before writing a new note

## When NOT to use

- Reading the full content of a known note → use `devkit_vault_read`
- Writing or updating notes → use `devkit_vault_write`
- Finding backlinks to a specific note → use `devkit_vault_backlinks`
- Searching across code repositories → use `devkit_query_repos`

## Input Schema

```json
{
  "type": "object",
  "properties": {
    "query": { "type": "string", "description": "Search keywords" }
  },
  "required": ["query"]
}
```

| Parameter | Type | Required | Default | Description |
|-----------|--------|----------|---------|--------------------------------|
| `query` | string | Yes | — | Space-separated keywords (AND) |

## Matching behavior

- All keywords must match (AND logic)
- Case-insensitive matching across:
- Note ID
- Note title
- Tags (comma-joined)
- Full Markdown body content
- No stemming or fuzzy matching — exact substring only

## Output Schema

```json
{
  "success": true,
  "count": 2,
  "query": "mcp integration",
  "notes": [
    {
      "id": "mcp-integration-guide",
      "title": "MCP Integration Guide",
      "path": "references/mcp-integration.md",
      "tags": ["mcp", "integration", "architecture"]
    }
  ]
}
```

| Field | Type | Description |
|---------|----------|------------------------------------------|
| `id` | string | Note identifier (usually filename stem) |
| `title` | string | Parsed from YAML frontmatter |
| `path` | string | Vault-relative file path |
| `tags` | string[] | Parsed from YAML frontmatter |

## Errors

| Error | Cause |
|--------------------|------------------------------------------|
| `query required` | Missing or empty `query` argument |
| Vault unreadable | Vault directory missing or permission denied |

## Changelog

| Version | Change |
|---------|------------------------------------------|
| v0.21.0 | Schema frozen as Stable |
# devkit_vault_search

> **Tier**: Stable (frozen at v0.21.0)
> **Source**: `src/mcp/tools/vault.rs` — `DevkitVaultSearchTool`

Search the devbase Vault (Markdown notes) by keywords across note titles, tags, and full content using Tantivy BM25 full-text search.

## Purpose

- Find notes related to a topic, architecture decision, or project
- Discover linked concepts via tags or wikilinks
- Locate a note when you only remember fragments of its content
- Check if a topic has been documented before writing a new note

## When NOT to use

- Reading the full content of a known note → use `devkit_vault_read`
- Writing or updating notes → use `devkit_vault_write`
- Finding backlinks to a specific note → use `devkit_vault_backlinks`
- Searching across code repositories → use `devkit_query_repos`

## Input Schema

```json
{
  "type": "object",
  "properties": {
    "query": { "type": "string", "description": "Search keywords" }
  },
  "required": ["query"]
}
```

| Parameter | Type | Required | Default | Description |
|-----------|--------|----------|---------|--------------------------------|
| `query` | string | Yes | — | Space-separated keywords (AND) |

## Matching behavior

- Primary path: Tantivy BM25 full-text search across note title, tags, and full Markdown body content.
- Fallback path: if the Tantivy index is empty/unavailable, case-insensitive substring scan across the same fields.
- All keywords must match (AND logic).
- Run `devbase vault reindex` or `devkit_index` to build/update the search index.
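
The fallback path listed above (case-insensitive substring scan with AND logic over title, tags, and body) can be sketched as follows. This is an illustrative reading of the documented behavior, not the implementation in `src/mcp/tools/vault.rs`: the `Note` struct and `matches_all_keywords` are hypothetical, and treating an empty query as "no match" is an assumption chosen to line up with the `query required` error.

```rust
/// Minimal stand-in for a Vault note; the field set is an assumption for this sketch.
struct Note {
    title: String,
    tags: Vec<String>,
    body: String,
}

/// Return true when every whitespace-separated keyword in `query` occurs,
/// case-insensitively, somewhere in the note's title, tags, or body.
fn matches_all_keywords(note: &Note, query: &str) -> bool {
    // Fields are joined with newlines so a keyword cannot straddle two fields.
    let haystack =
        format!("{}\n{}\n{}", note.title, note.tags.join(","), note.body).to_lowercase();
    let mut keywords = query.split_whitespace().map(str::to_lowercase).peekable();
    // An empty query matches nothing rather than everything.
    keywords.peek().is_some() && keywords.all(|kw| haystack.contains(kw.as_str()))
}

fn main() {
    let note = Note {
        title: "MCP Integration Guide".to_string(),
        tags: vec!["mcp".to_string(), "architecture".to_string()],
        body: "How the server talks to clients over stdio.".to_string(),
    };

    // Both keywords are present (one in the title, one in the body).
    assert!(matches_all_keywords(&note, "mcp STDIO"));
    // AND logic: one missing keyword rejects the note.
    assert!(!matches_all_keywords(&note, "mcp tantivy"));
    // Whitespace-only queries match nothing.
    assert!(!matches_all_keywords(&note, "   "));
}
```

Unlike the BM25 path, this scan does no tokenization or ranking, so `arch` matches `architecture` here while an index built on whole tokens would likely not return it.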

## Output Schema

```json
{
  "success": true,
  "count": 2,
  "query": "mcp integration",
  "notes": [
    {
      "id": "mcp-integration-guide",
      "title": "MCP Integration Guide",
      "path": "references/mcp-integration.md",
      "tags": ["mcp", "integration", "architecture"]
    }
  ]
}
```

| Field | Type | Description |
|---------|----------|------------------------------------------|
| `id` | string | Note identifier (usually filename stem) |
| `title` | string | Parsed from YAML frontmatter |
| `path` | string | Vault-relative file path |
| `tags` | string[] | Parsed from YAML frontmatter |

## Errors

| Error | Cause |
|--------------------|------------------------------------------|
| `query required` | Missing or empty `query` argument |
| Vault unreadable | Vault directory missing or permission denied |

## Changelog

| Version | Change |
|---------|------------------------------------------|
| Unreleased | Primary search path switched to Tantivy BM25; substring scan retained as fallback |
| v0.21.0 | Schema frozen as Stable |