refactor: decompose Rust build-pipeline process_file and match_js_type_map#1592

Merged
carlos-alm merged 5 commits into main from refactor/titan-call-resolution-rust
Jun 18, 2026

@carlos-alm

Contributor

Summary

Mirrors the TS call-resolution decomposition (PR #1591) for the Rust native engine. Decomposes the highest-complexity Rust functions in the build pipeline.

  • process_file: cog 73→14, bugs 3.72→0.69 (81% reduction each)
  • match_js_type_map: cog 122→thin dispatcher (~95% reduction)
  • write_dataflow in pipeline.rs: extracted focused helper
  • do_insert_nodes in insert_nodes.rs: decomposed

Titan Audit Context

Changes

  • crates/codegraph-core/src/domain/graph/builder/stages/build_edges.rs: process_file decomposed into focused sub-functions; match_js_type_map → thin dispatcher
  • crates/codegraph-core/src/extractors/javascript.rs: match_js_type_map helpers extracted
  • crates/codegraph-core/src/domain/graph/builder/pipeline.rs: write_dataflow extracted
  • crates/codegraph-core/src/domain/graph/builder/stages/insert_nodes.rs: do_insert_nodes decomposed

Metrics Impact

  • process_file (Rust): cog 73→14, bugs 3.72→0.69
  • match_js_type_map (Rust): cog 122→dispatcher pattern

Test plan

  • CI passes (lint + build + tests)
  • Native engine parity tests pass
  • No new functions above complexity FAIL thresholds (bugs > 1.0)

@greptile-apps

greptile-apps Bot commented Jun 17, 2026

Contributor

Greptile Summary

This PR is a pure complexity-reduction refactoring of the Rust native build pipeline, mirroring an equivalent TypeScript decomposition in PR #1591. No logic changes are introduced — every extracted helper is a semantically faithful lift of code that previously lived inside a single large function body.

  • build_edges.rs: process_file (cog 73→14) is broken into a FileContext struct that consolidates per-file lookup maps, plus focused helpers build_type_map, build_pts_map_for_file, emit_no_receiver_pts_edges, and emit_receiver_pts_edges.
  • javascript.rs: match_js_type_map (cog 122→dispatcher) becomes a thin match that delegates to handle_var_declarator_type_map, handle_param_type_map, handle_assignment_type_map, and handle_field_def_type_map, each using idiomatic early-return guards instead of nested if let chains.
  • pipeline.rs / insert_nodes.rs: write_dataflow and do_insert_nodes are similarly split into single-responsibility helpers with clear doc comments; upsert_file_hashes gains an explicit early-return that replaces a deep if has_file_hashes { ... } block.
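
The dispatcher-plus-guards shape described above can be sketched as follows. This is a minimal illustration, not the PR's code: the `Node` struct, the node-kind strings, and the two handler bodies are hypothetical stand-ins, and the real extractor operates on parsed syntax-tree nodes rather than a flat struct.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a syntax-tree node (assumption for illustration only).
struct Node<'a> {
    kind: &'a str,
    name: Option<&'a str>,
    type_annotation: Option<&'a str>,
}

/// Thin dispatcher: one match arm per node kind, no logic of its own.
fn match_type_map<'a>(node: &Node<'a>, type_map: &mut HashMap<&'a str, &'a str>) {
    match node.kind {
        "variable_declarator" => handle_var_declarator(node, type_map),
        "required_parameter" => handle_param(node, type_map),
        _ => {}
    }
}

/// Records `name -> type` for a typed variable declaration.
fn handle_var_declarator<'a>(node: &Node<'a>, type_map: &mut HashMap<&'a str, &'a str>) {
    // Early-return guards replace `if let Some(..) { if let Some(..) { .. } }` nesting.
    let Some(name) = node.name else { return };
    let Some(ty) = node.type_annotation else { return };
    type_map.insert(name, ty);
}

/// Records a parameter's type, keeping any entry that is already present
/// (a hypothetical policy chosen only to make the two handlers differ).
fn handle_param<'a>(node: &Node<'a>, type_map: &mut HashMap<&'a str, &'a str>) {
    let Some(name) = node.name else { return };
    let Some(ty) = node.type_annotation else { return };
    type_map.entry(name).or_insert(ty);
}
```

Each handler reads top to bottom with no nesting, which is where most of the cognitive-complexity reduction in this style of refactor tends to come from.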

Confidence Score: 5/5

Safe to merge — every extracted function is a direct mechanical lift of existing logic with no behavioral changes; the Rust borrow checker enforces correctness of all lifetime-dependent refactors at compile time.

The diff is a straightforward decomposition: large function bodies are moved into named helpers, nested if-let chains are replaced with early-return guards, and a new FileContext struct bundles what were formerly scattered locals. Each refactored path was traced against the original and found to be semantically identical. No data flow is altered, no edge-case branches are dropped, and all transaction/commit sequencing in insert_nodes.rs is preserved.
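
As a rough sketch of what bundling scattered locals into a context struct looks like, the example below builds a borrowed per-file context once and passes it to a resolver. All field names, the `Call` type, and the `"Type.method"` lookup key are assumptions made for illustration; the PR's actual `FileContext` fields are not shown on this page.

```rust
use std::collections::HashMap;

/// Hypothetical per-file context; the field names are illustrative only.
struct FileContext<'a> {
    /// Variable name -> declared type name.
    type_map: HashMap<&'a str, &'a str>,
    /// Symbol name -> node id for definitions visible in this file.
    def_ids: HashMap<&'a str, u32>,
}

/// Hypothetical call site: a callee name plus an optional receiver variable.
struct Call<'a> {
    name: &'a str,
    receiver: Option<&'a str>,
}

/// Builds every lookup map once, so callers pass one `&FileContext`
/// instead of several separate map arguments.
fn build_file_context<'a>(
    typed_vars: &[(&'a str, &'a str)],
    defs: &[(&'a str, u32)],
) -> FileContext<'a> {
    FileContext {
        type_map: typed_vars.iter().copied().collect(),
        def_ids: defs.iter().copied().collect(),
    }
}

/// Resolves a call to a definition id, using the receiver's type when present.
fn resolve_call(fc: &FileContext<'_>, call: &Call<'_>) -> Option<u32> {
    match call.receiver {
        None => fc.def_ids.get(call.name).copied(),
        Some(recv) => {
            let ty = fc.type_map.get(recv)?;
            // Assumed convention: methods are keyed as "Type.method".
            let qualified = format!("{}.{}", ty, call.name);
            fc.def_ids.get(qualified.as_str()).copied()
        }
    }
}
```

Because the struct only borrows (`'a`), the compiler rejects any helper that lets the context outlive the data it points at, which is the compile-time guarantee the review leans on.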

No files require special attention; all four changed files follow the same mechanical extraction pattern and compile-time Rust guarantees cover the lifetime-sensitive parts in build_edges.rs.

Important Files Changed

| Filename | Overview |
| --- | --- |
| crates/codegraph-core/src/domain/graph/builder/stages/build_edges.rs | Decomposes process_file (cog 73→14): introduces FileContext struct to hold per-file lookup maps, extracts build_file_context, build_type_map, build_pts_map_for_file, emit_no_receiver_pts_edges, and emit_receiver_pts_edges; all branches preserved faithfully |
| crates/codegraph-core/src/extractors/javascript.rs | Converts match_js_type_map from a monolithic match arm into a thin dispatcher calling handle_var_declarator_type_map, handle_param_type_map, handle_assignment_type_map, and handle_field_def_type_map; logic is semantically identical with early-return guards replacing nested if-let chains |
| crates/codegraph-core/src/domain/graph/builder/pipeline.rs | Extracts build_return_type_index and inject_return_types_for_file from propagate_return_types_across_files, and splits the write_dataflow loop body into write_dataflow_arg_flows, write_dataflow_assignments, and write_dataflow_mutations; behavior unchanged |
| crates/codegraph-core/src/domain/graph/builder/stages/insert_nodes.rs | Decomposes do_insert_nodes into four focused helpers (insert_file_nodes, insert_symbol_nodes, upsert_node_batch, upsert_file_hashes); the early-return inversion for has_file_hashes and the removal of inner {} scope blocks are clean style improvements with equivalent semantics |
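
The early-return inversion mentioned for has_file_hashes might look roughly like the sketch below. It is an assumption-laden illustration: the in-memory `HashMap` stands in for the database transaction, and the signature, return value, and meaning of `has_file_hashes` are invented for the example rather than taken from the PR.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the file-hash table; the real helper
/// writes through a database transaction. Returns the number of rows upserted.
fn upsert_file_hashes(
    table: &mut HashMap<String, String>,
    has_file_hashes: bool,
    file_hashes: &[(&str, &str)],
    removed_files: &[&str],
) -> usize {
    // Guard first: bail out instead of wrapping the whole body
    // in `if has_file_hashes { ... }`.
    if !has_file_hashes {
        return 0;
    }
    for (path, hash) in file_hashes {
        table.insert(path.to_string(), hash.to_string());
    }
    for path in removed_files {
        table.remove(*path);
    }
    file_hashes.len()
}
```

The body after the guard is unchanged from what would sit inside the original `if` block, only one indentation level shallower, so the two forms are equivalent.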

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant BE as build_call_edges
    participant PF as process_file
    participant FC as build_file_context
    participant BT as build_type_map
    participant BP as build_pts_map_for_file
    participant NR as emit_no_receiver_pts_edges
    participant RR as emit_receiver_pts_edges

    BE->>PF: file_input, all_nodes
    PF->>FC: file_input, all_nodes
    FC->>BT: file_input
    BT-->>FC: type_map
    FC->>BP: file_input, imported_names
    BP-->>FC: "Option<pts_map>"
    FC-->>PF: FileContext
    loop for each call
        PF->>PF: find_enclosing_caller + resolve_call_targets
        alt targets.is_empty() and no receiver
            PF->>NR: fc, call, caller_id
        end
        alt targets.is_empty()
            PF->>RR: fc, call, caller_id
        end
    end
    PF->>PF: emit_hierarchy_edges

    participant PR as propagate_return_types
    participant BI as build_return_type_index
    participant IJ as inject_return_types_for_file

    PR->>BI: file_symbols
    BI-->>PR: (return_type_index, global_return_types)
    loop for each file with call_assignments
        PR->>IJ: rel_path, symbols, indexes
    end

    participant DN as do_insert_nodes
    participant IF as insert_file_nodes
    participant IS as insert_symbol_nodes
    participant UB as upsert_node_batch
    participant UH as upsert_file_hashes

    DN->>IF: tx, batches
    DN->>IS: tx, batches
    IS-->>DN: (contains_edges, param_of_edges)
    DN->>UB: tx, edges
    DN->>UH: tx, file_hashes, removed_files

Reviews (3): Last reviewed commit: "fix: resolve merge conflicts with main"

Comment on lines +698 to +702
// Engine selection: CODEGRAPH_ENGINE env always wins over config-file value.
if (process.env.CODEGRAPH_ENGINE !== undefined) {
  const val = process.env.CODEGRAPH_ENGINE as 'auto' | 'native' | 'wasm';
  (config.build as Record<string, unknown>).engine = val;
}

Contributor

P2 The CODEGRAPH_ENGINE value is cast to the union type without any runtime validation. If the variable is set to an unrecognized value (e.g. CODEGRAPH_ENGINE=typo), it will be written to config undetected and could silently cause the engine-selection logic downstream to behave unexpectedly. The CODEGRAPH_FAST_SKIP_DIAG path uses an explicit === '1' check, which is a safer pattern for this kind of single-source-of-truth override.

Suggested change
- // Engine selection: CODEGRAPH_ENGINE env always wins over config-file value.
- if (process.env.CODEGRAPH_ENGINE !== undefined) {
-   const val = process.env.CODEGRAPH_ENGINE as 'auto' | 'native' | 'wasm';
-   (config.build as Record<string, unknown>).engine = val;
- }
+ // Engine selection: CODEGRAPH_ENGINE env always wins over config-file value.
+ if (process.env.CODEGRAPH_ENGINE !== undefined) {
+   const val = process.env.CODEGRAPH_ENGINE;
+   if (val === 'auto' || val === 'native' || val === 'wasm') {
+     (config.build as Record<string, unknown>).engine = val;
+   } else {
+     debug(`applyEnvOverrides: ignoring unknown CODEGRAPH_ENGINE value "${val}" (expected auto|native|wasm)`);
+   }
+ }

Contributor Author

Resolved: merged origin/main, which brings in the CODEGRAPH_ENGINE and CODEGRAPH_FAST_SKIP_DIAG test coverage from #1589. The merge also brings in main's validation-with-warning for CODEGRAPH_ENGINE (the P2 finding itself): invalid values now log a warning and fall back to "auto" rather than being cast blindly.
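
The validate-then-fall-back behavior described in this reply can be illustrated with a small sketch. The project's actual override is TypeScript and is not reproduced here; this version is written in Rust, the language of the native engine, and the `Engine` enum, the `resolve_engine` function, and the warning text are all hypothetical.

```rust
/// Hypothetical engine choices mirroring the three accepted values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Engine {
    Auto,
    Native,
    Wasm,
}

/// Applies an environment override on top of a config-file value.
/// Returns the effective engine and, for unrecognized input, a warning message.
fn resolve_engine(env_value: Option<&str>, config_value: Engine) -> (Engine, Option<String>) {
    // Unset variable: the config-file value stands.
    let Some(raw) = env_value else {
        return (config_value, None);
    };
    match raw {
        "auto" => (Engine::Auto, None),
        "native" => (Engine::Native, None),
        "wasm" => (Engine::Wasm, None),
        // Unknown value: warn and fall back to Auto instead of trusting the string.
        other => (
            Engine::Auto,
            Some(format!(
                "unknown CODEGRAPH_ENGINE value \"{other}\" (expected auto|native|wasm); using auto"
            )),
        ),
    }
}
```

Returning the warning instead of printing it keeps the function pure, so a caller could pass `std::env::var("CODEGRAPH_ENGINE").ok().as_deref()` and decide how to log.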

@github-actions

github-actions Bot commented Jun 17, 2026

Contributor

Codegraph Impact Analysis

23 functions changed, 27 callers affected across 4 files

  • propagate_return_types_across_files in crates/codegraph-core/src/domain/graph/builder/pipeline.rs:1304 (4 transitive callers)
  • build_return_type_index in crates/codegraph-core/src/domain/graph/builder/pipeline.rs:1336 (5 transitive callers)
  • inject_return_types_for_file in crates/codegraph-core/src/domain/graph/builder/pipeline.rs:1378 (5 transitive callers)
  • write_dataflow in crates/codegraph-core/src/domain/graph/builder/pipeline.rs:1816 (2 transitive callers)
  • write_dataflow_arg_flows in crates/codegraph-core/src/domain/graph/builder/pipeline.rs:1872 (3 transitive callers)
  • write_dataflow_assignments in crates/codegraph-core/src/domain/graph/builder/pipeline.rs:1896 (3 transitive callers)
  • write_dataflow_mutations in crates/codegraph-core/src/domain/graph/builder/pipeline.rs:1928 (3 transitive callers)
  • build_type_map in crates/codegraph-core/src/domain/graph/builder/stages/build_edges.rs:463 (3 transitive callers)
  • build_pts_map_for_file in crates/codegraph-core/src/domain/graph/builder/stages/build_edges.rs:489 (3 transitive callers)
  • build_file_context in crates/codegraph-core/src/domain/graph/builder/stages/build_edges.rs:552 (17 transitive callers)
  • emit_no_receiver_pts_edges in crates/codegraph-core/src/domain/graph/builder/stages/build_edges.rs:600 (17 transitive callers)
  • emit_receiver_pts_edges in crates/codegraph-core/src/domain/graph/builder/stages/build_edges.rs:662 (17 transitive callers)
  • process_file in crates/codegraph-core/src/domain/graph/builder/stages/build_edges.rs:700 (16 transitive callers)
  • do_insert_nodes in crates/codegraph-core/src/domain/graph/builder/stages/insert_nodes.rs:97 (0 transitive callers)
  • insert_file_nodes in crates/codegraph-core/src/domain/graph/builder/stages/insert_nodes.rs:115 (1 transitive callers)
  • insert_symbol_nodes in crates/codegraph-core/src/domain/graph/builder/stages/insert_nodes.rs:192 (1 transitive callers)
  • upsert_node_batch in crates/codegraph-core/src/domain/graph/builder/stages/insert_nodes.rs:282 (1 transitive callers)
  • upsert_file_hashes in crates/codegraph-core/src/domain/graph/builder/stages/insert_nodes.rs:303 (1 transitive callers)
  • match_js_type_map in crates/codegraph-core/src/extractors/javascript.rs:114 (0 transitive callers)
  • handle_var_declarator_type_map in crates/codegraph-core/src/extractors/javascript.rs:137 (1 transitive callers)

Keep main's validation-with-warning for CODEGRAPH_ENGINE and the
detailed JSDoc for config.build.engine referencing issue #1596.
@carlos-alm

Contributor Author

Addressed Greptile finding: merged origin/main to pick up env-override test coverage for CODEGRAPH_ENGINE and CODEGRAPH_FAST_SKIP_DIAG (landed in #1589). The merge also brings in the validation-with-warning fix for CODEGRAPH_ENGINE (the P2 finding itself) — invalid values now warn and fall back to "auto". All 3126 tests pass locally.

@carlos-alm

Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit 8b6ebe4 into main Jun 18, 2026
33 checks passed
@carlos-alm carlos-alm deleted the refactor/titan-call-resolution-rust branch June 18, 2026 00:07
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 18, 2026