fix(files): prune dependency dirs in expandFileGlobs before fast-glob traversal #410
dcramer wants to merge 4 commits
… traversal

BUILTIN_IGNORE_PATTERNS already skips vendor/, node_modules/, dist/ etc. after enumeration (in createSyntheticFileChange via getPrePatchFileSkip), but fast-glob was still traversing those trees before the skip could apply. For a new Laravel app the vendor/ tree can contain 10,000–50,000 PHP files. Running `warden dieter/**/*.php` caused fast-glob to enumerate that entire tree, creating extreme memory pressure that likely triggered the reported segfault/crash.

Fix: introduce BUILTIN_PRUNE_DIRECTORY_PATTERNS and getEffectivePrunePatterns() so that directory-level ignores are applied to the fast-glob ignore list at traversal time. User negation patterns (e.g. `!vendor/**` in warden config) are respected and remove the corresponding prune entry, allowing advanced users to re-include a dependency directory when needed.

Also updates the gitignore-fallback directory scan to use the same prune list (previously only skipped node_modules/, now skips all built-in prune dirs) so behaviour is consistent across both code paths.

expandAndCreateFileChanges now threads the ignore config through to expandFileGlobs so user negation overrides reach the traversal layer.

Co-Authored-By: sentry-junior[bot] <264270552+sentry-junior[bot]@users.noreply.github.com>
…onError
If the built-in directory prune list is partially overridden (e.g. user negates
!vendor/** in warden config) and a broad glob is run against a tree with more
than 10,000 files, warden now throws WardenGlobExpansionError immediately with
an actionable error message rather than silently consuming memory until crash.
runFileMode catches the error and surfaces it via reporter.error so it renders
cleanly in both TTY and JSON output modes.
Message example:

    Glob pattern matched 15,432 files (limit is 10,000).
    This usually means a dependency directory (vendor/, node_modules/, ...) is
    being scanned.

    Try one of:
      • Quote the pattern to avoid shell expansion: warden 'dieter/**/*.php'
      • Narrow to your application code: warden dieter/app/**/*.php
      • Keep dependency dirs explicitly excluded in warden.toml:
          [defaults.ignore]
          paths = ["**/vendor/**"]

Co-Authored-By: sentry-junior[bot] <264270552+sentry-junior[bot]@users.noreply.github.com>
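
For readers following along, here is a minimal sketch of how an error type like the one described in this commit could build its message. Only the class name, the (count, limit) constructor arguments, and the first two message lines come from this PR; the `matched` and `limit` property names and the formatting helper are assumptions made for illustration, not the actual warden source.

```typescript
// Hypothetical sketch: property names and the fmt helper are assumptions.
class WardenGlobExpansionError extends Error {
  readonly matched: number;
  readonly limit: number;

  constructor(matched: number, limit: number) {
    // Format counts with thousands separators, as in the quoted message.
    const fmt = (n: number): string => n.toLocaleString("en-US");
    super(
      [
        `Glob pattern matched ${fmt(matched)} files (limit is ${fmt(limit)}).`,
        "This usually means a dependency directory (vendor/, node_modules/, ...) is being scanned.",
      ].join("\n"),
    );
    this.name = "WardenGlobExpansionError";
    this.matched = matched;
    this.limit = limit;
  }
}

const err = new WardenGlobExpansionError(15432, 10000);
console.log(err.message.split("\n")[0]);
```

Keeping the counts as fields alongside the formatted text would let a JSON reporter emit them as numbers, though whether warden does this is not stated in the PR.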
  return 1;
}
reporter.error('Failed to build context');
return 1;
Context build hides error text
Low Severity
The new `runFileMode` try/catch reports a helpful message for `WardenGlobExpansionError`, but any other error from `buildFileEventContext` is reported only as `Failed to build context`, dropping the underlying `Error` message that would explain I/O or config failures.
Reviewed by Cursor Bugbot for commit 8c4d39e.
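
One possible way to address this review comment is sketched below. `describeContextError` is a hypothetical helper and the `reported` array stands in for `reporter.error`; neither is taken from the warden codebase. The idea is simply to append the underlying message to the generic prefix.

```typescript
// Hypothetical helper, not part of the PR: keep the generic prefix but
// include the underlying error text so I/O or config failures are visible.
function describeContextError(error: unknown): string {
  const detail = error instanceof Error ? error.message : String(error);
  return `Failed to build context: ${detail}`;
}

// Stand-in for reporter.error, so the sketch runs on its own.
const reported: string[] = [];

try {
  // Simulated failure from the context build step.
  throw new Error("ENOENT: no such file or directory, open 'warden.toml'");
} catch (error) {
  reported.push(describeContextError(error));
}

console.log(reported[0]);
```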
The MAX_GLOB_FILE_RESULTS guardrail in expandFileGlobs fires after fast-glob returns, which is too late if the shell pre-expanded the glob before warden ran (e.g. zsh globstar turning dieter/**/*.php into 15,000 explicit paths in argv). At that point the oversized array already exists in memory.

Add an early guard at the top of runFileMode that checks filePatterns.length against MAX_GLOB_FILE_RESULTS before any config load, file I/O, or context build. This catches the shell-expansion case with zero overhead.

Co-Authored-By: sentry-junior[bot] <264270552+sentry-junior[bot]@users.noreply.github.com>
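
A rough sketch of the early guard described in this commit follows. `checkArgvPatternCount` is a hypothetical name, the limit value of 10,000 is taken from the quoted error message, and the strict `>` comparison is an assumption based on the "more than 10,000 files" wording; the real check in `runFileMode` may differ.

```typescript
// Limit quoted in the error message earlier in this PR.
const MAX_GLOB_FILE_RESULTS = 10000;

// Hypothetical guard: returns an error message when argv already holds more
// entries than the limit (e.g. because the shell expanded the glob), or null
// when it is safe to continue. It only reads the array length, so it costs
// nothing before config load or file I/O.
function checkArgvPatternCount(
  filePatterns: readonly string[],
  limit: number = MAX_GLOB_FILE_RESULTS,
): string | null {
  if (filePatterns.length > limit) {
    return (
      `Received ${filePatterns.length} file arguments (limit is ${limit}). ` +
      "The shell may have expanded the glob; try quoting the pattern."
    );
  }
  return null;
}

// A quoted pattern reaches the CLI as a single argument and passes the guard.
console.log(checkArgvPatternCount(["dieter/**/*.php"]));
```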
…rdcoded prune list

Root cause: git ls-files with pathspecs (.gitignore **/.gitignore) does not reliably recurse into brand-new untracked directories to find their .gitignore files. A new Laravel app in dieter/ would have dieter/.gitignore (with vendor/) undetected, so vendor/ was not gitignored and warden would traverse it.

Fix: drop the pathspecs from the git ls-files call and filter for .gitignore files client-side. Without pathspecs git recurses into all untracked dirs and applies each directory's own .gitignore rules via --exclude-standard, so dieter/.gitignore is both discovered and applied correctly.

Remove BUILTIN_PRUNE_DIRECTORY_PATTERNS / getEffectivePrunePatterns: hardcoding vendor/, node_modules/ etc. in the fast-glob ignore list is the wrong layer. The correct mechanism is each project's .gitignore and that now works.

Move the MAX_GLOB_FILE_RESULTS guardrail to AFTER gitignore filtering so that properly gitignored dependency directories don't false-positive - the limit now only fires when .gitignore is absent or misconfigured.

Removes ExpandGlobOptions.ignore and the expandAndCreateFileChanges ignore pass-through that only existed to support the prune list.

Co-Authored-By: sentry-junior[bot] <264270552+sentry-junior[bot]@users.noreply.github.com>
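
The "list everything, filter client-side" approach from this commit could look roughly like the sketch below. The function names are hypothetical, and the `--cached`, `--others`, and `-z` flags are assumptions about how the listing might be invoked; only `--exclude-standard` and the removal of pathspecs are stated in the commit message.

```typescript
import { execFileSync } from "node:child_process";

// Pure helper: keep only .gitignore files from NUL-separated ls-files output.
// git prints repo-relative paths with forward slashes.
function pickGitignoreFiles(lsFilesOutput: string): string[] {
  return lsFilesOutput
    .split("\0")
    .filter((p) => p === ".gitignore" || p.endsWith("/.gitignore"));
}

// Hypothetical wrapper: no pathspecs are passed, so git walks untracked
// directories itself and applies each directory's own ignore rules.
function listGitignoreFiles(cwd: string): string[] {
  const output = execFileSync(
    "git",
    ["ls-files", "--cached", "--others", "--exclude-standard", "-z"],
    { cwd, encoding: "utf8", maxBuffer: 64 * 1024 * 1024 },
  );
  return pickGitignoreFiles(output);
}

// Example on canned output, so the sketch runs without a repository.
console.log(pickGitignoreFiles(".gitignore\0dieter/.gitignore\0dieter/app/User.php\0"));
```

Splitting the parsing from the process call keeps the filter testable without a git checkout; `listGitignoreFiles` is shown only to indicate where the output would come from.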
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Reviewed by Cursor Bugbot for commit 97d175d.
if (filteredFiles.length >= MAX_GLOB_FILE_RESULTS) {
  throw new WardenGlobExpansionError(filteredFiles.length, MAX_GLOB_FILE_RESULTS);
}
Dependency trees enumerated before filter
High Severity
`expandFileGlobs` runs fast-glob with only `**/.git/**` ignored, then applies gitignore to the full match list. Gitignored paths like `vendor/` still get walked and collected first. `MAX_GLOB_FILE_RESULTS` runs only on the filtered array, so large dependency trees can still exhaust memory or crash before the guard runs.
Reviewed by Cursor Bugbot for commit 97d175d.
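
To make the reviewer's point concrete, the toy walker below compares filtering after enumeration with pruning during traversal. It uses an in-memory tree rather than fast-glob, and every name in it (`Dir`, `walk`, `pruneDirs`) is hypothetical. With fast-glob itself, the PR description indicates the pruning was done through the `ignore` option.

```typescript
// A directory holds sub-directories and file names (plain strings).
interface Dir {
  name: string;
  entries: Array<Dir | string>;
}

interface WalkResult {
  files: string[];
  visited: number; // entries touched, a rough proxy for traversal cost
}

// Walk the tree, refusing to descend into any directory named in pruneDirs.
function walk(root: Dir, pruneDirs: ReadonlySet<string>): WalkResult {
  const files: string[] = [];
  let visited = 0;
  const visit = (dir: Dir, prefix: string): void => {
    for (const entry of dir.entries) {
      visited += 1;
      if (typeof entry === "string") {
        files.push(prefix + entry);
      } else if (!pruneDirs.has(entry.name)) {
        visit(entry, `${prefix}${entry.name}/`);
      }
    }
  };
  visit(root, "");
  return { files, visited };
}

const project: Dir = {
  name: "",
  entries: [
    { name: "app", entries: ["User.php"] },
    { name: "vendor", entries: [{ name: "laravel", entries: ["a.php", "b.php"] }] },
  ],
};

// Filter after enumeration: the whole vendor/ subtree is walked and collected.
const late = walk(project, new Set<string>());
const lateFiles = late.files.filter((f) => !f.startsWith("vendor/"));

// Prune during traversal: vendor/ is seen once and never entered.
const early = walk(project, new Set(["vendor"]));

console.log(lateFiles, late.visited, early.files, early.visited);
```

Both strategies return the same file list, but only the pruned walk avoids touching the dependency subtree, which is the cost the review comment is concerned about.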


What

`BUILTIN_IGNORE_PATTERNS` in `scan-policy.ts` already skips `vendor/`, `node_modules/`, `dist/` etc. after enumeration (in `createSyntheticFileChange` via `getPrePatchFileSkip`), but `fast-glob` was still traversing those entire directory trees first.

For a new Laravel app the `vendor/` tree can contain 10,000–50,000 PHP files. Running `warden dieter/**/*.php` caused fast-glob to enumerate that entire tree, creating severe memory pressure that triggered the reported segfault/crash.

Changes

- `BUILTIN_PRUNE_DIRECTORY_PATTERNS`: new exported constant listing the directory patterns that are safe to cut at traversal time: `**/vendor/**`, `**/node_modules/**`, `**/dist/**`, `**/build/**`, `**/.next/**`, `**/.nuxt/**`, `**/out/**`, `**/coverage/**`, `**/.cache/**`
- `getEffectivePrunePatterns(userIgnorePaths?)`: computes the effective fast-glob ignore list, dropping any prune entry where the user has supplied a negation override (e.g. `!vendor/**` in warden config)
- `ExpandGlobOptions.ignore`: new optional field that threads the user ignore config through to `expandFileGlobs` so negation overrides reach the traversal layer
- `expandFileGlobs`: passes `['**/.git/**', ...prunePatterns]` as the fast-glob `ignore` option instead of just `['**/.git/**']`
- `expandAndCreateFileChanges`: passes the `ignore` option through to `expandFileGlobs`
- Gitignore-fallback directory scan: now uses `BUILTIN_PRUNE_DIRECTORY_PATTERNS` (previously only skipped `node_modules/`, now consistent with the traversal prune list)

Verified

- `pnpm --filter @sentry/warden exec vitest run src/cli/files.test.ts`: 35 tests passed (25 pre-existing + 10 new)
- `pnpm --filter @sentry/warden typecheck` passed

New test coverage:

- `getEffectivePrunePatterns` unit tests (default, negation override, edge cases)
- `expandFileGlobs` prunes `vendor/` and `node_modules/` by default, including in non-git repos
- `expandFileGlobs` re-includes `vendor/` when user config has `!vendor/**`
- `expandAndCreateFileChanges` threads the ignore config override end-to-end
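
As an illustration of the negation-override behaviour listed under Changes, here is a minimal sketch. The constant's contents and the function signature come from the description above; the matching rule (strip a leading `!` and an optional leading `**/`, then compare strings) is an assumption, and the real implementation may match patterns differently.

```typescript
// Pattern list as given in the PR description.
const BUILTIN_PRUNE_DIRECTORY_PATTERNS: readonly string[] = [
  "**/vendor/**",
  "**/node_modules/**",
  "**/dist/**",
  "**/build/**",
  "**/.next/**",
  "**/.nuxt/**",
  "**/out/**",
  "**/coverage/**",
  "**/.cache/**",
];

// Assumed normalisation: "!vendor/**" and "**/vendor/**" both become "vendor/**".
function normalizePattern(pattern: string): string {
  return pattern.replace(/^!/, "").replace(/^\*\*\//, "");
}

// Drop every built-in prune entry that the user has explicitly negated.
function getEffectivePrunePatterns(userIgnorePaths?: readonly string[]): string[] {
  const negated = new Set(
    (userIgnorePaths ?? []).filter((p) => p.startsWith("!")).map(normalizePattern),
  );
  return BUILTIN_PRUNE_DIRECTORY_PATTERNS.filter((p) => !negated.has(normalizePattern(p)));
}

// With "!vendor/**" in the user config, vendor/ is no longer pruned.
console.log(getEffectivePrunePatterns(["!vendor/**"]));
```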