Fair point, the README focuses more on benchmarks than implementation. Here's the short version:
1. Use `git ls-files` instead of walking the filesystem (huge speed win: scanning the 59GB Chromium repo dropped from 6.6s to 0.46s)
2. Parse each file path into components (folders, filename, extension)
3. Score each file based on how query terms match path components, weighted by position and depth
4. Return top N matches sorted by score
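
To make the four steps concrete, here's a rough Python sketch. It's an illustration, not the actual implementation: the function names (`list_tracked_files`, `parse_path`, `score_path`, `top_matches`) and all the weights are made up for this example, and "weighted by position and depth" is interpreted here as "filename matches beat folder matches, and folders closer to the file beat folders near the root".

```python
import re
import subprocess
from pathlib import PurePosixPath


def list_tracked_files(repo_dir="."):
    """Step 1: ask git for tracked paths instead of walking the filesystem."""
    out = subprocess.run(
        ["git", "ls-files"], cwd=repo_dir,
        capture_output=True, text=True, check=True,
    ).stdout
    return out.splitlines()


def parse_path(path):
    """Step 2: split a path into folders, filename stem, and extension."""
    p = PurePosixPath(path)
    folders = [part.lower() for part in p.parent.parts if part not in ("/", ".")]
    name = p.name.lower()
    stem, _, ext = name.rpartition(".")
    if not stem:  # no dot at all, or a dotfile such as ".gitignore"
        stem, ext = name, ""
    return folders, stem, ext


def score_path(path, query_terms):
    """Step 3: score a path by where each query term matches.

    The weights below are arbitrary placeholders chosen for illustration.
    """
    folders, stem, ext = parse_path(path)
    stem_tokens = [t for t in re.split(r"[^a-z0-9]+", stem) if t]
    score = 0.0
    for term in (t.lower() for t in query_terms):
        if term in stem_tokens:
            score += 3.0  # a match in the filename counts most
        elif term == ext:
            score += 1.0
        else:
            for depth, folder in enumerate(folders):
                if term in folder:
                    # folders closer to the file weigh more than root folders
                    score += 2.0 * (depth + 1) / len(folders)
                    break
    return score


def top_matches(paths, query, n=10):
    """Step 4: return the n best-scoring paths, dropping non-matches."""
    terms = query.split()
    scored = [(score_path(p, terms), p) for p in paths]
    scored = [(s, p) for s, p in scored if s > 0]
    scored.sort(key=lambda sp: (-sp[0], sp[1]))  # best score first, ties by path
    return [p for _, p in scored[:n]]
```

In a real repo you'd call `top_matches(list_tracked_files(), "stripe webhook")`. With a toy list containing `services/stripe/webhook.handler.ts` and `services/stripe/client.ts`, the handler ranks first because it matches both terms, one of them in the filename.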
The core insight: `/services/stripe/webhook.handler.ts` already encodes the semantic relationship between "stripe" and "webhook" through its structure. No need to read file contents or generate embeddings.
I should add an architecture doc to the repo, thanks for the nudge.