Hacker News

warthog, last Tuesday at 9:02 PM

Can you explain how you achieve this in more detail? I did not see a detailed explanation in the repo's README.


Replies

marcoaapfortes, last Tuesday at 9:13 PM

Fair point, the README focuses more on benchmarks than implementation. Here's the short version:

1. Use `git ls-files` instead of walking the filesystem (huge speed win: dropped the Chromium 59GB scan from 6.6s to 0.46s)
2. Parse each file path into components (folders, filename, extension)
3. Score each file based on how query terms match path components, weighted by position and depth
4. Return the top N matches sorted by score
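For anyone who wants to see the shape of it, here is a rough Python sketch of those four steps. It is not the code from the repo: the function names (`list_tracked_files`, `parse_path`, `score`, `top_matches`) and the weights (3.0 for a filename hit, 2.0 divided by folder distance, 1.0 for an extension hit) are made up for illustration, and the real weighting by position and depth may look quite different.

```python
import subprocess
from pathlib import PurePosixPath


def list_tracked_files(repo_dir="."):
    """Step 1: ask git for tracked files instead of walking the filesystem."""
    out = subprocess.run(
        ["git", "ls-files", "-z"],  # -z: NUL-separated, unquoted paths
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    return [p for p in out.split("\0") if p]


def parse_path(path):
    """Step 2: split a path into (folders, filename stem, extension)."""
    p = PurePosixPath(path)
    return list(p.parent.parts), p.stem, p.suffix.lstrip(".")


def score(path, terms):
    """Step 3: score a path against query terms (hypothetical weights)."""
    folders, stem, ext = parse_path(path)
    total = 0.0
    for term in terms:
        term = term.lower()
        if term in stem.lower():
            total += 3.0  # assumed weight: a filename hit counts most
        elif term == ext.lower():
            total += 1.0  # assumed weight: an extension hit counts least
        else:
            # Folders closest to the file get the largest share.
            for distance, folder in enumerate(reversed(folders)):
                if term in folder.lower():
                    total += 2.0 / (distance + 1)
                    break
    return total


def top_matches(paths, query, n=10):
    """Step 4: return the top N paths with a non-zero score, best first."""
    terms = query.split()
    scored = [(score(p, terms), p) for p in paths]
    scored = [sp for sp in scored if sp[0] > 0]
    scored.sort(key=lambda sp: (-sp[0], sp[1]))  # ties broken by path
    return [p for _, p in scored[:n]]
```

Usage would be something like `top_matches(list_tracked_files("."), "stripe webhook", n=5)`. Note that no file is ever opened; only the path strings are inspected.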

The core insight: `/services/stripe/webhook.handler.ts` already encodes the semantic relationship between "stripe" and "webhook" through its structure. No need to read file contents or generate embeddings.
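To make that concrete, a tiny Python illustration. The `path_terms` helper and its choice of separators (`/`, `.`, `_`, `-`) are assumptions for this example, not necessarily how the tool tokenizes paths.

```python
import re


def path_terms(path):
    """Lowercased terms from a path, split on '/', '.', '_' and '-'."""
    return [t.lower() for t in re.split(r"[/._\-]+", path) if t]


terms = path_terms("/services/stripe/webhook.handler.ts")

# Both query words are recovered from the path alone; the file is never read.
matches = all(word in terms for word in ["stripe", "webhook"])
```

Here `terms` holds the folder names, the filename pieces, and the extension, so a query for "stripe webhook" can be matched without touching file contents.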

I should add an architecture doc to the repo, thanks for the nudge.