logoalt Hacker News

reactordevtoday at 1:08 PM1 replyview on HN

I built one into my agent using sqlite…


Replies

itissidtoday at 1:27 PM

Especially for indie users/devs and smaller teams. I built a part of this(the retriever) in < 4 hours https://github.com/itissid/wiki for replacing deepwiki.

I think the challenge is to teach how ranking works to people more effectively so that they can build it for themselves and host them on their own.

Like the other day someone who has worked in search explained to me why you would care about using learning-to-rank(LTR) technique to train your own feature vector weights on your data. My understanding is that weighted features work better(retreival wise) on textual data than plain BM-25 and vector embedding db indexing of text chunks of your data with minimal preprocessing. So if you have lots of conversations you can create a ton of features(like attributes of a conversation) from it and ones that matter more will rank higher. And you can use a regularization(like L1) to kill unimportant ones.

[EDIT]: IIUC, I think LTR is important because you likely want different features to matter more for different parts of your documents, e.g. what matters for codebase documentation is different from your personal journal.

show 1 reply