When it comes to the evals for this kind of thing, is there a standard set of test data out there th...

urbandw311er • today at 7:06 PM • 0 replies

When it comes to evals for this kind of thing, is there a standard set of test data that one can benchmark against? I.e., a collection of documents plus questions, where each question should result in particular documents or chunks being cited as the most relevant match.
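To make the shape of such a test set concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the toy corpus, the chunk ids, the labelled questions, and the naive `keyword_retriever` are all hypothetical and do not come from any standard benchmark. It only shows how "question -> expected chunks" labels could be scored with hit rate at k and mean reciprocal rank.

```python
from typing import Callable, Dict, List, Set

# Hypothetical toy corpus: chunk id -> chunk text.
CORPUS: Dict[str, str] = {
    "doc1#0": "The cache is invalidated whenever the config file changes.",
    "doc1#1": "Logs are rotated daily and kept for thirty days.",
    "doc2#0": "Authentication uses short-lived tokens refreshed every hour.",
}

# Hypothetical labels: each question lists the chunk ids that should be retrieved.
EVAL_SET: List[dict] = [
    {"question": "When is the cache invalidated?", "relevant": {"doc1#0"}},
    {"question": "How long are logs kept?", "relevant": {"doc1#1"}},
    {"question": "How often are tokens refreshed?", "relevant": {"doc2#0"}},
]

Retriever = Callable[[str, int], List[str]]


def _words(text: str) -> Set[str]:
    # Lowercase, drop trailing punctuation, split on whitespace.
    return set(text.lower().rstrip("?.").split())


def keyword_retriever(question: str, k: int) -> List[str]:
    """Stand-in retriever: rank chunks by word overlap with the question."""
    q = _words(question)
    ranked = sorted(CORPUS, key=lambda cid: -len(q & _words(CORPUS[cid])))
    return ranked[:k]


def hit_rate_at_k(retriever: Retriever, eval_set: List[dict], k: int) -> float:
    """Fraction of questions with at least one relevant chunk in the top k."""
    hits = sum(
        1 for ex in eval_set if ex["relevant"] & set(retriever(ex["question"], k))
    )
    return hits / len(eval_set)


def mean_reciprocal_rank(retriever: Retriever, eval_set: List[dict], k: int) -> float:
    """Average of 1/rank of the first relevant chunk (0 if none in the top k)."""
    total = 0.0
    for ex in eval_set:
        for rank, cid in enumerate(retriever(ex["question"], k), start=1):
            if cid in ex["relevant"]:
                total += 1.0 / rank
                break
    return total / len(eval_set)


if __name__ == "__main__":
    print("hit rate@1:", hit_rate_at_k(keyword_retriever, EVAL_SET, k=1))
    print("MRR@3:", mean_reciprocal_rank(keyword_retriever, EVAL_SET, k=3))
```

Swapping `keyword_retriever` for a real retriever with the same `(question, k) -> list of chunk ids` signature would let the same labels score it; the labels themselves are the part a standard dataset would need to supply.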
