Hacker News

epolanski yesterday at 3:50 PM, 2 replies

The last thing a proper benchmark should do is reveal its own API key.


Replies

sejje yesterday at 4:17 PM

That's a good thought I hadn't had, actually.

plagiarist yesterday at 4:58 PM

IMO it should need a third party running the LLM anyway. Otherwise the evaluated company could notice they're receiving the same requests daily and discover the benchmarking that way.
