Hacker News

olliepro yesterday at 6:40 PM

Do you have a better way to measure LLMs? Measurement implies quantitative evaluation... which is the same as benchmarks.


Replies

Wowfunhappy yesterday at 8:28 PM

I don’t have a good way to measure them, but I think they should be evaluated more like how we evaluate movies or restaurants. Namely, experienced critics try them and write reviews.