Hacker News

beernet, today at 8:25 AM

No pelican? I don't believe it.

More seriously, LLM eval is totally broken, judging by the related articles on HN.