
anuramat, today at 6:49 PM

"some model I don't get to use is much better at benchmarks"

pick one or more: comically huge model, test time scaling at 10e12W, benchmark overfit


Replies

estearum, today at 6:58 PM

So... you're not excited because it might take a few months before we can use it or something? I don't get your comment.
