Deegy • yesterday at 7:46 PM • 1 reply

Is it though? Do we still expect that LLMs will eventually be able to solve problems they haven't seen before? Or do we just want the most accurate autocomplete at the cheapest price at this point?

Replies

sdenton4 • yesterday at 11:10 PM

It indicates that there's a good chance that they have trained on the test set, making the eval scores useless. Even if you have given up on the dream of generalization entirely, you can't meaningfully compare models that have trained on the test set with those that have not.
