I agree it wasn’t that convincing; moreover, the variation wasn’t that dramatic for the large SOTA models.
Why would they write a paper about the inherent reasoning capabilities of “large” language models and then, in the abstract, cherry-pick a number from a tiny 1B-parameter model?