
WhitneyLand · 10/11/2024

I agree it wasn’t that convincing; moreover, the variation wasn’t that dramatic for the large SOTA models.

Why would they write a paper about the inherent reasoning capabilities of “large” language models and then, in the abstract, cherry-pick a number from a tiny 1B-parameter model?