Hacker News

lossolo · yesterday at 7:33 PM

I wrote about it around a year ago here:

"There weren't really any advancements from around 2018. The majority of the 'advancements' were in the amount of parameters, training data, and its applications. What was the GPT-3 to ChatGPT transition? It involved fine-tuning, using specifically crafted training data. What changed from GPT-3 to GPT-4? It was the increase in the number of parameters, improved training data, and the addition of another modality. From GPT-4 to GPT-40? There was more optimization and the introduction of a new modality. The only thing left that could further improve models is to add one more modality, which could be video or other sensory inputs, along with some optimization and more parameters. We are approaching diminishing returns." [1]

10 months ago, around the o1 release:

"It's because there is nothing novel here from an architectural point of view. Again, the secret sauce is only in the training data. O1 seems like a variant of RLRF https://arxiv.org/abs/2403.14238

Soon you will see similar models from competitors." [2]

Winter is coming.

[1] https://news.ycombinator.com/item?id=40624112

[2] https://news.ycombinator.com/item?id=41526039


Replies

tolerance · yesterday at 7:42 PM

And when winter does arrive, then what? The technology is slowing down while its popularity picks up. Can sparks fly out of snow?
