Hacker News

Grosvenor · yesterday at 7:48 PM

Could this generate pressure to produce less memory-hungry models?


Replies

hodgehog11 · yesterday at 8:07 PM

There has always been pressure to do so, but there are fundamental bottlenecks in performance when it comes to model size.

One possibility is a push toward training exclusively on search-based rewards, so that the model isn't required to compress a large proportion of the internet into its weights. But this is likely to be much slower and to come with initial performance costs that frontier model developers won't want to incur.
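To make the idea concrete, here's a toy sketch of what a search-grounded reward might look like. Everything here (the `retrieve` and `grounded_reward` functions, the tiny corpus) is hypothetical, invented for illustration, not anything a frontier lab has published: the point is only that if reward is granted solely for answers supported by retrieved text, the model gains nothing from memorizing the facts themselves.

```python
# Toy sketch of a search-grounded reward (all names hypothetical).
# The model earns reward only when its answer is supported by a
# retrieved document, so facts need not be stored in the weights.

def retrieve(query, corpus):
    """Naive stand-in for a search engine: return documents that
    share at least one word with the query."""
    terms = set(query.lower().split())
    return [doc for doc in corpus if terms & set(doc.lower().split())]

def grounded_reward(answer, retrieved_docs):
    """Reward 1.0 only if the answer text appears in some retrieved doc."""
    return 1.0 if any(answer.lower() in d.lower() for d in retrieved_docs) else 0.0

corpus = [
    "Paris is the capital of France.",
    "The Nile flows through Egypt.",
]
docs = retrieve("capital of France", corpus)
print(grounded_reward("Paris", docs))  # supported by search -> 1.0
print(grounded_reward("Lyon", docs))   # unsupported -> 0.0
```

A real version would use an actual retriever and a learned verifier rather than substring matching, but the incentive structure is the same: the slowness the comment mentions comes from running retrieval inside every training rollout.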

lofaszvanitt · yesterday at 8:12 PM

Of course, and then watch those companies get reined in.