Hacker News

malfist, yesterday at 8:00 PM, 3 replies

I mean, if they've consumed all of human knowledge, what's left for them to train on? This pivot isn't only because it's cheaper and a way to juice the numbers for an IPO; it's survival, because they can't improve any further.


Replies

hasteg, yesterday at 11:52 PM

IIRC, when they make a big enough architecture change to the model, they need to rerun pretraining. So it's not that they're feeding it more data (they will be, but it will be a drop in an S3 bucket compared to their dataset reserves) but rather that they're training models with different architectures.

applicative, yesterday at 8:16 PM

It did sound to me like they feel some sort of wall coming.