Hacker News

hasteg, yesterday at 11:52 PM

IIRC, when they make a big enough architecture change to the model, they will need to rerun pretraining. So it's not like they’re just feeding it more data (they will be, but that will be a drop in an S3 bucket compared to their dataset reserves), but rather that they’re training models with different architectures.