Hacker News

impossiblefork, today at 6:59 PM

Yeah, but if its final performance comes from being trained on data from a bigger model, one can question whether it's a way to build genuinely new 40B models.