>Now they have to be lucky to be 6 months ahead of an open model with at most half the parameter count, trained on 1%-2% of the hardware US models are trained on.
Maybe there's a limit to what training can achieve, and throwing more hardware at it yields very little improvement?