logoalt Hacker News

adrianNtoday at 10:58 AM0 repliesview on HN

That sounds intuitively true, but I’m not convinced that it is actually the case. I don’t think we know enough about neural network training to say what training and how many parameters are necessary for what kind of performance on which tasks. To me it looks like we currently guess that more is better and try to throw as much compute and data at the problem as is economically feasible. There is little incentive for companies to invest into small model research since their moat is huge models that require special hardware to run.