Hacker News

gilfaethwy · yesterday at 7:37 PM · 0 replies

I'm concerned that developing better metacognition is really just throwing more finite resources at the problem. We surely don't have unlimited compute or unlimited (V)RAM, so there must be a wall somewhere. If it could be demonstrated that this improved metacognition came without an associated increase in resource utilization, I would find these improvements much more convincing... but as things stand, we're very much not there.

(There may be an argument here re: the move from dense to MoE models, but all the research I'm aware of suggests that MoE models are not dramatically more efficient than dense models - i.e., active parameter count is not the overriding factor, and total parameter count is still extremely important, though performance does seem to roughly follow a power law in parameter count.)
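To make the power-law remark concrete: scaling-law results are typically modeled as loss L(N) = a * N^(-b) in parameter count N, which is linear in log-log space and easy to fit. The sketch below uses synthetic illustrative numbers (the parameter counts, losses, and exponent are assumptions, not measurements from any paper) just to show the shape of such a fit.

```python
import numpy as np

# Hedged sketch: scaling-law papers commonly model loss vs. parameter
# count as a power law, L(N) = a * N**(-b). The values below are
# synthetic, generated from an assumed power law purely to illustrate
# the fitting procedure - they are not real benchmark measurements.
a_true, b_true = 10.0, 0.08
n_params = np.array([1e8, 1e9, 1e10, 1e11])  # hypothetical total parameter counts
loss = a_true * n_params ** (-b_true)        # synthetic losses from the assumed law

# In log-log space the power law is linear: log L = log a - b * log N,
# so an ordinary linear fit recovers the exponent and prefactor.
slope, intercept = np.polyfit(np.log(n_params), np.log(loss), 1)
b_fit, a_fit = -slope, np.exp(intercept)
print(f"fitted exponent b = {b_fit:.3f}, prefactor a = {a_fit:.2f}")
```

The point of the fit being linear in log-log space is that halving loss requires a multiplicative (not additive) increase in total parameters, which is exactly why "just scale it up" runs into the compute/VRAM wall described above.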