Interesting, but I think the article’s argument for the "bitter lesson" relies on logical fallacies. First, it misrepresents critics of scaling as dismissing compute entirely, then frames scaling and optimization as mutually exclusive strategies (a false dilemma), ignoring their synergy. DeepSeek’s algorithmic innovations under export constraints, for example, augmented rather than replaced its scaling efforts. The article also overgeneralizes from limited cases, asserting that compute will dominate the "post-training era" while overlooking potential disruptors like more efficient architectures. And the CEO statements it cites provide only weak support for its claims. A balanced reading of the "bitter lesson" would recognize that scaling general methods (e.g. learning algorithms) inherently requires both compute and innovation.