This doesn't seem to really address synthetic data, let alone RL-based reasoning.
Why would it? Once those are introduced, advancement leaves behind pure scaling.