So the question for me is: how important was SO to training LLMs? Now that SO is basically no longer being updated, we've lost that source of new material to train on; instead, models need to be trained on documentation and other LLM output. I'm no expert on this subject, but it seems like the quality of LLMs will degrade over time.
It has often been claimed, and even shown, that training LLMs on their own outputs degrades quality over time. I myself find it likely that in well-measurable domains, gains from RLVR (reinforcement learning with verifiable rewards) will outweigh the capability losses from training new models on "slop".
Yep, exactly. Free data-grabbing honeypots like SO won't work anymore.
Please mark all locations on the map where you would hide during the uprising of the machines.