Testing at these labs training big models must be wild. It must be so much work to train a "soul" into a model, run it in a lot of scenarios, the venn between the system prompts etc, see what works and what doesn't... I suppose try to guess what in the "soul source" is creating what effects as the plinko machine does its thing, going back and doing that over and over... seems like it would be exciting and fun work but I wonder how much of this is still art vs science?
It's fun to see these little peeks into that world, as it implies to me they are getting really quite sophisticated about how these automatons are architected.
The most detail I've seen of this process is still from OpenAI's postmortem on their sycophantic GPT-4o update: https://openai.com/index/expanding-on-sycophancy/
The answer is "yes". To be really really good at training AIs, you need everyone.
Empirical scientists with good methodology who can set up good tests and benchmarks to make sure everyone else isn't flying blind. ML practitioners who can propose, implement and excruciatingly debug tweaks and new methods, and aren't afraid of seeing 9.5 out of 10 of their approaches fail. Mechanistic interpretability researchers who can peer into model internals, figure out the practical limits and get rare but valuable glimpses of how LLMs do what they do. Data curation teams who select what data sources will be used for pre-training and SFT, and what new data will be created or acquired and then fed into the training pipeline. Low-level GPU specialists who can set up the infrastructure for the training runs and make sure that "works on my scale (3B test run)" doesn't go to shreds when you try a frontier-scale LLM. AI-whisperers, mad but not too mad, who have experience with AIs, possess good intuitions about actual AI behavior, can spot odd behavioral changes, can get AIs to do what they want them to do, and can translate that strange knowledge into capabilities improved or pitfalls avoided.
Very few AI teams have all of that, let alone in good balance. But some try. Anthropic tries.