This feels so wrong. The LLM should play the role of a very general (but empty & un-opinionated) brain - you don’t want to perform a coding-specific lobotomy on someone every day. The proper target of their RL should have been the harness, which shapes the agent's trajectory as much as the base model does.
I also wonder: since they’re doing constant RL on model weights against today's Cursor design, does that mean they can never change their system prompt or other parts of the harness?
1) Comparisons against past trajectory data would be meaningless if those trajectories were collected under different instructions.
2) Performance will degrade the next time they change their tool design, since the model is now "opinionated" about how a previous version of Cursor was built.
Anthropic is more sensible with their “constitution” approach to safety: the behaviors (and ultimately the values) you want your model to follow should live in a document, not a lobotomy.