The suggestion was even simpler - feed a reasoning model a prompt like “tell me a few reasons a user might’ve created this Cursor rule: {RULE_TEXT}”.
Do that for a bunch of rules scraped from a bunch of repos - and you’ve got yourself a dataset for training a new model, or maybe for fine-tuning.
Yeah, go for it.
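For what it's worth, a rough sketch of that loop could look like the following. Everything named here is hypothetical: `ask_model` stands in for whatever reasoning-model client you end up using (it's just a function that takes a prompt string and returns text), and the record fields `rule`, `prompt`, and `reasons` are only one possible layout, not a required format.

```python
import json
from typing import Callable, Dict, Iterable, List

# Prompt wording taken from the suggestion above; {RULE_TEXT} is the placeholder.
PROMPT_TEMPLATE = (
    "Tell me a few reasons a user might've created this Cursor rule: {RULE_TEXT}"
)


def build_prompt(rule_text: str) -> str:
    # Plain string replacement (not str.format) so that curly braces
    # inside a scraped rule don't raise or get misinterpreted.
    return PROMPT_TEMPLATE.replace("{RULE_TEXT}", rule_text.strip())


def build_dataset(
    rules: Iterable[str],
    ask_model: Callable[[str], str],  # hypothetical: wraps your model client
) -> List[Dict[str, str]]:
    """Ask the model about each scraped rule and collect one record per rule."""
    records = []
    for rule in rules:
        if not rule.strip():
            continue  # skip empty rule files
        prompt = build_prompt(rule)
        records.append(
            {"rule": rule.strip(), "prompt": prompt, "reasons": ask_model(prompt)}
        )
    return records


def to_jsonl(records: List[Dict[str, str]]) -> str:
    # One JSON object per line, a common shape for fine-tuning data.
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

To try it without a real model, pass a stub such as `lambda prompt: "placeholder"` as `ask_model` and write the result of `to_jsonl(...)` to a file.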