> Also, as I said, information about the prompts quickly reveals competence / incompetence, and is crucial for management / business in hiring, promotions, managing token budgets, etc.
I fail to see why you would need that kind of information to find out if someone is not competent. This really sounds like an attempt at crazy micro-management.
The "distillation" that you want already exists in various forms: the commit message, the merge request description/comments, the code itself, etc.
Those can (and should) easily be reviewed.
Did you previously monitor what kind of web searches developers were doing when working on a feature/bugfix? Or ask them to document every thought they had while doing so?