We already have that in the form of separate reasoning/thinking and speaking streams. Even with that it's awfully hard to get LLMs to keep it consistently concise. As soon as that context window starts growing it falls right back into verbosity without constant nudges back.