It strikes me as an Oder of operations problem.
If the system prompt is “flesh out template y with thought x” the form drives the generation, it feels compelled to use the whole template.
Of the system prompt is “refine thought x and then format it with the appropriate parts of template y” it becomes a simple transformation and formatting.
A lot of current gen llm pain appears to be being in the early days of understanding the nuance of system prompts.
Totally agree, I think it can get there.
If you present the idea as "an idea" rather than "my idea", it's pretty good at challenging, solidifying, and refining it. It still leans on the same forms in the writing part, but I think it can be tuned to not insist upon the idea's brilliance so breathlessly.