Looking at that example, I too feel it's a thought process I could go through once or twice. I think this highlights an important difference between humans and LLMs: a human can think such things explicitly in their head, or on paper (I do it in a text editor more often than I'd care to admit), once or twice, and then it sticks and quickly becomes more of a "system 1" thing. With LLMs, the closest analogue outside training/fine-tuning is probably prompt caching. It would be great if we could figure out some kind of online learning scheme that lets the model internalize its own thoughts and persist them between conversations, but in the latent space, not prepended as token input.
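
To make the "latent state vs. prepended tokens" distinction concrete, here's a minimal sketch, assuming PyTorch and Hugging Face transformers, with gpt2 and the "note to self" prefix as placeholder stand-ins: run a reusable prefix once, keep the resulting key/value cache, and on the next turn feed only the new tokens instead of re-prepending the prefix text. Provider-side prompt caching does roughly this behind the API, though the details there aren't shown here.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# gpt2 is a stand-in for any causal LM; the prefix is a placeholder "thought".
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

# Turn 1: run the reusable prefix once and keep its key/value cache
# (the latent state), not its text.
prefix_ids = tok("Note to self: cross-multiply to compare fractions.",
                 return_tensors="pt").input_ids
with torch.no_grad():
    prefix_out = model(prefix_ids, use_cache=True)
cache = prefix_out.past_key_values  # per-layer key/value tensors

# Turn 2: feed only the new tokens, reusing the cached latent state
# instead of prepending the prefix tokens again.
new_ids = tok(" Which is larger, 3/7 or 2/5?", return_tensors="pt").input_ids
with torch.no_grad():
    out = model(new_ids, past_key_values=cache, use_cache=True)

next_id = out.logits[0, -1].argmax().item()
print(tok.decode([next_id]))  # greedy next token, just to show it still generates
```

Of course this only saves recomputation: the cache is the same tokens in a different representation, which is exactly why it still falls short of the kind of internalization I'm wishing for.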