There's plenty of evidence that good prompts (via prompt engineering or tuning) can result in better outputs.
Improving LLM output through better inputs is neither an illusion nor as easy as learning how to Google (entire companies are being built around improving LLM outputs and measuring that improvement).
Sure, but tricks & techniques that work with one model often don't translate to others, or are actively harmful, especially when you compare today's models with ones from 6 or more months ago.
Keep in mind that the first reasoning model (o1) was released less than 8 months ago, and Claude Code less than 6 months ago.