> Both conditions used GitHub Copilot (Claude Sonnet 4.5 or Haiku 4.5, depending on study) running in VS Code within isolated Docker containers. The only difference was Mouse tool availability. (https://hic-ai.com/papers/mouse-paper-v13.pdf)
Haiku/Sonnet 4.5 on GitHub Copilot is not a valid comparison whatsoever.
You need to benchmark against Claude Code running Opus. I mean, being revolutionary is a big claim to make.
I guess this is what is meant by AI psychosis?