Hacker News

helloplanets today at 7:47 AM

> Both conditions used GitHub Copilot (Claude Sonnet 4.5 or Haiku 4.5, depending on study) running in VS Code within isolated Docker containers. The only difference was Mouse tool availability. (https://hic-ai.com/papers/mouse-paper-v13.pdf)

Haiku/Sonnet 4.5 on GitHub Copilot is not a valid comparison whatsoever.

You need to benchmark against Claude Code running Opus. I mean, being revolutionary is a big claim to fame.


Replies

handfuloflight today at 7:53 AM

I guess this is what is meant by AI psychosis?
