storus • yesterday at 8:40 PM • 1 reply

> to decide halfway through following my detailed instructions that it would be "simpler" to just... not do what I asked

That likely comes from the 3:1 ratio of linear to quadratic attention usage. The latest DeepSeek suffers from it too, whereas the original R1 never exhibited it.

Replies

nl • today at 6:08 AM

There is no way you can diagnose this like that. Correlation isn't causation, and a common source of reinforcement training data is a much more likely explanation.
