Hacker News

Chu4eeno, yesterday at 7:12 PM

That's because post-training optimizes for the benchmarks that test exactly that.

They tend to collapse into nonsense and hallucinations pretty quickly if you move slightly out of the envelope of the current visual reasoning benchmaxxing.