creamyhorror • yesterday at 7:48 PM • 5 replies

I've only used 5.4 for 1 prompt (edit: 3@high now) so far (reasoning: extra high, took really long), and it was to analyse my codebase and write an evaluation on a topic. But I found its writing and analysis thoughtful, precise, and surprisingly clearly written, unlike 5.3-Codex. It feels very lucid and uses human phrasing.

It might be my AGENTS.md requiring clearer, simpler language, but at least 5.4's doing a good job of following the guidelines. 5.3-Codex wasn't so great at simple, clear writing.

Replies

torginus • yesterday at 11:23 PM

Honestly, while I'd like to believe you, there's always a post about how $MODEL+1 delivered powerful insights about the very nature of the universe in precise Hegelian dialectic, while $MODEL's output was indistinguishable from a pack of screeching, sexually frustrated bonobos.

dana321 • yesterday at 11:43 PM

5.4 at very high reasoning didn't notice a glaring issue in my codebase that drops all data being sent around the network.

sampton • yesterday at 9:52 PM

That's been my experience as well switching from Opus to Codex. Reasoning takes longer but answers are precise. Claude is sloppy in comparison.

irishcoffee • yesterday at 9:19 PM

> It might be my AGENTS.md requiring clearer, simpler language

If you gave the exact same markdown file to me and I posted the exact same prompts as you, would I get the same results?

pembrook • yesterday at 10:40 PM

The latest research these days is that including an AGENTS.md file only makes outcomes worse with frontier models.
