
michaelbuckbee • today at 3:36 PM • 3 replies

Good catch, there was an issue with the second hardest thing in programming (caching).

Here's an updated eval with the proper models https://a3bmfqfom3.evvl.io/

Replies

Is it me or did GPT get noticeably more natural in word choice recently? You can see it between 4.1 and 5.5 here, but I'm not sure when that happened. (My guess would be one of the recent 5.x releases.)

Edit: I meant specifically the absence of bizarre phrasing. That seems to have improved.

reissbaker • today at 6:24 PM

Wow, I'm surprised. Grok 4.3 actually is noticeably better than the other two for the close-friend variant. Surprisingly, I found Claude the cringiest of the three!

wamatt • today at 4:05 PM

Thanks. From where I'm looking, Grok 4.3 and Claude 4.7 do a better job on the informal close friend/coworker vibe.

ChatGPT sounds fake, with formal phrasing (for the specific close-friend context), em-dashes, and capitalization. Hence ChatGPT does not, imo, grok the assignment ;)
