For some reason everything is 2x (2x cost, 2x avg response time, 2x reasoning and output tokens)...

XCSme • today at 5:59 PM

Double-checking my test harness, but it's the first model that does this, so I doubt the issue is on my side...

EDIT: Harness seems correct; for straight coding tasks they perform identically: https://i.snipboard.io/5xbpzY.jpg
