I’m waiting to see results on deepswe - that benchmark really seemed accurate for Opus and GPT 5.5…

taf2 • last Tuesday at 11:49 PM
