Hacker News

Bnjoroge, today at 3:12 PM

I personally don't put any weight on DeepSWE. Other than 5.5 being directionally the best model, it gets the others pretty wrong in my experience. FrontierCode from Cognition looks interesting.