XCSme, yesterday at 10:49 PM

It doesn't do so well on my stupid benchmarks, lol: https://aibenchy.com

It gets some tests wrong. It does answer correctly, BUT it doesn't respect the request to respond ONLY with the answer: it keeps adding extra explanations at the end.
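
To illustrate the failure mode, here is a rough sketch of how a strict "answer only" check differs from a lenient one. This is not how aibenchy.com actually grades; `strict_match` and `lenient_match` are hypothetical names, and the sample response is made up.

```python
def strict_match(response: str, expected: str) -> bool:
    """Pass only if the whole response is the expected answer
    (ignoring case and surrounding whitespace)."""
    return response.strip().lower() == expected.strip().lower()


def lenient_match(response: str, expected: str) -> bool:
    """Pass if the expected answer appears anywhere in the response."""
    return expected.strip().lower() in response.lower()


# Made-up response: correct answer followed by an unrequested explanation.
response = "42\n\nExplanation: 6 times 7 is 42."

print(strict_match(response, "42"))   # False: extra text breaks the exact match
print(lenient_match(response, "42"))  # True: the answer is in there somewhere
```

With a strict check, a correct answer plus an unrequested explanation counts as a failure, which matches what I'm seeing.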