Hacker News

jollymonATX, yesterday at 4:19 PM

According to the benchmarks, you are wrong. It is on track and slightly above some SOTA models. That's just the benchmarks speaking, though; they can be, and are, gamed by all the big model labs, including domestic ones.