scotty79, today at 5:18 PM

I think models from one year ago, with a proper harness, should easily beat humans at this task on average. Human CEOs' decisions are worse than random chance.