
dddgghhbbfblk, today at 1:47 AM

I think that's because the framing around this (and similar stories about, e.g., IMO performances) is, imo, slightly wrong. What's interesting isn't that they can get a gold medal, in the sense of ranking them against human competitors. As you say, the direct comparisons, while not entirely meaningless, are very hard to interpret even in the best of cases. It's very much an apples-to-oranges situation.

Rather, the impressive thing is simply that an AI is capable of solving these problems at all. These are novel (i.e., not in the training set) problems that are really hard and beyond the ability of most professional programmers. The "gold medal" part is informative mainly as an indication of how many problems the AI was able to solve and how well it solved them.

When talking with some friends about ChatGPT just a couple of years ago, I remember being very confident that there was no way this technology would be able to solve this kind of novel, very challenging reasoning problem, and that there was no way it would be able to solve IMO problems. It's remarkable how quickly I've been proven wrong.


Replies

ragequittah, today at 3:30 AM

It feels like half of the people I see talking about AI are still under the impression that it's a spicy autocomplete. If you use a SOTA model for a week and still feel this way, your bias must be very strong.
