Hacker News

rgbrgb yesterday at 10:24 PM, 3 replies

Notably it has 0 wins.


Replies

plaguuuuuu yesterday at 11:01 PM

Friendo, this is an anti-benchmark to figure out which AI is more likely to kill you.

If you point both at some GitHub issues you can gauge their relative ability to solve problems.

luipugs yesterday at 10:46 PM

"if you judge a fish by its ability to climb a tree" yada yada

bel8 yesterday at 10:38 PM

Not much less than GPT 5.4 with 2 wins or gemini-3.1-pro with 3 wins in 30 rounds.

Such is life in royal rumble games.