Hacker News

tl | yesterday at 10:27 PM | 2 replies

Per BalatroBench, gemini-3-pro-preview makes it to round (not ante) 19.3 ± 6.8 on the lowest difficulty, on the deck aimed at new players. Round 24 is ante 8's final round. Also per BalatroBench, that result includes giving the LLM a strategy guide, which first-time players do not have. Gemini isn't even emitting legal moves 100% of the time.
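For anyone mixing up rounds and antes, here is a minimal sketch of the conversion. It assumes three rounds per ante (small blind, big blind, boss blind) with no blinds skipped, which matches round 24 being ante 8's final round. The function names are made up for illustration and are not part of BalatroBench.

```python
# Assumption: every ante has exactly 3 rounds (small, big, boss) and none are skipped.
ROUNDS_PER_ANTE = 3


def ante_of_round(round_number: int) -> int:
    """Return the ante that a 1-indexed round number falls in."""
    if round_number < 1:
        raise ValueError("round numbers start at 1")
    # Integer ceiling division: rounds 1-3 -> ante 1, rounds 4-6 -> ante 2, ...
    return (round_number + ROUNDS_PER_ANTE - 1) // ROUNDS_PER_ANTE


def is_final_round_of_ante(round_number: int) -> bool:
    """True when the round is the last (boss) round of its ante."""
    return round_number % ROUNDS_PER_ANTE == 0


if __name__ == "__main__":
    print(ante_of_round(24), is_final_round_of_ante(24))  # 8 True
    print(ante_of_round(19), is_final_round_of_ante(19))  # 7 False
```

Under that assumption, an average of about round 19 means a typical run ends somewhere in ante 7.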


Replies

raincole | today at 1:41 AM

It beats ante 8 in 9 out of 15 attempts. I do consider a 60% win rate very good for a first-time player.

The average is only 19.3 rounds because there is a bugged run where Gemini beats round 6 but the game bugs out when it attempts to sell Invisible Joker (a valid move)[0]. That being said, Gemini made a big mistake in round 6 that would have cost it the run at a higher difficulty.

[0]: given the existence of bugs like this, perhaps all the LLMs' performances are underestimated.