vunderba • yesterday at 8:55 PM

In fact, one of the tests I use as part of GenAI Showdown involves both parts of the puzzle: draw a maze with a clearly defined entrance and exit, along with a dashed line indicating the solution to the maze.

Only one model (gpt-image-1) out of the 18 tested managed to pass the test successfully. Gemini 3.0 Pro got VERY close.

https://genai-showdown.specr.net/#the-labyrinth

Replies

danielvaughn • yesterday at 9:01 PM

super cool! Interesting note about Seedream 4 - do you think awareness of A* actually could improve the outcome? Like I said, I'm no AI expert, so my intuitions are pretty bad, but I'd suspect that image analysis + algorithmic pathfinding don't have much crossover in terms of training capabilities. But I could be wrong!
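
For readers who haven't seen A* before, here is a minimal sketch of the pathfinding half of the question. It assumes the maze has already been parsed into a text grid, which is exactly the step an image model would have to do from pixels. The grid, the `#` wall marker, and the `astar` function name are made up for illustration and are not taken from GenAI Showdown or any model's internals.

```python
import heapq

def astar(grid, start, goal):
    """Return the list of (row, col) cells from start to goal, or None if unreachable.

    Assumes a rectangular grid of strings where '#' is a wall (hypothetical format).
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: never overestimates with 4-directional unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    came_from = {}
    best_g = {start: 0}

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk the parent links back to the start, then reverse.
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if g > best_g[cell]:
            continue  # stale heap entry; a cheaper route to this cell was found later
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    came_from[(nr, nc)] = cell
                    heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return None

# Tiny example maze: S = entrance at (0, 0), E = exit at (2, 3).
MAZE = [
    "S.#.",
    ".##.",
    "...E",
]
path = astar(MAZE, (0, 0), (2, 3))
```

The returned cells are what a dashed solution line would trace. Note that this only covers the search step on a clean grid; it says nothing about whether an image model can recover that grid, or draw the line, reliably.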
