pajtai • today at 4:55 AM • 4 replies

It cannot create an image of a pogo stick.

I was trying to get it to create an image of a tiger jumping on a pogo stick, which is way beyond its capabilities, but it cannot create an image of a pogo stick in isolation.

Replies

vunderba • today at 6:02 AM

It's a tough test for local models (gpt-image and NB had zero problems); the only one that came reasonably close was Qwen-Image.

Z-Image / Flux 2 / Hidream / Omnigen2 / Qwen Samples:

https://imgur.com/a/tB6YUSu

This is where smaller models are just going to be more constrained and will require additional prompting to coax out the physical description of a "pogo stick". I had similar issues when generating Alexander the Great leading a charge on a hippity-hop / space hopper.

nomel • today at 7:13 AM

When given an image of an empty wine glass, it can't fill it to the brim with wine. The pogo stick drawers and wine glass fillers can enjoy their job security for months to come!

mhl47 • today at 6:27 AM

You are right. I just tried, and even with reference images it can't do it for me. Maybe with some good prompting.

In theory, I would say that knowledge does not have to be baked into the model; it could be added using reference images if the model is capable enough to reason about them.

CamperBob2 • today at 5:09 AM

Those are both good benchmark prompts. Z-Image Turbo doesn't like them either:

Tiger on pogo stick: https://i.imgur.com/lnGfbjy.jpeg

Dunno what this is, but it's not a pogo stick: https://i.imgur.com/OmMiLzQ.jpeg

Nano Banana Pro FTW: https://i.imgur.com/6B7VBR9.jpeg
