Hacker News

pajtai today at 4:55 AM, 4 replies

It cannot create an image of a pogo stick.

I was trying to get it to create an image of a tiger jumping on a pogo stick, which is way beyond its capabilities, but it cannot even create an image of a pogo stick in isolation.


Replies

vunderba today at 6:02 AM

It's a tough test for local models (gpt-image and NB had zero problems); the only one that came reasonably close was Qwen-Image.

Z-Image / Flux 2 / Hidream / Omnigen2 / Qwen Samples:

https://imgur.com/a/tB6YUSu

This is where smaller models are just going to be more constrained and will require additional prompting to coax out the physical description of a "pogo stick". I had similar issues when generating Alexander the Great leading a charge on a hippity-hop / space hopper.

nomel today at 7:13 AM

When given an image of an empty wine glass, it can't fill it to the brim with wine. The pogo stick drawers and wine glass fillers can enjoy their job security for months to come!


mhl47 today at 6:27 AM

You are right. I just tried it, and even with reference images it can't do it for me. Maybe it could with some good prompting.

In theory, I would say that knowledge does not have to be baked into the model; it could be added using reference images, if the model is capable enough to reason about them.

CamperBob2 today at 5:09 AM

Those are both good benchmark prompts. Z-Image Turbo doesn't like them either:

Tiger on pogo stick: https://i.imgur.com/lnGfbjy.jpeg

Dunno what this is, but it's not a pogo stick: https://i.imgur.com/OmMiLzQ.jpeg

Nano Banana Pro FTW: https://i.imgur.com/6B7VBR9.jpeg