Hacker News

eterm · yesterday at 1:31 AM

The killer feature of LLMs is their ability to extrapolate what's really wanted from a short description.

Look again at Gemini's output: it looks like an actual book cover, like an illustration you could actually find on a book.

It takes corrections on board (albeit hilariously literally).

Look at GPT Image's output: it doesn't look anything like a book cover, and when told it got it wrong, it just doubles down on what it was doing.


Replies

bongodongobob · yesterday at 8:30 AM

What you want, and what you think image generation is, is impossible.