
xigoi • today at 6:30 AM • 4 replies

The standard objection: if the LLM is supposedly intelligent, why can’t it figure out on its own that this two-step process would achieve a better result?

Replies

jstanley • today at 6:43 AM

[flagged]


pyrolistical • today at 7:32 AM

You don’t know what you don’t know

nine_k • today at 6:32 AM

Nobody asked it to!


cubefox • today at 6:41 AM

Part of the problem is that the LLM isn't generating the image directly. Instead, it repeatedly prompts a separate diffusion model to make edits. The Gemini reasoning summary shows part of this. The style of some of the images also makes it clear that it uses an Imagen 4-derived diffusion model underneath.
