Hacker News

toddmorey, today at 1:29 PM (5 replies)

Confident idiot: I’m exploring using an LLM for diagram creation.

I’ve found after about 3 prompts to edit an image with Gemini, it will respond randomly with an entirely new image. Another quirk is it will respond “here’s the image with those edits” with no edits made. It’s like a toaster that will catch on fire every eighth or ninth time.

I am not sure how to mitigate this behavior. I think maybe an LLM-as-a-judge step with vision could evaluate the output before passing it on to the poor user.
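
A minimal sketch of what such a judge step might look like, assuming two hypothetical callables that are not part of any real SDK: `edit_image` (the image model call) and `judge` (a vision model that answers whether the candidate is the original plus the requested edit). The hash comparison is an added cheap check for the "no edits made" case and only catches byte-identical output, not a re-encoded copy.

```python
import hashlib
from typing import Callable, Optional


def edit_with_checks(
    original: bytes,
    instruction: str,
    edit_image: Callable[[bytes, str], bytes],   # hypothetical image-model call
    judge: Callable[[bytes, bytes, str], bool],  # hypothetical vision-judge call
    max_attempts: int = 3,
) -> Optional[bytes]:
    """Return an edited image that passed both checks, or None if all attempts fail."""
    original_digest = hashlib.sha256(original).hexdigest()
    for _ in range(max_attempts):
        candidate = edit_image(original, instruction)
        # Deterministic check: identical bytes mean the model returned the input unchanged.
        if hashlib.sha256(candidate).hexdigest() == original_digest:
            continue
        # The judge sees both images, so it can also reject an entirely new image.
        if judge(original, candidate, instruction):
            return candidate
    return None
```

Returning `None` after a bounded number of attempts lets the caller show an explicit failure instead of a wrong image.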


Replies

codazoda, today at 6:26 PM

I had a similar result trying to create 16 similarly styled images. After half a dozen it just started kicking out the same image over and over again no matter what the prompt said. Even the “thinking” looked right, but the image was just a repeat. I don’t know if this is some type of context limitation or what.

I got around it by using a new prompt/context for each image. This required some rethinking about how to make them match. What I did was create a sprite sheet with the first prompt, and then for each image I only replaced (edited) the second prompt.

I still got some consistency problems because there were a few important details left out of my sprite sheet. Next time I think I’ll create those individually and then attach them as context for additional prompts.
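
The fresh-context-per-image approach described above can be sketched roughly as follows. Everything here is an assumption for illustration: `ImageRequest`, `build_requests`, and the `generate` callable are hypothetical names, not a real API. The point is only that every request carries the same style prompt and reference attachments, and that no conversation history is shared between calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class ImageRequest:
    style_prompt: str                  # shared first prompt (e.g. the sprite-sheet style)
    subject_prompt: str                # the only part that changes per image
    references: Tuple[bytes, ...] = () # shared reference images attached as context


def build_requests(
    style_prompt: str,
    subject_prompts: Sequence[str],
    references: Sequence[bytes] = (),
) -> List[ImageRequest]:
    """Build one self-contained request per image, all sharing the same style context."""
    refs = tuple(references)
    return [ImageRequest(style_prompt, subject, refs) for subject in subject_prompts]


def generate_all(
    requests: Sequence[ImageRequest],
    generate: Callable[[ImageRequest], bytes],  # hypothetical model call, one fresh context each
) -> List[bytes]:
    """Run each request independently so earlier outputs cannot leak into later ones."""
    return [generate(request) for request in requests]
```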

RationPhantoms, today at 3:22 PM

What are your thoughts on the diagram-as-code movement? I'd prefer to have an LLM use that, since it can at least drive some determinism through it, rather than deal with the slippery layer that is prompt control for visual LLMs.

codingdave, today at 4:52 PM

Have you considered that perhaps such things simply are not within its capabilities?

user34283, today at 3:49 PM

Yes, same here.

I don't know if it's a fault with the model or just a bug in the Gemini app.

dominotw, today at 4:36 PM

Same. I gave it a very well hand-drawn floor plan, but it never seems to be able to create a formal version of it. It's very, very simple too.

It makes hilarious mistakes, like putting the toilet right in the middle of the living room.

I don't get all the hype. Am I stupid?