Hacker News

simonw today at 4:43 PM

Pretty great pelican: https://simonwillison.net/2026/Feb/19/gemini-31-pro/ - it took over 5 minutes, though I think that's because they're having performance teething problems on launch day.


embedding-shape today at 4:49 PM

It's an excellent demonstration of the main issue I have with the Gemini family of models: they always go "above and beyond" and do a lot of extra stuff, even if I explicitly prompt against it. In this case, the SVG ends up containing not just a bike and a pelican, but also clouds, a sun, a hat on the pelican and so much more.

Exactly the same thing happens when you code: it's almost impossible to get Gemini to not do "helpful" drive-by refactors, and it keeps adding code comments no matter what I say. Very frustrating experience overall.

jasonjmcghee today at 6:54 PM

What's crazy is you've influenced them to spend real effort ensuring their model is good at generating animated SVGs of animals operating vehicles.

The most absurd benchmaxxing.

https://x.com/jeffdean/status/2024525132266688757?s=46&t=ZjF...

MrCheeze today at 5:45 PM

Does anyone understand why LLMs have gotten so good at this? Their ability to generate accurate SVG shapes seems to greatly outshine what I would expect, given their mediocre spatial understanding in other contexts.

sam_1421 today at 5:02 PM

Models are soon going to start benchmaxxing generating SVGs of pelicans on bikes

brikym today at 9:27 PM

Another great benchmark would be to convert a raster image of a logo into SVG. I've yet to find a good tool for this that produces accurate smooth lines.

culi today at 8:48 PM

Cost per task has increased 4.2x, but their ARC-AGI-2 score went from 33.6% to 77.1%.

Cost per task is still significantly lower than Opus, even Opus 4.5.

https://arcprize.org/leaderboard
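
For anyone who wants to sanity-check that trade-off, here is a rough sketch that uses only the three figures quoted above (4.2x cost, 33.6% and 77.1%). The variable names are made up for illustration, and dividing the cost multiplier by the score multiplier is just one possible way to compare them, not a metric the leaderboard itself reports.

```python
# Figures quoted in the comment above; nothing else is assumed.
cost_multiplier = 4.2   # reported increase in cost per task
old_score = 33.6        # previous ARC-AGI-2 score, in percent
new_score = 77.1        # new ARC-AGI-2 score, in percent

# How much the score itself grew, as a ratio and in percentage points.
score_multiplier = new_score / old_score
point_gain = new_score - old_score

# Hypothetical comparison: cost growth relative to score growth.
# A value above 1 means cost grew faster than the score did.
cost_per_score_ratio = cost_multiplier / score_multiplier

print(f"score multiplier: {score_multiplier:.2f}x")
print(f"gain in percentage points: {point_gain:.1f}")
print(f"cost growth / score growth: {cost_per_score_ratio:.2f}")
```

So the score grew by a bit more than 2x while the cost grew by 4.2x. Whether that is a good deal depends on how much you value each additional point on the benchmark.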

SoKamil today at 5:45 PM

It seems they trained the model to output good SVGs.

In their blog post[1], the first use case they mention is SVG generation. So this test might not be a meaningful indicator at all anymore.

[1] https://blog.google/innovation-and-ai/models-and-research/ge...

Arcuru today at 4:50 PM

Did you stop using the more detailed prompt? I think you described it here: https://simonwillison.net/2025/Nov/18/gemini-3/

WarmWash today at 5:09 PM

Less pretty but more practical: it's really good at outputting circuit designs as SVG schematics.

https://www.svgviewer.dev/s/dEdbH8Sw

AmazingTurtle today at 5:47 PM

At this point, the pelican benchmark has become so widely used that there must be high-quality pelicans in the dataset, I presume. What about generating an okapi on a bicycle instead?

steve_adams_86 today at 5:01 PM

Ugh, the gears and chain don't mesh and there's no sprocket on the rear hub

But seriously, I can't believe LLMs are able to one-shot a pelican on a bicycle this well. I wouldn't have guessed this was going to emerge as a capability from LLMs 6 years ago. I see why it does now, but... It still amazes me that they're so good at some things.

bredren today at 4:46 PM

What is that, a snack in the basket?

tarr11 today at 6:16 PM

What do you think this particular prompt is evaluating for?

The more popular these particular evals are, the more likely the model will be trained for them.

TZubiri today at 8:05 PM

You think they are able to see their output and iterate on it? Or is it pure token generation?

infthi today at 5:11 PM

Wonder when we will get something other than a side view.

calny today at 4:49 PM

Great pelican but what’s up with that fish in the basket?

mohsen1 today at 5:01 PM

Is there something in your prompt about hats? Why is the pelican always wearing a hat recently?!

xnx today at 4:49 PM

Not even animated? This is 2026.

DonHopkins today at 6:47 PM

How about STL files for 3D-printing pelicans?

benatkin today at 5:51 PM

I used the AI Studio link and tried running it with the temperature set to 1.75: https://jsbin.com/locodaqovu/edit?html,output

saberience today at 5:05 PM

I hope we keep beating this dead horse some more, I'm still not tired of it.