Competition between models is so intense right now that they are definitely benchmaxxing pelican-on-a-bike SVGs and Will Smith spaghetti dinner videos.
There was Lenna for digital image compression (https://en.wikipedia.org/wiki/Lenna).
A pelican on a bike is SFW, inclusive, yet cool.
It is not a full benchmark - rather a litmus test.
So, again, when the indicator becomes a target, it stops being a good indicator.
You can just try other SVGs; I got some pretty good ones.
(*Disclaimer: I work for Google, but I also have zero idea what they trained Deep Think on.)
Note that, this benchmark aside, they've gotten really good at SVGs. I used to rely on the Noun Project for icons, and sometimes various libraries, but now coding agents just synthesize an SVG tag in the code and draw all the icons.
Parallel hypothesis: the intensity of competition between models is so intense that any high-engagement high-relevance web discussion about any LLM/AI generation is gonna hit the self-guided self-reinforced model training and result in de facto benchmaxxing.
Which is only to say: if we HN-front-page it, they will come (generate).