One of the images in the blog (https://images.ctfassets.net/kftzwdyauwt9/4d5dizAOajLfAXkGZ7...) is a carbon copy of an image from an article posted Mar 27, 2026 with credits given to an individual: https://www.cornellsun.com/article/2026/03/cornell-accepts-5...
Was this an oversight? Or did their new image generation model generate an image that was essentially a copy of an existing image?
That has to be the wrong stock image included or something, bloody hell.
magick image-l.webp image-r.jpg -compose difference -composite -auto-level -threshold 30% diff.png
It's practically all dark except for a few spots. It's the same image, just at a different size/compression. I can't find it in any stock image search, though. Surely the model could not have memorized the whole image at that fidelity; maybe I just didn't search well enough.

Given the recency of that image, it is unlikely to be in the training data, so I would go with oversight.
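The magick pipeline above (difference composite, auto-level, threshold) can be sketched in Python if you'd rather inspect the numbers directly. This is a rough sketch using numpy; the function name and threshold are illustrative, and it works on raw pixel arrays rather than decoding webp/jpg files:

```python
import numpy as np

def diff_fraction(a: np.ndarray, b: np.ndarray, threshold: float = 0.3) -> float:
    """Fraction of pixels whose difference survives the threshold.

    Roughly mirrors: -compose difference -composite -auto-level -threshold 30%
    """
    # difference composite: per-pixel absolute difference
    d = np.abs(a.astype(np.float64) - b.astype(np.float64))
    # auto-level: stretch the difference image to the full [0, 1] range
    if d.max() > 0:
        d = (d - d.min()) / (d.max() - d.min())
    # threshold: count pixels brighter than 30% of full scale
    return float((d > threshold).mean())

# Usage: two nearly identical 8-bit grayscale "images",
# differing only in a small 4x4 patch
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
copy = img.copy()
copy[:4, :4] ^= 0xFF  # invert a small patch
print(diff_fraction(img, img))   # identical images
print(diff_fraction(img, copy))  # near-duplicate with a few bright spots
```

A near-zero fraction for two independently compressed copies of the same photo is what "practically all dark except for a few spots" looks like numerically.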
This is hilarious. Seems like kind of a random image for a model to memorize, but it could be.
There is plenty of empirical validation showing that image models retain near-verbatim copies of training images in their weights, despite what AI boosters claim. That said, it is usually images that appear in the training set many times, and it would be strange for this image to be one of them.
Regardless, great find.