throwaw12 • today at 4:47 PM • 4 replies

I feel like this time it is indeed in the training set, because it is too good to be true.

Can you run your other tests and see the difference?

simonw • today at 5:01 PM

It went pretty wild with "Generate an SVG of a NORTH VIRGINIA OPOSSUM ON AN E-SCOOTER":

amelius • today at 9:39 PM

If I were them I'd run such requests through a diffusion model, and then try to distill an SVG out of that.

sifar • today at 9:37 PM

I think at this point we can safely put the pelican test in the category of Goodhart's law.

m3kw9 • today at 5:32 PM

If they cook these in, I wonder what else was cooked in there to make it look good.
