logoalt Hacker News

godelski01/22/20250 repliesview on HN

I take back a fair amount of what I said.

It is pretty good with some easier assets that I suspect there's lots of samples of (and we're comparing to other generative models, not to what humans make. Humans probably still win by a good margin). But when moving out of obvious assets that we could easily find, I'm not seeing good performance at all. Probably a lot can be done with heavy prompt engineering but that just makes things more complicated to evaluate.