> What will save it is that, no matter how picky you are as a creator, your audience will never know what exactly was that you dreamed up, so any half-decent approximation will work.
Part of the problem is the "half decent approximations" tend towards a clichéd average, the audience won't know that the cool cyberpunk cityscape you generated isn't exactly what you had in mind, but they will know that it looks like every other AI generated cyberpunk cityscape and mentally file your creation in the slop folder.
I think the pursuit of fidelity has made the models less creative over time, they make fewer glaring mistakes like giving people six fingers but their output is ever more homogenized and interchangable.
a somewhat counterintuitive argument is this: AI models will make the overall creative landscape more diverse and interesting, ie, less "average"!
Imagine the space of ideas as a circle, with stuff in the middle being more easy to reach (the "cliched average"). Previously, traversing the circle was incredibly hard - we had to use tools like DeviantArt, Instragram, etc to agglomerate the diverse tastes of artists, hoping to find or create the style we're looking for. Creating the same art style is hiring the artist. As a result, on average, what you see is the result of huge amounts of human curation, effort, and branding teams.
Now reduce the effort 1000x, and all of a sudden, it's incredibly easy to reach the edge of the circle (or closer to it). Sure, we might still miss some things at the very outer edge, but it's equivalent to building roads. Motorists appear, people with no time to sit down and spend 10000 hours to learn and master a particular style can simply remix art and create things wildly beyond their manual capabilities. As a result, the amount of content in the infosphere skyrockets, the tastemaking velocity accelerates, and you end up with a more interesting infosphere than you're used to.
> I think the pursuit of fidelity has made the models less creative over time, they make fewer glaring mistakes like giving people six fingers but their output is ever more homogenized and interchangeable.
That may be true of any one model (though I don’t think it really is, either, I think newer image gen models are individually capable of a much wider array of styles than earlier models), but it is pretty clearly not true of the whole range of available models, even if you look at a single model “family” like “SDXL derivatives”.
> I think the pursuit of fidelity has made the models less creative over time (...) their output is ever more homogenized and interchangable.
Ironically, we're long past that point with human creators, at least when it comes to movies and games.
Take sci-fi movies, compare modern ones to the ones from the tail end of the 20th century. Year by year, VFX gets more and more detailed (and expensive) - more and better lights, finer details on every material, more stuff moving and emitting lights, etc. But all that effort arguably killed immersion and believability, by making scenes incomprehensible. There's way too much visual noise in action scenes in particular - bullets and lighting bolts zip around, and all that detail just blurs together. Contrast the 20th century productions - textures weren't as refined, but you could at least tell who's shooting who and when.
Or take video games, where all that graphics works makes everything look the same. Especially games that go for realistic style, they're all homogenous these days, and it's all cheap plastic.
(Seriously, what the fuck went wrong here? All that talk, and research, and work into "physically based rendering", yet in the end, all PBR materials end up looking like painted plastic. Raytracing seems to help a bit when it comes to liquids, but it still can't seem to make metals look like metals and not Fischer-Price toys repainted to gray.)
So I guess in this way, more precision just makes the audience give up entirely.
> they will know that it looks like every other AI generated cyberpunk cityscape and mentally file your creation in the slop folder.
The answer here is the same as with human-produced slop: don't. People are good at spotting patterns, so keep adding those low-order bits until it's no longer obvious you're doing the same thing everyone else is.
EDIT: Also, obligatory reminder that generative models don't give you average of training data with some noise mixed up; they sample from learned distribution. Law of large numbers apply, but it just means that to get more creative output, you need to bias the sampling.
Empirically, we've passed the point where that's true, for someone not being lazy about it.
https://www.astralcodexten.com/p/how-did-you-do-on-the-ai-ar...
In other words, someone willing to tweak the prompt and press the button enough times to say "yeah, that one, that's really good" is going to have a result which cannot in fact be reliably binned as AI-generated.