> AI/LLMs are very good at filling in all those little details so humans can cherry pick the parts that they like.
Where did you find AI/ML that is good at filling in the details that are actually required, and keeping them consistent?
I beg of you to watch Annie Atkins' presentation I linked: https://www.youtube.com/watch?v=SzGvEYSzHf4 and tell me how much intervention AI/ML would need to create all that and keep it consistent throughout the movie.
> once these models can generate coherent scenes, people can start using them to explore the creative space and figure out what they like.
Define "coherent scene" and "explore". A scene must be both coherent and consistent, and conform to the overall style of the movie and...
Even something as simple as shot/reverse shot requires about a million different details and can be shot in a million different ways. Here's an exploration of just shot/reverse shot: https://www.youtube.com/watch?v=5UE3jz_O_EM
All those are coherent scenes, but the coherence comes from a million decisions: from lighting, camera position, lens choice, wardrobe, what surrounds the characters, what's happening in the background, makeup... There's no coherence without all these choices made beforehand.
Around the 4:00 mark of that video: "Think about how well you know this woman just from her clothes, and workspace". Now watch that scene. And then read its description in the script https://imsdb.com/scripts/No-Country-for-Old-Men.html:
--- start quote ---
Chigurh enters. Old plywood paneling, gunmetal desk, litter
of papers. A window air-conditioner works hard.
A fifty-year-old woman with a cast-iron hairdo sits behind
the desk.
--- end quote ---
And right after that there's a section on the rhythm of editing. Another piece in the puzzle of coherence in a scene.
> Then once I've got enough good generated images collected out of the tons of garbage, I fine tune a model and create a workflow that more consistently gives me those styles.
So, literally what I wrote here: https://news.ycombinator.com/item?id=42375280 :)