> Not sure why you jumped there.
No jump.
Your original claim: "Submit a whole script the way a writer delivers a movie to a director. The (automated) director/DP/editor could maintain internal visual coherence, while the script drives the story coherence."
Two comments later it's this: "We can already get detailed style guidance into picture generation. Declaring you want Picasso cubist, Warner Bros. cartoon, or hyper-realistic works today. So do lighting instructions, color palettes, on and on."
I just re-wrote this with respect to movies.
> I was thinking more like ‘make it look like Blade Runner if Kurosawa directed it, with a score like Zimmer.’
Because, as we all know, every single movie by Kurosawa is the same, as is every single score by Hans Zimmer, so it's ridiculously easy to recreate any movie in that style, with that music.
> You’re really failing to let go of the idea that you need to prescribe every little thing. Like Midjourney today, you’ll be able to give general guidance.
Yes, and Midjourney today really sucks at:
- being consistent
- creating proper consistent details
A general prompt will give you a general result that is usually very far from what you actually have in mind.
And yes, you will have to prescribe a lot of small things if you want your movie to be consistent. And for your movie to make any sense.
Again, tell me how exactly your amazing magical AI director will know which wardrobe to choose, which camera angles to set up, which typography to use, and which sound effects to make, just from the script you hand in?
You can start with a very simple scene I referenced in my original reply: two people talking at the table in Whiplash.
> But paint by numbers stuff like many movies already are? A Hallmark Channel weepy? I bet we will.
Even those movies have more detail and more care than you can get out of AIs (now, or in the foreseeable future).
> Again, tell me how exactly your amazing magical AI director will know which wardrobe to choose, which camera angles to set up, which typography to use, and which sound effects to make, just from the script you hand in?
I think you're still assuming I always want to choose those things. That's why we're talking past each other. A good movie-making model would choose for me unless I give explicit directions. Today we don't see long-range coherence in the results of movie (or game engine) models, but the range is increasing, and I'm willing to bet we will see movie-length coherence in the next decade or so.
By the way, I also bet that if I pasted the exact No Country for Old Men script scene description from further up this thread into Midjourney today, it would produce at least some compelling images with decent choices of wardrobe, lighting, set dressing, camera angle, exposure, etc. That's what these models do, because they're extrapolating and interpolating between the billion images they've seen that contained these human choices.
AFAIK Midjourney produces single images, so the relevant scope of consistency is within a single image only, not between images. A movie model needs coherence across ~160,000 images (roughly 110 minutes at 24 frames per second), which is beyond the state of the art today, but I don't see why it's impossible or unreasonable in the long run.
> A general prompt will give you a general result that is usually very far from what you actually have in mind.
Which is only a problem if I have something in mind. Alternatively I can give no guidance, or loose guidance, make half a dozen variations, pick the one I like best. Maybe iterate a couple of times into that variation tree. Just like the image generators do.