I haven't used Sora, but none of the GenAI tools I'm aware of could produce a competent comic book. When a human artist draws a character in a house in panel 1, they'll draw the same house in panel 2, not a procedurally generated different house for each image.
If a 60-year-old grizzled detective is introduced on page 1, a human artist will draw the same grizzled detective on pages 2, 3 and so on, not procedurally generate a new grizzled detective each time.
Btw there’s a way to match characters in a batch in the Forge WebUI which guarantees that all images in the batch have the same figure in them. It would be trivial to implement this in other image generators. This critique is baseless.
A human artist keeps state :). They keep it between drawing sessions, and more importantly, they keep very detailed state - their imagination or interpretation of what the thing (house, grizzled detective, etc.) is.
Most models people currently use don't keep state between invocations, and whatever interpretation they make from provided context (e.g. reference image, previous frame) is surface-level and doesn't translate well to output. This is akin to giving each panel in a comic to a different artist, and also telling them to sketch it out by their gut, without any deep analysis of prior work. It's a big limitation, alright, but researchers and practitioners are actively working to overcome it.
(Same applies to LLMs, too.)
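To make the "keeps state" point concrete, here's a toy Python sketch. It has nothing to do with how any real image model works internally, and every name in it (`draw_panel_stateless`, `StatefulArtist`, the attribute lists) is made up for illustration. The stateless function re-invents the character on every call; the stateful "artist" decides what the character looks like once and reuses that decision for every later panel.

```python
import random

# Hypothetical attribute pools, purely for illustration.
HAIR = ["grey", "white", "black", "bald"]
COAT = ["trench", "leather", "tweed"]


def _invent_character(subject, rng):
    """Make up a look for `subject` using the given random source."""
    return {
        "subject": subject,
        "hair": rng.choice(HAIR),
        "coat": rng.choice(COAT),
        "scar": rng.choice([True, False]),
    }


def draw_panel_stateless(subject, seed):
    """Stateless: each call starts from scratch, so the look depends
    only on the prompt and whatever seed this call happens to get."""
    return _invent_character(subject, random.Random(seed))


class StatefulArtist:
    """Stateful: remembers a 'character sheet' per subject between calls."""

    def __init__(self, seed):
        self._rng = random.Random(seed)
        self._sheets = {}  # subject -> character sheet decided earlier

    def draw_panel(self, subject):
        if subject not in self._sheets:
            # First appearance: decide the look once and remember it.
            self._sheets[subject] = _invent_character(subject, self._rng)
        # Later appearances: reuse the remembered look (return a copy).
        return dict(self._sheets[subject])


if __name__ == "__main__":
    # Different seed per panel: nothing ties the panels together.
    for page in range(1, 4):
        print("stateless", page, draw_panel_stateless("grizzled detective", seed=page))

    # One artist across all panels: the detective stays the same.
    artist = StatefulArtist(seed=0)
    for page in range(1, 4):
        print("stateful ", page, artist.draw_panel("grizzled detective"))
```

The stateless version can only be made consistent by pinning the seed or by passing the earlier result back in as context, which is roughly the reference-image and batch-matching approach mentioned upthread.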