Iterations are the missing link.
With ChatGPT, you can iteratively improve text (e.g., "make it shorter," "mention xyz"). For pictures (and video), however, this functionality is not yet available. If you could prompt iteratively (e.g., "generate a red car in the sunset," then "make it a muscle car," "place it on a hill," "show it from the side so the sun shines through the windshield"), with each prompt refining the previous result instead of starting over, the tools would become far more useful.