I agree 99%.
The 1% where something else is better?
YouTube videos that show you how to access hidden fasteners on things you want to take apart.
Not that I can't get absolutely anything open, but sometimes it's nice to be able to do so with minimal damage.
I wonder if someday there will be a video codec that is essentially a standard distribution of a very precise and extremely fast text-to-video model (like SmartTurboDiffusion-2027 or something). Surely there are limits to text, but even the example you gave does not seem to me to be beyond the reach of a text description, given a certain level of precision and capability in the model. And we now have faster-than-real-time text-to-video.