> even SOTA AIs of today are subhuman at highly agentic tasks and long-horizon tasks
This sounds like a lot of the work engineers do as well. We're not perfect at it (though execs aren't either), but the work you produce is expected to survive long term; that's why we spend time accounting for edge cases and so on.
Case in point: the popularity of Docker/containerization. "It works on my machine" is generally fine in the short term, since you can replicate the conditions of the local machine relatively easily. But doing that again and again becomes a problem, so we prepare for it (a long-horizon task) by using containers.