> main issue seemed to be delay from what it saw with screenshots and api data and changing course.
This is where I think Taalas-style hardware AI may dominate in the future, especially for vehicle/plane autopilot, even if it can't update its weights. For that kind of use, determinism is actually a good thing.
This is a limitation of LLM I/O, which has historically been a bit slow because of the sequential, turn-based user/assistant chat formats these models are still trained on. But in principle nothing stops you from streaming real-time, full-duplex input and output through a transformer architecture. It will just get slower as you scale to billions or even trillions of parameters, to the point where running it in the cloud might give faster end-to-end actions than running it locally. What I could imagine is a small local model handling everyday tasks and a big remote model stepping in for messy situations where a remote human would otherwise have to take over.
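
To make the local/remote split concrete, here is a rough sketch of how such a router might decide. Everything in it is hypothetical: the `Model` class, the `route` function, the confidence floor, the deadline, and the latency numbers are placeholders for illustration, not measurements of any real system.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    compute_ms: float      # assumed time to produce one action
    network_rtt_ms: float  # assumed round trip; 0 for an on-device model

    def end_to_end_ms(self) -> float:
        # What matters for control is the total delay, not compute alone.
        return self.compute_ms + self.network_rtt_ms


def route(local_confidence: float, local: Model, remote: Model,
          confidence_floor: float = 0.8, deadline_ms: float = 200.0) -> str:
    """Return the name of the model that should handle this step."""
    # Everyday case: the small local model is confident enough.
    if local_confidence >= confidence_floor:
        return local.name
    # Messy case: escalate, but only if the remote answer arrives in time.
    if remote.end_to_end_ms() <= deadline_ms:
        return remote.name
    # Remote is too slow for this deadline, so stay local.
    return local.name


# Made-up numbers, purely illustrative.
local = Model("local-small", compute_ms=30, network_rtt_ms=0)
remote = Model("remote-big", compute_ms=60, network_rtt_ms=80)

print(route(0.95, local, remote))  # confident: stays local
print(route(0.40, local, remote))  # unsure: escalates to remote
```

The deadline check is the part that ties back to the delay problem: escalating only helps if compute plus network round trip still fits inside the time you have to act.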