This doesn’t solve the problem. The lethal trifecta as defined is not solvable and is misleading in terms of “just cut off a leg”. (Though firewalling is practically a decent bubble wrap solution).
But for truly sensitive work, you still have many non-obvious leaks.
Even in small requests the agent can encode secrets.
An AI agent that is misaligned will find leaks like this and many more.