> Today's SOTA LLMs have pretty excellent following of these divisions
Unfortunately, "pretty excellent" is different from "perfect." I haven't kept track, but are you certain that, for every possible input, the user prompt will never override the system prompt?
That's a strong claim, and unless there's been an advancement in the tech, such a guarantee doesn't seem possible. Reinforcement learning might make an override much less likely, but that's different from impossible.