The next generation of OSes should have constant video and audio recognition by an on-device LLM. This would provide valuable context for a lot of scenarios. Instead of the frequent copy-pasting we are used to, we could let agents access the context of our whole workflows across different apps.
But Google is a very ill-positioned candidate for such an OS. I would rather trust Apple and local-first, on-device models.
A next-generation OS should absolutely -not- have always-on surveillance like you describe.