We are basically already there, with HomeKit plus an open bridge that can make any device visible in HomeKit (like Nest cameras) and usable in automations. It works the other way too; it’s just a good way to get compatibility.
I would like to explore some open source solutions though. It would mean setting up a local system on a Mac mini for speech recognition and local processing. The bonus is you could use William Daniels (KITT) as your assistant voice.