This strikes me as a pretty weak rationalization for "safe" always-on assistants. Even if the model runs locally, there’s still a serious privacy issue: the unwitting bystanders whose every word gets recorded.
Friends at your house who value their privacy probably won’t feel great knowing you’ve potentially got a transcript of things they said just because they were in the room. Sure, it's still better than sending everything up to OpenAI, but that doesn’t make it harmless or less creepy.
Unless you’ve got super-reliable speaker diarization and can truly ensure only opted-in voices are processed, it’s hard to see how any always-listening setup can ever sit well with privacy-conscious guests.
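To make that concrete, here's a rough sketch (Python; `embed_voice` and `transcribe` are passed in as stand-ins for whatever speaker-embedding and STT models you'd actually use, and the threshold is made up) of what "only process opted-in voices" has to mean in practice. The whole privacy story rests on how reliable that one similarity check is.

```python
import numpy as np

# Illustrative only: gate transcription on a speaker-verification check so that
# audio from anyone who hasn't enrolled is dropped before it is ever transcribed.
OPT_IN_THRESHOLD = 0.75  # made-up cutoff; picking this reliably is the hard part

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def process_segment(audio, enrolled: dict[str, np.ndarray], embed_voice, transcribe):
    """Return (speaker, transcript) only if the voice matches an opted-in person."""
    voice_vec = embed_voice(audio)  # caller supplies the speaker-embedding model
    best_name, best_score = None, -1.0
    for name, ref_vec in enrolled.items():
        score = cosine_similarity(voice_vec, ref_vec)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < OPT_IN_THRESHOLD:
        return None  # unknown voice: discard the audio, store nothing
    return best_name, transcribe(audio)  # caller supplies the STT model
```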
We give an overview of the current memory architecture at https://juno-labs.com/blogs/building-memory-for-an-always-on...
This is something we call out under the "What we got wrong" section. We're currently collecting an audio dataset that should help us build a speech-to-text (STT) model that incorporates speaker identification, and that speaker tag will be woven into the core of the memory architecture.
> The shared household memory pool creates privacy situations we’re still working through. The current design has everyone in the family sharing the same memory corpus. Should a child be able to see a memory their parents created? Our current answer is that we deliberately chose to make memory extraction household-wide with no per-person scoping, because a kitchen device hears everyone equally. But “deliberately chose” doesn’t mean “solved.” We’re hoping our in-house STT will allow us to do per-person memory tagging, and then we can experiment with scoping memories to certain people or groups of people in the household.
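Purely illustrative and not our actual data model, but the direction we'd like to experiment with is roughly: tag each extracted memory with a speaker (once the STT can tell us) and a visibility set, then filter at read time.

```python
from dataclasses import dataclass, field

# Illustrative sketch of per-person memory scoping, not the real schema.
# Today every memory is effectively visible_to={"household"}; speaker tags
# from an in-house STT would let visibility be narrowed per memory.

@dataclass
class Memory:
    text: str                    # extracted memory, e.g. "soccer practice moved to Tuesdays"
    speaker: str | None          # who said it, once speaker ID exists; None today
    visible_to: set[str] = field(default_factory=lambda: {"household"})

def visible_memories(memories: list[Memory], requester: str) -> list[Memory]:
    """Return only the memories the requesting household member may see."""
    return [m for m in memories
            if "household" in m.visible_to or requester in m.visible_to]

# Example: a memory the parents scoped to themselves stays out of a child's results.
memories = [
    Memory("Milk is running low", speaker=None),
    Memory("Surprise party for Dana on Saturday", speaker="parent_1",
           visible_to={"parent_1", "parent_2"}),
]
print([m.text for m in visible_memories(memories, "dana")])  # only the milk memory
```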
I wonder if the answer is that it’s stored and processed in a way that a human can’t access or read: somehow encrypted and unreadable, but still tokenized so the model can process it. I don’t know how, but it feels possible.
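One weaker version of that idea I can imagine (not claiming anyone does this) is storing only vector embeddings instead of raw transcripts, so a person browsing the store sees numbers rather than sentences while similarity search still works. It's obfuscation rather than real encryption, since embeddings can be partially inverted, but roughly:

```python
import numpy as np

# Hedged sketch of "unreadable but processable": keep only vector embeddings
# on disk instead of raw transcripts. A human reading the store sees floats,
# not sentences, while nearest-neighbor recall still works.
# Caveat: embeddings are not encryption and can be partially inverted.
# `embed_text` stands in for whatever embedding model is used.

def remember(store: list[np.ndarray], text: str, embed_text) -> None:
    store.append(embed_text(text))  # raw text is discarded after embedding

def recall(store: list[np.ndarray], query: str, embed_text) -> int:
    """Return the index of the stored memory most similar to the query."""
    q = embed_text(query)
    scores = [float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
              for v in store]
    return int(np.argmax(scores))
```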