And with the pipeline:
Voice -> free text -> LLM -> standardized JSON -> call API to do stuff.
The only “hard” part in 2025 is the LLM. Everything after that is what I call a “10000 monkeys problem”. Just throw some developers at it.
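To make the "everything after the LLM" part concrete, here is a rough sketch of the last two stages: check the model's JSON, then route it to an API handler. Everything named here is made up for illustration (the `ACTIONS` table, `parse_command`, `dispatch`, the `set_timer` action, and `fake_llm`, which just returns canned JSON instead of calling a real model), so treat it as one possible shape, not how any particular product does it.

```python
import json
from dataclasses import dataclass
from typing import Callable

# Hypothetical "standardized JSON" contract: action name -> required params.
ACTIONS = {
    "set_timer": {"minutes"},
    "play_music": {"artist"},
}


@dataclass
class Command:
    action: str
    params: dict


def parse_command(llm_output: str) -> Command:
    """Validate the LLM's JSON text against the made-up ACTIONS table."""
    data = json.loads(llm_output)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    action = data.get("action")
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    params = data.get("params", {})
    if not isinstance(params, dict):
        raise ValueError("params must be a JSON object")
    missing = ACTIONS[action] - set(params)
    if missing:
        raise ValueError(f"missing params for {action}: {sorted(missing)}")
    return Command(action, params)


def dispatch(cmd: Command, handlers: dict[str, Callable[[dict], str]]) -> str:
    """Route a validated command to whatever function wraps the real API."""
    return handlers[cmd.action](cmd.params)


def fake_llm(text: str) -> str:
    # Stand-in for the real model call; always returns the same canned JSON.
    return '{"action": "set_timer", "params": {"minutes": 10}}'


# In a real system each handler would call an actual API; here it returns a string.
handlers = {"set_timer": lambda p: f"timer set for {p['minutes']} min"}

result = dispatch(parse_command(fake_llm("set a timer for ten minutes")), handlers)
print(result)
```

The validation step is the only place that has to distrust the model; once a `Command` exists, the rest is ordinary plumbing.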