This is Apple's bet, among others.
Training small, purpose-specific models gives you a wide range of tasks you can run with high confidence on consumer hardware.
Or on a commodity EC2 instance with a relatively cheap inference sidecar.