Nvidia bought Groq, so they might be working on their own answer to low-latency serving. (I found this good explanation of how Groq compares to TPUs [1].)
[1] https://reddit.com/r/LocalLLaMA/comments/1pw8nfk/nvidia_acqu...