Google is using Nvidia GPUs. More than that, I'd expect Google to still be something like 90% on Nvidia GPUs. You can't really check, of course. Maybe I'm an idiot and it's 50%.
But you can see how that works: go to colab.research.google.com, type in some code ("!nvidia-smi", for instance), click the down arrow next to "Connect", and select "Change runtime type". 3 of the 5 GPU options are Nvidia GPUs.
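If you'd rather check programmatically than eyeball the dropdown, something like this works from a notebook cell (a rough sketch; it assumes the runtime has nvidia-smi on its PATH, which TPU and CPU runtimes won't):

    # Rough sketch: ask nvidia-smi what accelerator the Colab runtime gave you.
    # On TPU or CPU runtimes the binary doesn't exist and this falls through.
    import subprocess

    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        )
        print("Nvidia GPU:", out.stdout.strip())
    except (FileNotFoundError, subprocess.CalledProcessError):
        print("No Nvidia GPU visible on this runtime")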
Frankly, unless you rewrite your models you don't really have a choice but to use Nvidia GPUs, thanks, ironically, to Facebook (the authors of PyTorch). There is PyTorch/XLA automatic translation to TPU, but it doesn't work for "big" models. And as a point of advice: you want stuff to work on TPUs? Do what Googlers do: use JAX ( https://github.com/jax-ml/jax ). Oh, and look at the commit logs of that repository to get your mind blown, btw.
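To make the JAX point concrete, here's a minimal sketch. Note there's nothing TPU-specific in it, which is exactly the point: XLA compiles the same jitted function for whatever backend it finds, and jax.devices() just tells you which one that was.

    import jax

    @jax.jit
    def matmul(a, b):
        # Compiled by XLA for whatever device is present: CPU, GPU, or TPU.
        return a @ b

    key = jax.random.PRNGKey(0)
    a = jax.random.normal(key, (1024, 1024))
    b = jax.random.normal(key, (1024, 1024))

    print(jax.devices())       # e.g. [TpuDevice(...)] on a TPU runtime
    print(matmul(a, b).shape)  # (1024, 1024), no per-backend code changes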
In other words, Google rents out Nvidia GPUs to its cloud customers (with the hardware physically present in Google datacenters).
> Frankly, unless you rewrite your models you don't really have a choice but to use Nvidia GPUs, thanks, ironically, to Facebook (the authors of PyTorch). There is PyTorch/XLA automatic translation to TPU, but it doesn't work for "big" models. And as a point of advice: you want stuff to work on TPUs?
I don't understand what you mean. Most models aren't anywhere near big in terms of code complexity: once you have efficient primitives to build on (hardware-accelerated matmul, backprop, flash attention, etc.), these models are in sub-thousand-LoC territory, and you can even vibe-convert them from one environment to another.
It's kind of a shock to realize how simple the logic behind LLMs is.
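For a sense of scale: the centerpiece of a transformer block, causal scaled dot-product attention, fits in about a dozen lines of NumPy. Treat this as an illustrative sketch rather than anything production-grade; the real engineering lives in fused kernels like flash attention that compute the same thing fast.

    import numpy as np

    def causal_attention(q, k, v):
        # q, k, v: (seq_len, d) arrays for a single head
        d = q.shape[-1]
        scores = q @ k.T / np.sqrt(d)                    # (seq, seq) similarities
        mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)         # no attending to the future
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
        return weights @ v                               # (seq, d) weighted values

    x = np.random.randn(8, 16)
    print(causal_attention(x, x, x).shape)  # (8, 16)

The rest of an LLM is mostly token embeddings, a stack of these interleaved with MLPs, and a softmax over the vocabulary, which is why the sub-thousand-LoC figure holds.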
I still agree with you: Google is most likely still using Nvidia chips in addition to TPUs.