inference is often a static, bounded problem solvable by generic compilers. training requires the mature ecosystem and numerical stability of cuda to handle mixed-precision operations. unless you rewrite the software from the ground up like Google but for most companies it's cheaper and faster to buy NVIDIA hardware
> static, bounded problem
What does it even mean in neural net context?
> numerical stability
also nice to expand a bit.