Yes, in the short term this is right. But at some point PyTorch will have a model.toVHDL() method, and we'll have a PCBWay-style website for taping out the circuit. Nvidia's future looks less bright than they think, and their GPU market will certainly pop.
I can't imagine that model lifetimes will justify model-specific ASICs over more generic GPUs/NPUs for public serving until after the AI bubble pops (the exception might be something like serving fixed, certified AI models in a vehicle or robot).
Doesn't that assume that VHDL is trivial? I feel like there are tons of performance tradeoffs, or hardware designers wouldn't have jobs.