> AMD is so behind NVidia that it's not even funny.
Do you really want all AI hardware and software dominated by a monopoly? We're not looking to "beat" Nvidia; we are looking to offer a compelling alternative. MI300X is compelling. MI355X is even more compelling.
If there is another company out there making a compelling product, send them my way!
AMD's hardware might be compelling if it had good software support, but it doesn't. CUDA already breaks regularly when I try to use TensorFlow on NVIDIA hardware. Running a poorly implemented clone of CUDA, where even getting PyTorch working is a small miracle, is going to be a hard sell.
All AMD had to do was support open standards. They could have added OpenCL/SYCL/Vulkan Compute backends to TensorFlow and PyTorch and covered 80% of ML use cases. Instead of differentiating themselves with actual working software, they decided to become an inferior copy of NVIDIA.
I recently switched from TensorFlow to tinygrad for personal projects and haven't looked back. The performance is similar to TensorFlow with JIT [0]. The difference is that instead of spending five hours fixing things whenever NVIDIA's proprietary kernel modules update or I set up a new box, it actually Just Works after "pip install tinygrad".
0: https://cprimozic.net/notes/posts/machine-learning-benchmark...
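(For anyone who hasn't tried it: this is roughly what "Just Works" looks like in practice. A minimal sketch of my own, assuming a recent tinygrad release where Tensor is exported from the top-level package; tinygrad picks whatever backend it finds at runtime.)

    # minimal end-to-end check: one matmul + relu on random data
    from tinygrad import Tensor

    x = Tensor.randn(4, 8)   # fake input batch
    w = Tensor.randn(8, 2)   # fake weight matrix
    y = (x @ w).relu()       # lazily built graph; runs on whichever device tinygrad selected
    print(y.numpy())         # forces evaluation and copies the result back as a numpy array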
Time will tell, no? Transmeta shipped a lot of Crusoes. It was run by brilliant people. It was a "compelling alternative." Maybe Cerebras is the Transmeta of this race, I don't know. But. It's not about making an alternative. It most definitely is about "beating" NVIDIA. Otherwise, you are just shoveling dollars - from shareholders, from undercompensated employees at AMD and TSMC, etc. - to Meta, like everyone else.
People keep forgetting that CUDA is not only about AI: graphics matter as well, as do being a polyglot ecosystem, the IDE integration, the graphical debugging tools, the libraries, and having a memory model based on the C++ memory model. That last point is quite relevant, as NVidia employs a few key people from the C++ ecosystem who work on the ISO C++ standard (WG21).
It's not my job to reform the entire AI market.
I'm willing to try AMD, and I even built an AMD-based machine to experiment with AI workflows. So far it has failed miserably. I don't care that MI300X is compelling when I can't make the samples work both on my desktop and on a cloud-based MI300X. I don't care about their academic collaborations; I'm not in the business of producing papers.
I'll just pay for an H100 in the cloud to be sure that I can run the resulting models on my 3090 locally and/or deploy them to 4090 clusters.
If AMD shows some sense and commits to long-term support for their hardware, with reasonable feature parity across multiple generations, I'll reconsider them.
And AMD has a history of doing that! Their CPU division is _excellent_; they are renowned for long-term support of their motherboard sockets. I remember being able to buy a motherboard and then not worry about CPU upgrades for the next 3-4 years.