> AMD’s software experience is riddled with bugs [...] AMD’s weaker-than-expected software Quality Assurance (QA) culture and its challenging out of the box experience.
This has anecdotally been true since forever. Back in the day, OpenCL implementations were passing conformance test but performance was poor. They could not turn hardware capabilities into performance for compute users. Drivers were buggy. Documentation was poor compared to NVidia's docs and forum. Offerings were inconsistent (look up Sycl from Codeplay) and ownership of what it is like to develop for AMD was unclear. The notion that it might not have improved or is only now improving is puzzling. It can't be for the lack of recognizing the problem. Intuitively it does not seem so difficult. I'm curious what the reasons are.
FWIW Back in 2015 OpenCL 2.0 performance was quite good on then-current AMD GPUs (IMO), but the problem was that 1. You had to implement everything yourself, from scratch, since AMD's GPU BLAS was barely compilable, and 2. They abandoned OpenCL that year, and switched to HIP (or whatever their copy of CUDA was called) which didn't even compile (in practice) for quite some time, and 3. Even with HIP, you were on your own when it comes for any BLAS and other standard library implementations because AMD provided nothing of the sorts for a long time.
All in all, it's not that the drivers performance was poor per se, but AMD did nothing about providing a software ecosystem, which amount to its hardware wasn't realistically usable unless your pockets were so big that you can do AMD's job and fund the re-development of the whole ecosystem from scratch.
In other words, it made MUCH better ROI to just use Nvidia, pay a little bit more for the hardware, and save millions on software :)
Even before ATI was acquired by AMD they had driver support problems.
When I was working for a Unix commercial graphics software company, the CTO told me how bad the information he received under ATI’s NDA was: different revisions of the same chipset had contradictory register settings, so the driver had to identify the revision before writing a value to the write-only configuration registers. The same chipset might need a 0 or a 1; writing the wrong values could crash the driver.