They are. It's just not at the consumer hardware level.
This misconception gets repeated time and time again; software support for their datacenter-grade hardware is just as bad. I've had the displeasure of using the MI50, the MI100 (a lot), and the MI210 (very briefly). All three are supposedly enterprise-grade compute hardware, and yet it was a pathetic experience: a myriad of disconnected components that had to be patched and married to a very specific kernel version to get ANY kind of LLM inference going.
The last time I bothered with any of it was nine months ago; enough is enough.
You could argue it's all the nice GPU debugging tools Nvidia provides that make GPU programming accessible.
There are so many potential bottlenecks (usually it's just memory access patterns, but without tools to verify that, you have to design and run manual experiments yourself).
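
For illustration, here's roughly what one of those manual experiments looks like: a minimal CUDA sketch (the kernel and sizes are made up, not from any real project) that times the same read at increasing strides. Without a profiler, the jump in the timings is the only evidence you get that memory access patterns are the bottleneck.

    #include <cstdio>
    #include <initializer_list>
    #include <cuda_runtime.h>

    // Each thread reads one float at a configurable stride and writes it out.
    // stride == 1 gives coalesced reads; larger strides scatter the reads
    // across cache lines, so the same amount of data takes longer to fetch.
    __global__ void strided_read(const float* in, float* out, int n, int stride) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        long idx = (long)i * stride % n;  // wrap to stay in bounds
        out[i] = in[idx];
    }

    // Time one kernel launch with CUDA events (millisecond resolution).
    static float time_kernel(const float* in, float* out, int n, int stride) {
        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);
        cudaEventRecord(start);
        strided_read<<<n / 256, 256>>>(in, out, n, stride);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);
        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        cudaEventDestroy(start);
        cudaEventDestroy(stop);
        return ms;
    }

    int main() {
        const int n = 1 << 24;  // 16M floats, ~64 MiB
        float *in, *out;
        cudaMalloc(&in, n * sizeof(float));
        cudaMalloc(&out, n * sizeof(float));
        cudaMemset(in, 0, n * sizeof(float));

        time_kernel(in, out, n, 1);  // warm-up launch, result discarded
        for (int stride : {1, 2, 4, 8, 16, 32})
            printf("stride %2d: %.3f ms\n", stride, time_kernel(in, out, n, stride));

        cudaFree(in);
        cudaFree(out);
        return 0;
    }

A profiler would just tell you the memory subsystem is saturated; without one, you infer it from how the timings scale with the stride.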