The bottleneck in common PC hardware is mostly memory bandwidth. Offloading the computation part to a different chip wouldn’t help if memory access is the bottleneck.
There have been a lot of boards and chips for years with dedicated compute hardware, but they’re only so useful for these LLM models that require huge memory bandwidth.
It is also to note that the bandwidth bus has seen very little upgrade over the years and even the onboard RAM on GPU card have seen mediocre upgrades. If everyone and their grandma wasn't using NVidia GPUs we would probably have seen a more competitive market and greater changes outside the chip itself.