There was an article on HN a few weeks ago where someone detailed how they managed to get an old datacenter GPU to run in their consumer PC, getting decent performance with Qwen. They spent something like $200 on the GPU (secondhand, of course).
So yeah, I think running models on local hardware will be quite common soon among the tech-savvy (such as people creating software).
Especially considering the millions of 2026-class datacenter GPUs that massively overinvested companies are currently buying, which will be obsolete in a few years.