Next two years, probably. But at some point we will either hit scales where you really don't need anything better (let's say cloud is 10,000 tokens/s and local is 5,000 tokens/s; that makes no difference for most individual users), or we will hit some wall where AI doesn't get smarter but the cost of hardware continues to fall.