The value of these models is that you can run them on your own hardware.
A company can buy an NVIDIA B300 and serve its developers in-house with no per-token fees, limited only by the hardware's throughput.