Because when you pay for a subscription, there's no guarantee they won't silently quantize the model a few weeks after release, and then you can no longer get the full model at all.
Otherwise there's no need for full fp16: int8 works ~99% as well at half the memory, and the lower you go below that, the more quality you start to pay for the quantization. But int8 is super safe imo.
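To make the "half the mem, ~99% as good" claim concrete, here's a toy sketch of symmetric (absmax) int8 weight quantization in plain numpy. This is my own minimal example, not how any particular runtime does it; real stacks like llama.cpp or bitsandbytes quantize per block or per channel, but the memory arithmetic is the same: int8 is exactly half the bytes of fp16, and the round-trip error is tiny relative to the weights.

```python
import numpy as np

def quantize_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Map fp16 weights onto int8 with a single absmax scale (toy, per-tensor)."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Reconstruct approximate fp16 weights from the int8 copy."""
    return q.astype(np.float16) * scale

w = np.random.randn(4096, 4096).astype(np.float16)  # one fp16 weight matrix
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# fp16 is 2 bytes/weight, int8 is 1 byte/weight -> exactly 2x smaller
print(f"fp16: {w.nbytes / 2**20:.0f} MiB, int8: {q.nbytes / 2**20:.0f} MiB")
# mean abs error ~0.01 against weights with std ~1.0, i.e. ~1%
print(f"mean abs error: {np.abs(w - w_hat).mean():.5f}")
```

Going lower (4-bit and below) shrinks the representable grid much further, which is where the quality cost starts to show up; int8 keeps 256 levels per scale and stays close to lossless in practice.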