Anthropic has indicated in the past that API gross margins are ~60%. This might have improved since then, though competition from OAI puts a ceiling on that.
Subscription inference can also cost the provider less than API inference if the provider wants it to. For example, providers can schedule subscription requests flexibly around API traffic, which lowers the cost of serving them and improves hardware utilization.
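To make the utilization point concrete, here is a rough back-of-the-envelope sketch. Every number in it is made up for illustration (hardware cost, throughput, utilization levels), and it assumes a fixed fleet whose cost doesn't change with load, ignoring power, latency targets and so on. It only shows the shape of the argument: filling idle capacity spreads the same hardware bill over more tokens.

```python
def cost_per_token(hourly_hw_cost: float, peak_tokens_per_hour: float,
                   utilization: float) -> float:
    """Hardware cost per served token at a given average utilization (0..1]."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return hourly_hw_cost / (peak_tokens_per_hour * utilization)


# All values below are hypothetical, not figures from any provider.
HOURLY_HW_COST = 100.0            # dollars per hour for the fleet
PEAK_TOKENS_PER_HOUR = 1_000_000  # tokens per hour at full load
API_ONLY_UTILIZATION = 0.5        # fleet sized for API peaks, half idle on average
BLENDED_UTILIZATION = 0.8         # subscription traffic scheduled into the idle gaps

api_only_cost = cost_per_token(HOURLY_HW_COST, PEAK_TOKENS_PER_HOUR,
                               API_ONLY_UTILIZATION)
blended_cost = cost_per_token(HOURLY_HW_COST, PEAK_TOKENS_PER_HOUR,
                              BLENDED_UTILIZATION)

print(f"API only: ${api_only_cost:.6f} per token")
print(f"Blended:  ${blended_cost:.6f} per token")
print(f"Blended cost is {blended_cost / api_only_cost:.1%} of API-only cost")
```

In this toy setup the fleet is already paid for to cover API peaks, so the extra subscription tokens add almost no hardware cost. That's the sense in which subscription inference can be cheaper to serve than the API price would suggest.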