When reading HN I get the impression that a lot of people are convinced monthly plans are very profitable for the companies, I don’t have any numbers but to me it always seemed like a bait and switch or ”bait and make you pay with your data too”.
I'll bite. I suspect that these plans aren't as intensely subsidized as people assume. I believe that API usage is probably also not subsidized at all. First, yes, subs are probably subsided, but I bet a significant % of users are profitable to serve, especially the "chat" users who don't use dev tools and have short context window conversations. Yes, I think the subs also exist as a driver to get lock-in and market share. Claude Code, for example, is very good and I stopped using their competition when they released their superior product.
That said, I assume that (1) their long-term goal is to create cheaper-to-serve models that fit within their pricing targets, and use the (temporarily) subsidized subscriptions to find the features and costs that best serve the market. Maybe even while capturing more margin on the API in comparison (eg keep API prices high while lowering cost to serve a token). I've largely stopped using Opus, and sometimes even chose to use Haiku, because the cheaper models are fast and usually serves my needs. It's very possible to work all-day and barely hit the usage limits with Haiku on the $20/mo option. Long term, that could be profitable outright.
And (2) subscriptions with lower SLOs than API calls have the potential to provide "infill" usage for high fixed-cost GPUs as an alternative to idling, similar to their batch APIs. I'd believe that overnight usage limits could/should be higher than during California work-hours. I assume most big providers have pre-paid fixed cost servers, so pumping more tokens through an otherwise idle GPU is "free". They can also do a lot more cost-optimization behind the scenes, such as prompt caching, to reduce the cost of tokens.
I'll bite. I suspect that these plans aren't as intensely subsidized as people assume. I believe that API usage is probably also not subsidized at all. First, yes, subs are probably subsided, but I bet a significant % of users are profitable to serve, especially the "chat" users who don't use dev tools and have short context window conversations. Yes, I think the subs also exist as a driver to get lock-in and market share. Claude Code, for example, is very good and I stopped using their competition when they released their superior product.
That said, I assume that (1) their long-term goal is to create cheaper-to-serve models that fit within their pricing targets, and use the (temporarily) subsidized subscriptions to find the features and costs that best serve the market. Maybe even while capturing more margin on the API in comparison (eg keep API prices high while lowering cost to serve a token). I've largely stopped using Opus, and sometimes even chose to use Haiku, because the cheaper models are fast and usually serves my needs. It's very possible to work all-day and barely hit the usage limits with Haiku on the $20/mo option. Long term, that could be profitable outright.
And (2) subscriptions with lower SLOs than API calls have the potential to provide "infill" usage for high fixed-cost GPUs as an alternative to idling, similar to their batch APIs. I'd believe that overnight usage limits could/should be higher than during California work-hours. I assume most big providers have pre-paid fixed cost servers, so pumping more tokens through an otherwise idle GPU is "free". They can also do a lot more cost-optimization behind the scenes, such as prompt caching, to reduce the cost of tokens.