Microsoft recently announced changes to Copilot because, apparently, it was losing money on inference.
They were losing money by giving absurdly generous agentic usage on expensive models to people with flat-rate subscriptions of $10 to $40.
They weren't selling inference.
They were charging a flat rate per query no matter how many tokens it consumed. People naturally got very good at writing prompts that used as many tokens as possible.
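
To make the arithmetic concrete, here is a rough sketch of why flat per-query pricing goes wrong once token counts vary. Every number in it (the per-query price, the cost per million tokens, the token counts) is a made-up assumption for illustration, not Microsoft's actual pricing or costs, and the function names are hypothetical.

```python
def margin_per_query(flat_price, tokens_used, cost_per_million_tokens):
    """Provider's profit on one flat-priced query (negative means a loss)."""
    inference_cost = tokens_used / 1_000_000 * cost_per_million_tokens
    return flat_price - inference_cost


def break_even_tokens(flat_price, cost_per_million_tokens):
    """Token count at which a flat-priced query stops being profitable."""
    return flat_price / cost_per_million_tokens * 1_000_000


# Hypothetical figures, chosen only to show the shape of the problem.
FLAT_PRICE = 0.04          # assumed revenue per query, in dollars
COST_PER_MILLION = 10.0    # assumed provider cost per million tokens, in dollars

short_chat = margin_per_query(FLAT_PRICE, 2_000, COST_PER_MILLION)
long_agent_run = margin_per_query(FLAT_PRICE, 500_000, COST_PER_MILLION)
threshold = break_even_tokens(FLAT_PRICE, COST_PER_MILLION)

print(f"short chat margin:     {short_chat:+.2f} USD")
print(f"long agent run margin: {long_agent_run:+.2f} USD")
print(f"break-even at about {threshold:,.0f} tokens per query")
```

Under these assumed numbers a short chat turns a small profit while a single long agentic run loses dollars, and the user pays the same either way. That is the incentive to pack as much work as possible into each query.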