>I know it doesn't make financial sense to self-host given how cheap OSS inference APIs are now
You can calculate the exact cost of home inference if you know your hardware, measure its power draw, and check that against your electricity bill.
I have no idea what cloud inference actually costs in aggregate, or whether it’s profitable or a VC-subsidized loss leader that will spike in price later.
That’s why I’m using cloud inference now to build out my local stack.
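For anyone who wants to run that home-inference calculation, here is a minimal sketch of the electricity side only. The function name and the example inputs (300 W draw, 30 tokens/s, $0.30 per kWh) are hypothetical placeholders, not measurements from this thread; plug in your own wall-meter reading and tariff.

```python
def electricity_cost_per_million_tokens(watts, tokens_per_second, price_per_kwh):
    """Electricity cost (in your currency) to generate one million tokens.

    Ignores hardware purchase price, idle draw, and cooling.
    """
    seconds_needed = 1_000_000 / tokens_per_second
    # watts * seconds = joules; 3.6 million joules = 1 kWh
    kwh_used = watts * seconds_needed / 3_600_000
    return kwh_used * price_per_kwh


# Hypothetical inputs: measure your own power draw and throughput.
cost = electricity_cost_per_million_tokens(
    watts=300, tokens_per_second=30, price_per_kwh=0.30
)
print(f"${cost:.2f} per million tokens")
```

That figure can then be set directly against whatever an API charges per million tokens.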
I'm not concerned about electricity cost: I have solar + battery with excess supply, most of which goes back to the grid for $0 compensation (AU special).
But I did the napkin math on M3 Ultra ROI when DeepSeek V3 launched: at $0.70/2M tokens and 30 tps, a $10K M3 Ultra would take ~30 years of non-stop inference to break even - without even factoring in electricity. You clearly don't self-host to save money. You do it to own your intelligence, keep your privacy, and not be reliant on a persistent internet connection.
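The napkin math above can be reproduced roughly like this. It is a sketch that assumes the machine generates tokens around the clock at a constant rate and that the API price never changes; the function name is made up for illustration, and electricity is left out, as in the original estimate.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def break_even_years(hardware_cost, price_per_million_tokens, tokens_per_second):
    """Years of non-stop local inference before the hardware cost
    equals what the same tokens would have cost via an API."""
    tokens_per_year = tokens_per_second * SECONDS_PER_YEAR
    api_cost_per_year = tokens_per_year / 1_000_000 * price_per_million_tokens
    return hardware_cost / api_cost_per_year


# Figures from the comment: $10K machine, $0.70 per 2M tokens, 30 tokens/s.
years = break_even_years(
    hardware_cost=10_000,
    price_per_million_tokens=0.70 / 2,
    tokens_per_second=30,
)
print(f"{years:.1f} years")
```

At 30 tps the machine produces about 946M tokens a year, which is only around $330 of API spend at that price, hence the roughly 30-year figure.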