Hacker News

tonfa · today at 6:13 PM · 0 replies · view on HN

> but at a slower rate that can be sustained by inference revenue.

It's also possible that demand for inference (e.g. via the Jevons paradox) keeps growing to the point that training costs can be fully absorbed, since training is a one-off cost while inference revenue scales with usage.

(I suspect that's the thinking; I don't know if it will hold. It's also possible that no model will create a moat big enough to attract enough inference traffic to make it work.)

Depending on the chips/architecture used, idle off-peak inference capacity can also be repurposed for training, further subsidizing training costs.