Hacker News

kingstnap · yesterday at 11:29 PM · 0 replies

You underestimate the amount of inference and very much overestimate what training involves.

Training is more or less the same as doing inference on an input token twice (a forward pass plus a backward pass). But because it's offline and predictable, it can be done fully batched with very high utilization, i.e. efficiently.
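
For scale, a common rule of thumb (an assumption here, not something the comment states) puts a forward pass at roughly 2N FLOPs per token for a dense N-parameter model and a full training step (forward plus backward) at roughly 6N. The comment's "twice" and this 3x figure are the same order of magnitude; either way, training on a token costs only a small constant multiple of serving one:

    # Rule-of-thumb per-token FLOP counts for a dense N-parameter transformer
    # (assumption: the standard ~2N forward / ~6N forward+backward approximation).
    def inference_flops_per_token(n_params: float) -> float:
        return 2 * n_params      # forward pass only

    def training_flops_per_token(n_params: float) -> float:
        return 6 * n_params      # forward + backward pass

    N = 70e9  # hypothetical 70B-parameter model, purely for illustration
    print(f"inference: {inference_flops_per_token(N):.1e} FLOPs/token")
    print(f"training:  {training_flops_per_token(N):.1e} FLOPs/token")
    print(f"ratio:     {training_flops_per_token(N) / inference_flops_per_token(N):.0f}x")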

Training is, as a guesstimate, maybe 100 trillion total tokens, but these guys apparently do inference at the scale of quadrillions of tokens per month.
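
A quick back-of-envelope sketch using those rough numbers (the token counts and the per-token multiplier are guesses taken from the comment, not measurements):

    # Back-of-envelope: how long does inference take to match total training compute?
    TRAINING_TOKENS = 100e12            # ~100 trillion training tokens (guesstimate)
    INFERENCE_TOKENS_PER_MONTH = 1e15   # ~1 quadrillion inference tokens per month
    TRAIN_COST_MULTIPLIER = 3.0         # training ~2-3x the per-token cost of inference

    training_compute = TRAINING_TOKENS * TRAIN_COST_MULTIPLIER   # in inference-token equivalents
    months_to_match = training_compute / INFERENCE_TOKENS_PER_MONTH
    print(f"Inference matches the whole training run in ~{months_to_match:.1f} months")
    # With these numbers: ~0.3 months, i.e. roughly nine days of serving
    # costs as much compute as the entire training run.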