Compute costs are falling fast, and training is getting cheaper: GPT-2 now costs pocket change to train, and it costs pocket change to tune >1T-parameter models. If it were transparent what costs went into the weights, they could be commodified and stripped of bloat. Instead, the hidden cost is building infrastructure that has never been tested at scale by anyone other than the original developers, who shipped no documentation of where it fails. Unlike compute, this hidden cost doesn't commodify on its own.