I don't think that's necessarily true; they aren't really capacity constrained in practice (they might be behind the scenes and adjust training on the fly, but that's speculation), so wasting tokens effectively helps utilize their (potentially idle) inference GPUs.