Hacker News

chasd00 · today at 2:00 PM

From what I understand, the issue with inference is that it doesn't scale with user count the way traditional SaaS does. In typical SaaS, adding users requires very little additional capacity. With inference, though, supporting more users means adding much more capacity, since every request consumes GPU compute roughly in proportion to the tokens generated. I don't know if the relationship is strictly linear, but it certainly takes far more infrastructure to support additional LLM users than, say, additional users of a web application.
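A back-of-envelope sketch of the point above. All the rates and capacities here are illustrative assumptions, not measured figures; the shape of the result (flat for a web app, roughly linear for inference) is what matters, not the numbers.

```python
import math

def webapp_servers(users, req_rate=0.1, server_capacity=5000):
    # Assumed: each user issues ~0.1 cheap req/s; one server absorbs
    # ~5000 req/s of stateless CRUD traffic. Capacity grows very
    # slowly with user count.
    return max(1, math.ceil(users * req_rate / server_capacity))

def gpu_nodes(users, req_rate=0.05, tokens_per_req=500,
              tokens_per_s_per_gpu=2000):
    # Assumed: each user issues ~0.05 LLM req/s at ~500 generated
    # tokens each, and one GPU sustains ~2000 tokens/s. Every token
    # costs real compute, so required GPUs scale roughly linearly
    # with concurrent demand.
    demand = users * req_rate * tokens_per_req  # tokens/s needed
    return max(1, math.ceil(demand / tokens_per_s_per_gpu))

for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} users: {webapp_servers(n):>3} web servers, "
          f"{gpu_nodes(n):>5} GPU nodes")
```

Under these assumptions, going from 1k to 100k users barely moves the web-server count but multiplies the GPU count a hundredfold.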


Replies

seanw444 · today at 5:17 PM

And the existing infrastructure routinely struggles for several of the well-known players. You can literally tell when it's getting bogged down under load. And that's after all the absurdly large datacenters we've already built at significant expense (to both the corporations and the average person).