happyopossum, last Wednesday at 7:43 PM

Sure, but how often is an enterprise-deployed LLM application really cold-starting? While you could run this for one-off and personal use, it's probably geared more towards bursty ‘here’s an agent for my company’s sales reps’ kinds of workloads, so you can keep one instance warm, then autoscale up at 8:03am when everyone gets online (or into the office or whatever).

At that point, 19 seconds looks great, since lower-latency startup allows for much more efficient autoscaling.
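
To put rough numbers on that, here's a back-of-the-envelope sketch. Everything in it is an assumption for illustration except the 19 seconds: the function name `warm_headroom`, the ramp rate, the per-instance throughput, and the 120-second comparison point are all made up. The idea is just that while new instances boot, the already-warm ones have to absorb whatever load arrives, so the spare capacity you keep idle scales with startup time.

```python
import math


def warm_headroom(ramp_rps_per_s: float, startup_s: float, per_instance_rps: float) -> int:
    """Spare warm instances needed to absorb load growth while new ones boot.

    ramp_rps_per_s   -- how fast request rate climbs (requests/sec, per second); hypothetical
    startup_s        -- time for a new instance to become ready
    per_instance_rps -- sustained throughput of one instance; hypothetical
    """
    # Load that shows up during the startup window, before new capacity is ready.
    extra_load_rps = ramp_rps_per_s * startup_s
    # Round up: a fraction of an instance still means one more instance.
    return math.ceil(extra_load_rps / per_instance_rps)


if __name__ == "__main__":
    # Hypothetical morning ramp: +0.5 req/s every second, 2 req/s per instance.
    for startup_s in (19, 120):
        print(startup_s, warm_headroom(0.5, startup_s, 2.0))
```

With those made-up numbers, a 19-second startup needs 5 spare warm instances to ride out the ramp, versus 30 at a 120-second startup. The real ratio depends entirely on your traffic shape, but the headroom grows linearly with startup time either way.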