Hacker News

Yokohiii, today at 3:15 AM

Those yottabytes of VRAM are also consuming electricity constantly.


Replies

fluoridation, today at 3:24 AM

The difference being that an LLM request is not an operating system. Since they're compartmentalized and ephemeral, you can very easily distribute requests among your available hardware so that you can switch off machines during periods of low activity.
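As a rough illustration of that consolidation argument, here is a minimal sketch that estimates how many machines need to stay powered on for a given request rate. The function names (`machines_needed`, `plan`), the 20% headroom, and the per-machine capacity figure are all assumptions made for the example, not numbers from any real deployment.

```python
import math

def machines_needed(requests_per_sec: float,
                    capacity_per_machine: float,
                    headroom: float = 0.2) -> int:
    """Smallest number of machines that covers the load plus spare headroom.

    Assumes requests are independent and can be routed to any machine.
    """
    if requests_per_sec <= 0:
        return 0  # no traffic: every machine can be switched off
    return math.ceil(requests_per_sec * (1 + headroom) / capacity_per_machine)

def plan(load_per_period, capacity_per_machine, fleet_size):
    """Machines to keep on in each period, capped at the size of the fleet."""
    return [
        min(fleet_size, machines_needed(load, capacity_per_machine))
        for load in load_per_period
    ]

# Hypothetical load in requests/sec for a quiet, normal, and peak period.
schedule = plan([10, 100, 1000], capacity_per_machine=50, fleet_size=10)
print(schedule)
```

With these made-up numbers, only a fraction of the fleet has to be on in the quiet period, which is the point: stateless requests let the operator pack load onto fewer machines and power down the rest.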
