Hacker News

KaiserPro, last Friday at 10:43 PM

Depending on what you're doing, it's taking up to 8 GPUs working in parallel to serve those queries.


Replies

YetAnotherNick, yesterday at 2:45 AM

Yes, but then the batch size is in the 100s or even 1000s. These GPUs don't serve just one user at a time.
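
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the 8 GPUs are shared evenly across every request in a batch, which is a simplification; the function name `gpu_share_per_request` is made up for illustration, and the batch sizes are just the round figures mentioned above.

```python
def gpu_share_per_request(num_gpus: int, batch_size: int) -> float:
    """Average number of GPUs attributable to one request in a batch.

    Assumes an even split across the batch, which real serving
    systems only approximate.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return num_gpus / batch_size


if __name__ == "__main__":
    # 8 GPUs, with batch sizes in the 100s or 1000s as in the comment above.
    for batch_size in (1, 100, 1000):
        share = gpu_share_per_request(8, batch_size)
        print(f"batch size {batch_size}: {share} GPUs per request")
```

Under that assumption, a batch of 100 works out to 0.08 of a GPU per request, and a batch of 1000 to 0.008, versus 8 whole GPUs if a single user had the machine to themselves.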