Depending on what you're doing, it's taking up to 8 GPUs working in parallel to serve those queries.
Yes, but then the batch size is in the hundreds or even thousands. These GPUs don't serve just one user at a time.
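To put rough numbers on that, here is a back-of-the-envelope sketch using only the figures mentioned above (8 GPUs, batches of 100 or 1000). The helper name `gpus_per_request` is made up for illustration, and the even split across requests is a simplifying assumption; real serving cost per request depends on things like sequence length and scheduling.

```python
def gpus_per_request(num_gpus: int, batch_size: int) -> float:
    """Average GPU share per request, assuming the batch splits the GPUs evenly."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return num_gpus / batch_size


# 8 GPUs is the figure from the comment above; batch sizes are the rough
# orders of magnitude mentioned in the reply (plus 1 for comparison).
for batch_size in (1, 100, 1000):
    share = gpus_per_request(8, batch_size)
    print(f"batch={batch_size:>4}: {share:.3f} GPUs per request")
```

So under that assumption, a single request in a large batch accounts for a small fraction of one GPU rather than all 8.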