
jrandolf, yesterday at 5:36 PM (3 replies)

We implement rate limiting and queuing to ensure fairness, but if a large number of people submit huge, long-running queries, there will be waits. The question is whether people will actually do this; more often than not, users will be idle.
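
To illustrate the rate-limiting and queuing idea above, here is a minimal sketch. It is not the implementation behind this service, which the comment does not describe. It assumes a per-user token budget per time window plus round-robin rotation between users; the names `FairScheduler`, `tokens_per_window`, `submit`, `new_window`, and `next_request` are hypothetical.

```python
from collections import OrderedDict, deque


class FairScheduler:
    """Round-robin across users, with a per-user token budget per window.

    Hypothetical sketch: the class name, method names, and the idea of a
    fixed budget window are assumptions, not taken from the thread.
    """

    def __init__(self, tokens_per_window):
        self.tokens_per_window = tokens_per_window
        self.queues = OrderedDict()  # user -> deque of (request_id, cost)
        self.spent = {}              # user -> tokens spent in this window

    def submit(self, user, request_id, cost):
        """Queue a request whose size is `cost` tokens."""
        self.queues.setdefault(user, deque()).append((request_id, cost))
        self.spent.setdefault(user, 0)

    def new_window(self):
        """Reset every user's budget (call this once per time window)."""
        for user in self.spent:
            self.spent[user] = 0

    def next_request(self):
        """Return (user, request_id) for the next runnable request, or None."""
        for user in list(self.queues):
            queue = self.queues[user]
            if not queue:
                continue
            request_id, cost = queue[0]
            if self.spent[user] + cost > self.tokens_per_window:
                # Over budget: this user waits for the next window.
                # A request larger than the whole budget would never run;
                # a real system would have to reject or split it.
                continue
            queue.popleft()
            self.spent[user] += cost
            self.queues.move_to_end(user)  # rotate so other users go next
            return user, request_id
        return None


scheduler = FairScheduler(tokens_per_window=100)
scheduler.submit("heavy", "h1", 60)
scheduler.submit("heavy", "h2", 60)
scheduler.submit("light", "l1", 10)

first = scheduler.next_request()   # heavy user's first request
second = scheduler.next_request()  # light user gets a turn next
third = scheduler.next_request()   # heavy user is over budget, so nothing runs
scheduler.new_window()
fourth = scheduler.next_request()  # budget reset, heavy user's second request
```

In this sketch the heavy user's second request waits until the budget resets, while the light user is served as soon as their turn comes up. That is the "there will be waits" trade-off in miniature.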


Replies

mogili1, yesterday at 6:19 PM

A rate limit is essentially a token limit.

freedomben, yesterday at 5:48 PM

Is there any way to buy into a pool of people with similar usage patterns? Maybe I'm overthinking it, but I'm just wondering.

petterroea, yesterday at 6:46 PM

To be fair, this is the price you pay for sharing a GPU. It's probably good for stuff that doesn't need to be done "now" but that you can just launch and run in the background. I bet some graphs showing when the GPU is busiest could be useful as well.