Hacker News

jjcm · today at 3:15 PM · 6 replies

No amount of valuation can fix global supply issues for GPUs for inference unfortunately.

I suspect they're highly oversubscribed, which is why we're seeing them do other things to cut inference costs (e.g. changing the default thinking length).


Replies

natpalmer1776 · today at 3:17 PM

Remember when OpenAI wasn’t allowing new subscriptions to their ChatGPT pro plans because they were oversubscribed? Pepperidge Farms remembers.

scratchyone · today at 3:33 PM

Maybe, but the concerning part is that their response to GPU shortages is increased error rates. They could implement queuing or delayed responses instead. It's been long enough that they've had plenty of time to build something like this, at least in their web UI, where they have full control. Instead it still just errors with no further information.
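The queuing idea above can be sketched as an admission controller that holds requests up to a capacity limit and, when full, returns a retry hint rather than an opaque error. This is a minimal illustrative sketch, not how any provider actually handles overload; the class, method names, and timing estimate are all hypothetical.

```python
import queue


class InferenceGateway:
    """Toy admission controller (hypothetical): queue requests up to a
    limit instead of failing immediately, and return a retry hint when
    the queue is full."""

    def __init__(self, max_queued=2, est_seconds_per_request=5):
        self.q = queue.Queue(maxsize=max_queued)
        self.est = est_seconds_per_request  # rough per-request latency guess

    def submit(self, request_id):
        try:
            self.q.put_nowait(request_id)
            return {"status": "queued", "position": self.q.qsize()}
        except queue.Full:
            # Instead of an opaque error, tell the client when to retry
            # (analogous to an HTTP 503 with a Retry-After header).
            return {"status": "busy",
                    "retry_after_seconds": self.q.qsize() * self.est}

    def drain_one(self):
        # A worker would call this when a GPU slot frees up.
        return self.q.get_nowait()


gw = InferenceGateway(max_queued=2)
print(gw.submit("r1"))  # queued, position 1
print(gw.submit("r2"))  # queued, position 2
print(gw.submit("r3"))  # busy, with a retry_after_seconds estimate
```

The point of the sketch is only the shape of the response: a client that receives "busy" plus an estimate can back off gracefully, whereas a bare error gives it nothing to act on.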

AlecSchueler · today at 6:13 PM

> thus the reason why we're seeing them do other things to cut down on inference cost (ie changing their default thinking length).

The dynamic thinking and response length is, funnily enough, the best upgrade I've experienced with the service in more than a year. I really appreciate that when I say or ask something simple, the answer now comes back as a single sentence without my having to toggle "concise" mode on and off manually.

zachncst · today at 3:57 PM

Sure but we don't need GPUs to log in.

sobellian · today at 3:30 PM

Their issues seem to extend well beyond inference into services like auth.

paulddraper · today at 5:54 PM

A. These aren’t rate limit errors from the API.

B. Everything is down, even auth.