Hacker News

jjcm · today at 3:15 PM · 6 replies

No amount of valuation can fix global supply issues for GPUs for inference unfortunately.

I suspect they're highly oversubscribed, which is why we're seeing them do other things to cut inference costs (e.g. changing the default thinking length).


Replies

natpalmer1776 · today at 3:17 PM

Remember when OpenAI wasn’t allowing new subscriptions to their ChatGPT pro plans because they were oversubscribed? Pepperidge Farms remembers.

scratchyone · today at 3:33 PM

Maybe, but the concerning part is that their response to GPU shortages is increased error rates. They could implement queuing or delayed responses instead. It's been long enough that they've had plenty of time to build something like this, at least in their web UI, where they have full control. Instead it still just errors with no further information.
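The queuing idea above can be sketched as an admission controller that holds requests up to a capacity limit and, when full, returns a retry hint rather than an opaque error. This is a minimal illustrative sketch, not how any provider actually handles overload; the class, method names, and timing estimate are all hypothetical.

```python
import queue


class InferenceGateway:
    """Toy admission controller (hypothetical): queue requests up to a
    limit instead of failing immediately, and return a retry hint when
    the queue is full."""

    def __init__(self, max_queued=2, est_seconds_per_request=5):
        self.q = queue.Queue(maxsize=max_queued)
        self.est = est_seconds_per_request  # rough per-request latency guess

    def submit(self, request_id):
        try:
            self.q.put_nowait(request_id)
            return {"status": "queued", "position": self.q.qsize()}
        except queue.Full:
            # Instead of an opaque error, tell the client when to retry
            # (analogous to an HTTP 503 with a Retry-After header).
            return {"status": "busy",
                    "retry_after_seconds": self.q.qsize() * self.est}

    def drain_one(self):
        # A worker would call this when a GPU slot frees up.
        return self.q.get_nowait()


gw = InferenceGateway(max_queued=2)
print(gw.submit("r1"))  # queued, position 1
print(gw.submit("r2"))  # queued, position 2
print(gw.submit("r3"))  # busy, with a retry_after_seconds estimate
```

The point of the sketch is only the shape of the response: a client that receives "busy" plus an estimate can back off gracefully, whereas a bare error gives it nothing to act on.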

AlecSchueler · today at 6:13 PM

> thus the reason why we're seeing them do other things to cut down on inference cost (ie changing their default thinking length).

The dynamic thinking and response length is, funnily enough, the best upgrade I've experienced with the service in more than a year. I really appreciate that when I say or ask something simple, the answer now comes back as a single sentence without my having to toggle "concise" mode on and off manually.

zachncst · today at 3:57 PM

Sure but we don't need GPUs to log in.

sobellian · today at 3:30 PM

Their issues seem to extend well beyond inference into services like auth.

paulddraper · today at 5:54 PM

A. These aren’t rate limit errors from the API.

B. Everything is down, even auth.