
schnebbau today at 4:52 PM (2 replies)

This has to be load-related. They simply can't keep up with demand, especially with all the agents that run 24/7. The only way to serve everyone is to dial down the power.


Replies

layer8 today at 4:55 PM

In TFA, the analysis shows that the customer is using more tokens than before, because CC has to iterate longer to get things right. So at least in the presented case, “dialing down the power” appears to have been counterproductive.

chasd00 today at 5:44 PM

Is it possible to dial down the "intelligence" to increase user capacity? AFAIK the neural net is either loaded and available or it isn't. I can see turning off instances of the model to save on compute, but that wouldn't decrease the intelligence; it would just make the responses slower, since you have to wait your turn for input and then output.