bt1a • yesterday at 10:04 PM • 0 replies

This is most likely an inference-serving problem, a matter of capacity and latency, given that Opus X and the latest GPT models available in the API have always responded quickly and slowly, respectively.
