ilaksh • yesterday at 11:49 PM

The last time I tested, I could swear they were doing some problematic quantization: I was getting fairly random results with one or two models, and those same models worked perfectly when I switched providers.

It was really disappointing too, because Cerebras does not provide any service reliability guarantees on their cheap plans. So I came to the conclusion that unless I could convince the client to set up an enterprise contract or something, we could not use either provider for low-latency work, which we need for voice calls. I think for organizations that can afford a hefty contract that guarantees service levels, Groq and Cerebras especially are basically cheat codes for meeting latency requirements for voice. But that might not be an option for really small businesses, although maybe I am just not a good salesperson.
