philipp-gayret • yesterday at 6:29 PM • 2 replies

If you have a Cerebras Code subscription you can experience it right now. Indeed, a very different experience.

Replies

Used them for a while! They didn't seem to have prompt caching, so I burnt through the daily 24M token limit really quickly when doing large-scale changes on a codebase (essentially a team's worth of menial migration/refactoring work). A lot of it was okay, but plenty had to be redone and I still spotted some issues months down the line. In part I blame their model catalogue, which did get an update to GLM 4.7 a while back but is definitely showing its age: https://inference-docs.cerebras.ai/models/overview

Quality-wise, Anthropic gives me the best results (Opus for almost everything; I have sub-agents with fresh context review its work, and after 2-10 loops that usually finds most issues). Token-amount-wise for agentic work, DeepSeek V4 is up there. What Cerebras is doing is pretty cool though, and apparently they even have prompt caching now like the other big providers: https://inference-docs.cerebras.ai/capabilities/prompt-cachi... At the same time, producing bad code faster was annoying in a uniquely new way.

Wish they'd update the models with their subscription, it could genuinely be great with the proper harness. Like if they can run GLM 4.7, surely they could at least get DeepSeek V4 Flash with a big context window going as a starting point. How can you have so much money to make your own chips, but can't run modern models that you can get for free? It's like they don't want people to use their subscription.

dkersten • yesterday at 6:35 PM

It’s GLM 4.7, GPT OSS 120B, or Llama 3.1 8B, so not exactly the latest or best models.

But GLM is good enough for many small tasks, certainly enough to get a taste for Cerebras’ high speeds!

[edit: actually those are just their general models; I can’t see what Cerebras Code offers. It was Qwen-coder when it launched but I don’t know what it is now. I think GLM 4.7, but I’m not completely sure]
