logoalt Hacker News

refulgentistoday at 2:41 AM7 repliesview on HN

I'm very worried for both.

Cerebras requires a $3K/year membership to use APIs.

Groq's been dead for about 6 months, even pre-acquisition.

I hope Inception is going well, it's the only real democratic target at this. Gemini 2.5 Flash Lite was promising but it never really went anywhere, even by the standards of a Google preview


Replies

Leynostoday at 9:06 AM

Cerebras are on OpenRouter.

nltoday at 2:59 AM

Taalas is interesting. 16,000 TPS for Llama on a chip.

https://taalas.com/

show 3 replies
freeqaztoday at 2:51 AM

You can call Cerebras APIs via OpenRouter if you specify them as the provider in your request fyi. It's a bit pricier but it exists!

show 1 reply
ainchtoday at 3:09 AM

I don't think it's a good comparison given Inception work on software and Cerebras/Groq work on hardware. If Inception demonstrate that diffusion LLMs work well at scale (at a reasonable price) then we can probably expect all the other frontier labs to copy them quickly, similarly to OpenAI's reasoning models.

show 1 reply
estsauvertoday at 3:32 AM

I am currently using their APIs on a paygo plan, I think it might just be a capacity issue for new sign ups.

7thpowertoday at 3:03 AM

What do you mean by Grow is dead since about 6 months ago? Not refuting your point, but I’m curious.

show 1 reply
behnamohtoday at 5:06 AM

Once again, it's a tech that Google created but never turned into a product. AFAIK in their demo last year, Google showed a special version of Gemini that used diffusion. They were so excited about it (on the stage) and I thought that's what they'd use in Google search and Gmail.