
postalcoder yesterday at 3:50 PM (4 replies)

This Mistral release really reminds you of the gap between the frontier labs and everyone else.

Pre-agent, there wasn't always an obvious difference between models. Various models had their charms. Nowadays, I don't want to entertain anything less than the frontier models. The difference in capability is enormous and choosing anything less has a real cost in terms of productivity.

I've been a big fan of the smaller labs like Mistral and especially Cohere but it's been a while since I've been excited by a release by either company.

That said, I'm using mistral voxtral realtime daily – it's great.


Replies

deaux yesterday at 4:07 PM

Can't agree at all. The productivity gap between frontier and non-frontier models was much larger just 1 year ago, let alone 2 years ago.

onlyrealcuzzo yesterday at 4:05 PM

> Pre-agent, there wasn't always an obvious difference between models. Various models had their charms. Nowadays, I don't want to entertain anything less than the frontier models. The difference in capability is enormous and choosing anything less has a real cost in terms of productivity.

It's just apples to oranges.

There is no clear across-the-board winner between Gemini, ChatGPT, and Claude on non-agentic tasks, i.e. the simple chatbot interface.

But Claude Code is substantially better than Codex, which itself is notably better than Gemini-cli.

In this vein, it should not be surprising that Claude Code is way better than non-frontier models for agentic coding... It's substantially better than other frontier models at specialized agentic tasks.

locknitpicker yesterday at 4:06 PM

> Pre-agent, there wasn't always an obvious difference between models. Various models had their charms. Nowadays, I don't want to entertain anything less than the frontier models.

This is a very naive and misguided opinion. In most tasks, including complex coding tasks, you can hardly tell the difference between a frontier model and something like GPT4.1. You need to really focus on areas such as context window, tool calling, and specific aspects of reasoning steps to start noticing differences. To make matters worse, frontier models are taking a brute-force approach to results, which ends up making them far more expensive to run, both in terms of what shows up on your invoice and how much longer you have to wait to get any semblance of output.

And I won't even go into the topic of local models.

sayYayToLife yesterday at 6:25 PM

[dead]