alecco • today at 11:20 AM • 9 replies

In spite of their deeper pockets, massive datacenters, colossal amounts of user data, and hundreds of thousands of top developers, even Amazon, Meta, Microsoft, and Google are well behind.

I think Evans is completely wrong. There are only 2 truly frontier models. (at least for now). And Anthropic seems to be leaving OpenAI behind so there might be only 1 in the near future. (which is scary/dangerous)

Replies

ksec • today at 12:12 PM

>I think Evans is completely wrong.

I wish there were a case where I found Evans to be wrong. As far as my memory serves me, I have failed to record a single one.

I disagree that Amazon, Meta, Microsoft, and Google are "well" behind. If anything, the frontier model advantage seems to be at best 6-9 months. And the Chinese models are all doing well.

As one of Steve Jobs's lines goes, "It is a feature, not a product." Even if Apple were a generation or a year behind the frontier models, the advantage of being the default is enough to hold on to a lot of its users.

To put it simply, even if OpenAI or Anthropic were better, there is zero chance they would topple Apple in hardware sales, users, or ecosystem. On the other hand, even if Apple's AI were 6-9 months or a generation behind, most users would settle for it, and that would damage OpenAI / Anthropic.

hedora • today at 2:17 PM

Remember the implicit “pareto” in “frontier models”.

Anthropic and OpenAI are far behind state of the art for the entire curve except the “extremely expensive for barely measurable improvements” part.

GLM is probably the third most expensive frontier model (benchmarks and reviews will say for sure), and is apparently ~Opus 4.6 for 10% of the inference cost.

The last I checked, Qwen was still owning the 24-32GiB RAM range (it runs reasonably without a GPU!) and somewhere around 3.5-4 generation models.

Also, even Anthropic says Mythos ~= ChatGPT 5.5, so it’s unlikely either one is leaving the other behind. The big problem they both have is that they asked the government to gatekeep model releases and use cases, and their wish was granted.

That’s knocked them back 6 months already. Anthropic’s only frontier offering has been taken down.

tedggh • today at 12:16 PM

I use both Claude and Codex and don’t see any meaningful difference between the two. My use case is modeling semi-complex physical processes (energy and manufacturing) in code for simulations. I also have to do a fair amount of automation via scripting in Python or PowerShell for manipulating data, as well as legacy code analysis (C, Fortran, COBOL). Given that I provide the models with the information and documentation they need, both perform very similarly. I recently did a full codebase review (for design patterns and vulnerabilities) and both Codex and Fable agreed 100% about the most critical findings. I do very little front-end development, although some of my automation scripts have TUIs, and again I have no problem with either Claude or Codex generating them for me. At this point I go with the less expensive option, which seems to be Codex. With the $100 plan I rarely hit the limits. With Claude I max out my plan in about 4-6 hours of work.

awongh • today at 7:14 PM

That's true now, but long-term (maybe just a few years) it doesn't seem feasible for the status quo to continue from a financial point of view.

Spend for compute seems like it needs to increase to get the next iterations of models, and even if they IPO the money might run out before they can solidify their revenue streams.

All while Google just needs to survive long enough with their good-enough models and do it without really putting themselves in any existential financial risk.

And ideally the Chinese models are also still there keeping everyone honest.

The true dystopic worst case is a Google monopoly on cutting edge AI.

jimbokun • today at 4:05 PM

Is Google behind? The general opinions I read suggest Gemini is very competitive with Anthropic and OpenAI's top models.

wolttam • today at 2:24 PM

I think it's highly likely that there will remain one or two companies on the very bleeding edge of AI development for the foreseeable future.

But what I think a lot of people miss is that the market for the truly bleeding edge (developing bio-tech, building the most sophisticated software stacks (probably with a tilt towards simulation, GPU kernel optimization, etc)) is not the whole market.

There's a plethora of use-cases for models that are not on the bleeding edge. If I can solve my relatively simple problems with an off-the-shelf model for a minuscule fraction of the cost of the frontier, I'm going to.

embedding-shape • today at 11:24 AM

> I think Evans is completely wrong. There are only 2 truly frontier models. (at least for now). And Anthropic seems to be leaving OpenAI behind so there might be only 1 in the near future. (which is scary/dangerous)

Truly fascinating ecosystem and community in general, as experiences differ so wildly. Anthropic's models seem far behind OpenAI's to me, especially when you get into "Pro" territory, and there doesn't seem to be any worthy competition to Pro Mode available at all.

And this comes from someone who uses both platforms and spends a lot of the day interacting with agents and LLMs in various ways. The interesting part is that you probably do too, and what you share probably lines up with what you experience! Yet we come away with basically opposite takeaways :) I don't think either of us is wrong either, somehow.

bushbaba • today at 4:17 PM

I'm perfectly happy with Claude Opus 4.6. None of the improvements since then have meaningfully improved my day-to-day. If I can get 4.6 on my laptop for 5-10k, I'd gladly start shifting my ~1k/month Anthropic spend over.

Some of the harnesses even let you run a local model for most things and only pay for the latest frontier models when needed, which cuts down costs drastically.

afavour • today at 12:07 PM

Maybe I’m alone in thinking this, but I think the long-term victor will be the one that works out pricing best.

Fable might well be a better model but it’s too expensive for everyday AI use. Definitely if we’re talking about the kind of stuff you’re going to want to do on your phone. Even for coding, I’m not going to reach for Fable (well, when I can…) for 95% of the work I do.

I don’t believe a mature AI industry is going to have a one-size-fits-all, single winner.
