Ornith-1.0: self-improving open-source models for agentic coding

114 points • by danboarder • today at 5:16 PM • 27 comments

Comments

Previously: https://news.ycombinator.com/item?id=48709744

https://swelljoe.com/post/will-it-mythos/: "Poor performer here, only found the one bug that almost every model found, despite its performance on other benchmarks being excellent for its size. […] It also performs poorly in a chat without tools, exhibiting an enthusiasm for hallucination. I’m currently working on a replication of this with full tool access, including bash/Python, which may allow this model to be competitive."

ricardobayes • today at 8:05 PM

This is the first Qwen fine-tune that is not immediately rejected by the local LLM community, and in some cases it is even being recommended. Based on my limited usage, it is good and gives creative solutions to coding problems. I don't expect 9-35B models to one-click create full apps. Most people who were complaining did so .

kennywinker • today at 6:23 PM

Can anyone explain what’s the story here? Is this just a re-skinned qwen? Who is deepreinforce-ai and why isn’t this model listed on their website?

How does it self-improve, does the model change on disk - or just during a single context run it gets better?

S0y • today at 7:53 PM

These are simply benchmaxxed versions of either Qwen or Gemma 4.

anana_ • today at 8:12 PM

They keep mentioning a 31B dense model, but there are no benchmarks or weights for it anywhere?

v3ss0n • today at 9:13 PM

Self-improving bullshit. It is just a benchmaxxed Qwen 3.5 finetune. Nothing spectacular. It even fails at benchmarks. Long-session tool calls suck, and it hallucinates a lot there too. Just use Qwen 3.6 and 3.5 122b.

fratefritto • today at 8:54 PM

[flagged]
