Qwen-AgentWorld: Language World Models for General Agents

100 points • by ilreb • today at 2:21 AM • 27 comments

Comments

Xx_crazy420_xX • today at 7:59 AM

I think open-ended simulation for agents will be a key component for training and planning, similar to how human dreams simulate different scenarios in our heads. The biggest challenge will be simulating more abstract and complex systems.

A few months ago I experimented with an open-ended world simulation for an AI agent, where the simulated world progressively built itself from each of the agent's actions. The idea was to give the agent unlimited possibilities for tool calling: each tool call would have to be approved by an adjudicator, and then the world state would change. The key issues with the PoC were:

  - World decoherence (I tried to solve that with a poor graph implementation)
  - World flatness: the high level of abstraction did not account for small events that would compound in the real world
  - Starting with an empty context made it really hard to get the agent to explore the world

Anyway, the project turned out to be really funny to watch: the agent struggled desperately to perform real-world actions that would be impossible in the real world. The main observation was that showing the agent its current action budget modulated how creative and how desperate its actions were.
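
For readers who want something concrete, here is a minimal sketch of the kind of loop described above: the agent proposes a free-form tool call, an adjudicator approves or rejects it, approved calls change the world state, and the agent sees its remaining action budget. Everything here is hypothetical (`ToolCall`, `Verdict`, `World`, `run_episode`, and the toy agent and adjudicator are illustration names, not the commenter's code), and the rule-based stubs stand in for what would presumably be LLM calls.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    name: str
    args: dict


@dataclass
class Verdict:
    approved: bool
    effects: dict          # facts written into the world if approved
    reason: str = ""


@dataclass
class World:
    facts: dict = field(default_factory=dict)  # starts empty, grows with each action
    log: list = field(default_factory=list)


def run_episode(propose, adjudicate, world, budget):
    """Run until the action budget is spent; every proposal costs one action."""
    while budget > 0:
        call = propose(world, budget)       # the agent is shown its remaining budget
        verdict = adjudicate(world, call)   # the adjudicator decides if the call is allowed
        budget -= 1
        if verdict.approved:
            world.facts.update(verdict.effects)
        world.log.append((call.name, verdict.approved, verdict.reason))
    return world


def toy_propose(world, budget):
    # Stand-in for an LLM agent (assumption): it gets "desperate" on its last action.
    if "door_open" not in world.facts:
        return ToolCall("open_door", {})
    if budget <= 1:
        return ToolCall("teleport", {"to": "moon"})
    return ToolCall("walk", {"to": "hallway"})


def toy_adjudicate(world, call):
    # Stand-in for an LLM adjudicator (assumption): a few hard-coded rules.
    if call.name == "open_door":
        return Verdict(True, {"door_open": True})
    if call.name == "walk":
        if not world.facts.get("door_open"):
            return Verdict(False, {}, "door is closed")
        return Verdict(True, {"location": call.args["to"]})
    if call.name == "teleport":
        return Verdict(False, {}, "physically impossible")
    return Verdict(False, {}, "unknown action")


world = run_episode(toy_propose, toy_adjudicate, World(), budget=3)
for entry in world.log:
    print(entry)
```

Keeping the adjudicator separate from the state update is one way to attack the decoherence issue: only approved effects ever reach `world.facts`, so rejected "desperate" actions leave no trace in the state.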

adrian_b • today at 7:29 AM

The smaller of the two models has open weights and is available on Hugging Face:

https://huggingface.co/Qwen/Qwen-AgentWorld-35B-A3B

blurbleblurble • today at 6:06 AM

This might be pretty big. One of my biggest frustrations with smaller models (especially MoE) is their failure to track workflow state at a high level. I'm constantly reminding them what we decided on or asking them to revisit, and reminding them eats context.

Seems like this might make that a lot less painful. And if not right off the bat, then with some minimal tuning or even just good prompting.

dippogriff • today at 5:47 AM

I'm a fan of this direction. For me the most interesting use case for these world models isn't even training, it's verification. If this thing, or some idealized version of it, can actually simulate state transitions reliably, could you use it to verify an agent's execution path against hard constraints and replace/eclipse LLM-as-a-judge?
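
A rough sketch of what that verification could look like, assuming the world model is exposed as a transition function from (state, action) to next state. The names `first_violation`, `toy_transition`, and the `no_untested_deploy` rule are made up for illustration; in practice `transition` would wrap a call to the world model rather than a hand-written function.

```python
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, bool]
Transition = Callable[[State, str], State]
Constraint = Callable[[State], bool]


def first_violation(
    initial: State,
    actions: List[str],
    transition: Transition,
    constraints: Dict[str, Constraint],
) -> Optional[Tuple[int, str]]:
    """Replay the actions through the transition function and return
    (step index, constraint name) for the first broken constraint, or None."""
    state = dict(initial)
    for step, action in enumerate(actions):
        state = transition(state, action)
        for name, holds in constraints.items():
            if not holds(state):
                return step, name
    return None


def toy_transition(state: State, action: str) -> State:
    # Hand-written stand-in for the world model (assumption).
    nxt = dict(state)
    if action == "run_tests":
        nxt["tests_passed"] = True
    elif action == "edit":
        nxt["tests_passed"] = False   # any edit invalidates the last test run
    elif action == "deploy":
        nxt["deployed"] = True
    return nxt


start = {"tests_passed": False, "deployed": False}
rules = {
    # Hard constraint: never be in a deployed state without passing tests.
    "no_untested_deploy": lambda s: (not s["deployed"]) or s["tests_passed"],
}

print(first_violation(start, ["run_tests", "edit", "deploy"], toy_transition, rules))
print(first_violation(start, ["edit", "run_tests", "deploy"], toy_transition, rules))
```

The difference from LLM-as-a-judge is that the constraints are ordinary predicates over predicted state, so the verdict is only as fuzzy as the simulated transitions are.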

avaer • today at 8:10 AM

Note that this can run locally on a gaming card when quantized. I got it running on a 4090 (24 GB) at 150 t/s with a Q4_K_M quant.
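
A back-of-the-envelope check of why a 35B-parameter model can fit in 24 GB at this quantization. The figure of about 4.85 bits per weight for Q4_K_M is an assumption (the effective rate varies by model), and the estimate covers weights only, not the KV cache or runtime overhead.

```python
def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# 35e9 parameters comes from the "35B" in the model name; 4.85 bpw is an assumed average.
size = weight_size_gb(35e9, 4.85)
print(f"weights: ~{size:.1f} GB, headroom on a 24 GB card: ~{24 - size:.1f} GB")
```

Under these assumptions the weights come to roughly 21 GB, which leaves only a few GB for context, so long contexts may still need offloading or a smaller quant.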

psc007 • today at 5:16 AM

ELI5? How does this differ from a regular LLM assistant model like the base Qwen?

aliljet • today at 7:24 AM

The benchmarks here are confusing at best. Am I reading correctly that this model is essentially as good as or better than all frontier models right now?

zkmon • today at 7:59 AM

What if they did this using GLM 5.2? This looks like a new direction for AI.

ElenaDaibunny • today at 7:25 AM

10M trajectories, probably more of a data scale win than a world model breakthrough tbh

Tepix • today at 4:50 AM

The labels of the very first chart (figure 1, bottom left) are obviously wrong, which casts doubt on the entire paper.

verdverm • today at 4:37 AM

35B model from the qwen-3.5 line

https://github.com/QwenLM/Qwen-AgentWorld

https://huggingface.co/Qwen/Qwen-AgentWorld-35B-A3B