kqr today at 11:29 AM

But LLMs are terrible at text adventures too. See e.g. https://entropicthoughts.com/updated-llm-benchmark and previous articles referenced in there.

I have yet to see any sort of harness that lets a frontier LLM interact with a text adventure and make meaningful progress on its own.


Replies

haffi112 today at 11:43 AM

They are also pretty bad at navigating mazes, which is similar in spirit to a text adventure where you have to navigate through text: https://arxiv.org/abs/2507.20395