They aren't good at Zork[1], nor at newer or more obscure text adventures[2].
[1]: https://www.lowimpactfruit.com/p/zork-bench-an-llm-reasoning...
[2]: https://entropicthoughts.com/evaluating-llms-playing-text-ad...