
djfergus, today at 4:21 AM

Reminds me of the terminus agent/harness on the terminal-bench coding benchmark - they just send keystrokes to a tmux session. They score pretty well.

https://www.tbench.ai/news/terminus
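
For anyone curious what "sending keystrokes to a tmux session" looks like in practice, here is a rough sketch. It is not Terminus's actual code, just my guess at the general shape using the standard `tmux send-keys` and `tmux capture-pane` subcommands. The helper names and the session name `agent` are made up for illustration.

```python
import subprocess


def send_keys_cmd(session, keys, press_enter=True):
    """Build the tmux command that types `keys` into `session` (hypothetical helper)."""
    cmd = ["tmux", "send-keys", "-t", session, keys]
    if press_enter:
        # tmux treats a trailing "Enter" argument as a key name, not literal text.
        cmd.append("Enter")
    return cmd


def capture_cmd(session):
    """Build the tmux command that prints the visible pane contents (hypothetical helper)."""
    return ["tmux", "capture-pane", "-p", "-t", session]


def run(cmd):
    """Run a command and return its stdout; raises CalledProcessError on failure."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def demo(session="agent"):
    """Assumes tmux is installed; starts a detached session and types one command."""
    run(["tmux", "new-session", "-d", "-s", session])
    run(send_keys_cmd(session, "echo hello"))
    # A real harness would wait for the command to finish before reading the screen.
    return run(capture_cmd(session))
```

The agent loop would then presumably be: capture the pane, hand the text to the model, send back whatever keystrokes it asks for, repeat.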