Why can't I see Cursor on tbench? Is it that bad that it's not even on the leaderboard? I ...

koakuma-chan • last Tuesday at 11:25 PM

Why can't I see Cursor on tbench? Is it that bad that it's not even on the leaderboard? I am trying to figure out if I can pitch your product to my company, and whether it is worth it.

Replies

pacjam • last Tuesday at 11:37 PM

Not sure why Cursor CLI isn't on the leaderboard... I'm guessing it's because Cursor is focused primarily on their IDE agent, not their CLI agent, and Terminal-Bench is an eval/benchmark for CLI agents exclusively.

If you're asking why Letta Code isn't on the leaderboard, the TBench maintainers said it should be up later today (so probably refresh in a few hours!). The results are already public; you can see them on our blog (graphs linked in the OP). They are also verifiable: all the data for the runs is available and Letta Code is open source, so you can replicate the results yourself.

