Hacker News

Show HN: We analyzed 1,573 Claude Code sessions to see how AI agents work

81 points · by keks0r · today at 1:41 PM · 46 comments

We built rudel.ai after realizing we had no visibility into our own Claude Code sessions. We were using it daily but had no idea which sessions were efficient, why some got abandoned, or whether we were actually improving over time.

So we built an analytics layer for it. After connecting our own sessions, we ended up with a dataset of 1,573 real Claude Code sessions, 15M+ tokens, 270K+ interactions.

Some things we found that surprised us:

- Skills were only being used in 4% of our sessions
- 26% of sessions are abandoned, most within the first 60 seconds
- Session success rate varies significantly by task type (documentation scores highest, refactoring lowest)
- Error cascade patterns appear in the first 2 minutes and predict abandonment with reasonable accuracy
- There is no meaningful benchmark for 'good' agentic session performance; we are building one
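The abandonment numbers above are easy to reproduce once you have session records. Here is a minimal sketch in Python using a hypothetical `Session` schema (the post does not publish rudel.ai's actual data model):

```python
# Sketch of the abandonment metrics described above. The Session
# schema here is hypothetical -- rudel.ai's real data model is
# not published in this thread.
from dataclasses import dataclass

@dataclass
class Session:
    duration_s: float  # wall-clock session length in seconds
    completed: bool    # whether the session reached a finished state

def abandonment_stats(sessions: list[Session]) -> tuple[float, float]:
    """Return (abandon_rate, share_of_abandons_within_60s)."""
    if not sessions:
        return 0.0, 0.0
    abandoned = [s for s in sessions if not s.completed]
    rate = len(abandoned) / len(sessions)
    early = sum(1 for s in abandoned if s.duration_s < 60)
    early_share = early / len(abandoned) if abandoned else 0.0
    return rate, early_share

sessions = [
    Session(30, False), Session(45, False), Session(120, False),
    Session(600, True), Session(900, True), Session(1200, True),
]
rate, early_share = abandonment_stats(sessions)
print(rate)         # 0.5  (3 of 6 sessions abandoned)
print(early_share)  # ~0.67 (2 of 3 abandons under 60 seconds)
```

The "most within the first 60 seconds" claim corresponds to `early_share` here: the fraction of abandoned sessions whose duration falls under the 60-second cutoff.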

The tool is free to use and fully open source, happy to answer questions about the data or how we built it.


Comments

dmix · today at 3:22 PM

I've seen Claude ignore important parts of skills/agent files multiple times. I was running a cleanup SKILL.md across a hundred markdown files, manually in groups of 5, and about half the time it followed the skill as written. The other half it would spend two minutes trying to understand the codebase, hunting for markdown-related files for no good reason, before reverting to what the skill said.

LLMs are far from consistent.

sriramgonella · today at 2:59 PM

This kind of dataset is really valuable because most conversations about AI coding tools are based on anecdotes rather than actual usage patterns. I’d be curious about a few things from the sessions:

1. how often developers accept vs. modify generated code
2. which tasks AI consistently accelerates (tests, refactoring, boilerplate?)
3. whether debugging sessions become longer or shorter with AI assistance

My experience so far is that AI is great for generating code but the real productivity boost comes when it helps navigate large codebases and reason about existing architecture.

Aurornis · today at 3:22 PM

> 26% of sessions are abandoned, most within the first 60 seconds

Starting new sessions frequently and using separate new sessions for small tasks is a good practice.

Keeping context clean and focused is a highly effective way to keep the agent on task. An up-to-date AGENTS.md should let new sessions get into simple tasks quickly, so you can use single-purpose sessions for small tasks without carrying the baggage of a long past context into them.
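For illustration, a minimal AGENTS.md in the spirit of this advice might look like the following; the project details are hypothetical and should be adapted to your own repo:

```markdown
# AGENTS.md

## Project overview
- TypeScript monorepo; packages live in `packages/`.
- Run `npm test` before considering any task done.

## Conventions
- Prefer small, single-purpose changes; do not refactor unrelated code.
- Every new module needs a matching `*.test.ts` file.

## Common tasks
- "fix lint": run `npm run lint -- --fix`, then review the diff.
```

The point is that a fresh session can pick up the essentials from a file like this instead of inheriting a long prior conversation as context.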

emehex · today at 2:23 PM

For those unaware, Claude Code comes with a built in /insights command...

KaiserPister · today at 2:48 PM

This is awesome! I’m working on the Open Prompt Initiative as a way for open source to share prompting knowledge.

blef · today at 2:40 PM

Reminds me of https://www.agentsview.io/.

alyxya · today at 2:54 PM

Why does it need a login and cloud upload? A local CLI tool analyzing logs should be sufficient.

152334H · today at 2:14 PM

Is there a reason, other than general faith in humanity, to assume those '1,573 sessions' are real?

I do not see any link or source for the data. I assume it is to remain closed, if it exists.

marconardus · today at 2:14 PM

It might be worthwhile to include some of an example run in your readme.

I scrolled through and didn't see enough to justify installing and running a thing.

vova_hn2 · today at 2:37 PM

It's sad that on top of black-box LLMs we also build tools that are pretty much black boxes themselves.

It has become very hard to understand what exactly is sent to the LLM as input/context and how exactly the output is processed.

ekropotin · today at 2:21 PM

> That's it. Your Claude Code sessions will now be uploaded automatically.

No, thanks

anthonySs · today at 2:59 PM

Is this observability for your Claude Code calls, or specifically for high-level insights like skill usage?

Would love to know your actual day-to-day use case for what you built.

mentalgear · today at 3:08 PM

How diverse is your dataset?

cluckindan · today at 1:58 PM

Nice. Now, to vibe myself a locally hosted alternative.

lau_chan · today at 1:54 PM

Does it work for Codex?
