My attitude towards this is growing similar to my attitude towards Windows. If I have to fight against my tools and they are actively working against me, I'd rather save my sanity and time and just find a new tool.
I feel like asking the thing that you are measuring, and don’t trust, to measure itself might not produce the best measurements.
What is "drift"? It seems to be one of those words that LLMs love to say but it doesn't really mean anything ("gap" is another one).
Interesting approach. I've been particularly interested in tracking whether adding skills or tweaking prompts is making things better or worse.
Anyone know of other similar tools that let you track across harnesses while coding?
Running evals as a solo dev is too cost-prohibitive, I think.
See also https://marginlab.ai/trackers/claude-code-historical-perform... for a more conventional approach to tracking regressions.
This project is somewhat unconventional in its approach, but that might reveal issues that are masked in typical benchmark datasets
thanks
A useful(ish) trick I've found is adding a persona block to my CLAUDE.md. When it stops addressing me as 'meatbag', I know the HK-47 persona instructions are no longer being followed, which means other instructions aren't being followed either. Dumb trick? Yup. Does it work? Kinda? Does it make programming a lot more fun and funny? Heck yes.
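If you want to try it, a persona block along these lines is enough (the exact wording below is just a sketch, not my actual file, and any persona with a distinctive verbal tic would work as well):

    <!-- hypothetical example; swap in whatever persona you like -->
    ## Persona
    Respond as HK-47, the assassin droid from KOTOR.
    - Address the user as "meatbag" at all times.
    - Prefix replies with a tag such as "Statement:", "Query:", or "Observation:".
    - The persona is cosmetic only: it must never change code, commands, or technical judgement.

The "cosmetic only" line is there so the gimmick stays a cheap at-a-glance canary and doesn't leak into the actual work.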
Don't lecture me on basins of attraction--we all know HK is a great programmer.