No developer writes the same prompt twice. How can you be sure something has changed?

cactusplant7374 • today at 11:03 AM • 2 replies • view on HN

Replies

I regularly run the same prompts twice and through different models. Particularly, when making changes to agent metadata like agent files or skills.

At least weekly I run a set of prompts to compare codex/claude against each other. This is quite easy the prompt sessions are just text files that are saved.

The problem is doing it enough for statistical significance and judging the output as better or not.

baq • today at 12:32 PM

Ralph Wiggum would like a word

➕ show 1 reply

alt Hacker News

Replies