logoalt Hacker News

cactusplant7374today at 11:03 AM2 repliesview on HN

No developer writes the same prompt twice. How can you be sure something has changed?


Replies

kasey_junktoday at 11:42 AM

I regularly run the same prompts twice and through different models. Particularly, when making changes to agent metadata like agent files or skills.

At least weekly I run a set of prompts to compare codex/claude against each other. This is quite easy the prompt sessions are just text files that are saved.

The problem is doing it enough for statistical significance and judging the output as better or not.

baqtoday at 12:32 PM

Ralph Wiggum would like a word

show 1 reply