danudey • today at 12:58 AM • 2 replies

One thing my team lead is working on is using Claude to 'generate' integration tests/add new tests to e2e runs.

Straight up asking Claude to run the tests, or to generate a test, can produce inconsistencies between runs, between tests, between models, and so on. So instead he created a tool which defines a test: its inputs, its outputs, and some details. Now we have a directory full of markdown files describing a test suite, parameters, test cases, error cases, etc., and Claude generates calls to the tool instead of writing the tests itself.

This means that whatever run-to-run variation or drift over time Claude, or any other LLM, might have, it all still has to be funneled through a strictly defined filter, so we're doing the same things the same way over time.
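
To make the "strictly defined filter" idea concrete, here's a rough sketch of what such a tool boundary could look like. This is not our actual tool: the JSON call format, the field names (`name`, `inputs`, `expected`, `expect_error`), and the helpers `CaseSpec`, `parse_tool_call`, and `run_case` are all made up for illustration. The point is only that the LLM's output is data that gets validated against a fixed schema, and the code that actually runs the test never changes.

```python
import json
from dataclasses import dataclass

# Hypothetical schema: the only fields a generated tool call may contain.
ALLOWED_KEYS = {"name", "inputs", "expected", "expect_error"}


@dataclass(frozen=True)
class CaseSpec:
    """One validated test case (illustrative structure, not a real tool's format)."""
    name: str
    inputs: dict
    expected: object = None
    expect_error: bool = False


def parse_tool_call(raw: str) -> CaseSpec:
    """Validate LLM-generated JSON against the fixed schema, rejecting anything else."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("tool call must be a JSON object")
    unknown = set(data) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    if not isinstance(data.get("name"), str) or not data["name"]:
        raise ValueError("'name' must be a non-empty string")
    if not isinstance(data.get("inputs"), dict):
        raise ValueError("'inputs' must be an object")
    expect_error = data.get("expect_error", False)
    if not isinstance(expect_error, bool):
        raise ValueError("'expect_error' must be a boolean")
    # A case either expects a value or expects a failure, never both or neither.
    if expect_error and "expected" in data:
        raise ValueError("error cases must not set 'expected'")
    if not expect_error and "expected" not in data:
        raise ValueError("non-error cases must set 'expected'")
    return CaseSpec(data["name"], data["inputs"], data.get("expected"), expect_error)


def run_case(case: CaseSpec, target) -> bool:
    """Run one case the same way every time; only the validated data varies."""
    try:
        result = target(**case.inputs)
    except Exception:
        return case.expect_error
    return (not case.expect_error) and result == case.expected
```

Usage would be something like `run_case(parse_tool_call(llm_output), function_under_test)`. If the model drifts and starts emitting extra fields or a different shape, the call fails validation loudly instead of quietly changing how the test is run.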

Replies

latentsea • today at 1:55 AM

I'm looking at implementing https://github.com/coleam00/Archon as a means to solve this. You can build arbitrary workflows custom to your codebase. Looks to bring a bit of much-needed determinism.

zx8080 • today at 1:01 AM

What kind of system/area (or product) are you working on?
