logoalt Hacker News

dathinabtoday at 4:14 PM2 repliesview on HN

humans are very good at overlooking edge cases, off by one errors etc.

so if you generate test data randomly you have a higher chance of "accidentally" running into overlooked edge cases

you could say there is a "adding more random -> cost" ladder like

- no randomness, no cost, nothing gained

- a bit of randomness, very small cost, very rarely beneficial (<- doable in unit tests)

- (limited) prop testing, high cost (test runs multiple times with many random values), decent chance to find incorrect edge cases (<- can be barely doable in unit tests, if limited enough, often feature gates as too expensive)

- (full) prop testing/fuzzing, very very high cost, very high chance incorrect edge cases are found IFF the domain isn't too large (<- a full test run might need days to complete)


Replies

ssdspoimdsjvvtoday at 4:19 PM

I've learnt that if a test only fails sometimes, it can take a long time for somebody to actually investigate the cause,in the meantime it's written off as just another flaky test. If there really is a bug, it will probably surface sooner in production than it gets fixed.

show 2 replies
SkyBelowtoday at 6:05 PM

Can't one get randomness and determinism at the same time? Randomly generate the data, but do so when building the test, not when running the test. This way something that fails will consistently fail, but you also have better chances of finding the missed edge cases that humans would overlook. Seeded randomness might also be great, as it is far cleaner to generate and expand/update/redo, but still deterministic when it comes time to debug an issue.

show 1 reply