logoalt Hacker News

cowboy_henktoday at 11:09 AM2 repliesview on HN

This only holds if you understand what's in the tests, and the tests are realistic. The moment you let the LLM write the tests without understanding them, you may as well just let it write the code directly.


Replies

chriswarbotoday at 3:32 PM

> The moment you let the LLM write the tests without understanding them, you may as well just let it write the code directly.

I disagree. Having tests (even if the LLM wrote them itself!) gives the model some grounding, and exposes some of its inconsistencies. LLMs are not logically-omniscient; they can "change their minds" (next-token probabilities) when confronted with evidence (e.g. test failure messages). Chain-of-thought allows more computation to happen; but it doesn't give the model any extra evidence (i.e. Shannon information; outcomes that are surprising, given its prior probabilities).

rowanG077today at 1:55 PM

I disagree to some degree. Tests have value even beyond whether they test the right thing. At the very least they show something worked and now doesnt work or vice versa. That has value in itself.