LLMs are trained to predict tokens on highly mediocre code, though. How will they exceed their training data?
Because you ask it to improve things, and so it produces slightly-better-than-average results. The average person can find things wrong with something and fix them, too. Then you feed that improved result back in and train a model whose average is better.
/end extreme over-optimism.
Humans can decide to write above-average code by putting in more effort: writing comprehensive tests, refactoring iteratively, optimizing based on profiling, etc.
I think you can have LLMs do that too, and then generate synthetic training data for "high-effort code".
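A minimal sketch of what that filtering could look like: sample candidate generations, keep only the ones that pass a quality bar (here, unit tests), and use the survivors as training data. The candidate functions below are stand-ins for model output, not any real model's generations.

```python
# Sketch: build "high-effort" synthetic training data by rejection sampling.
# The candidates below stand in for LLM generations; in practice they would
# be sampled from a model at some temperature.

def candidate_a(xs):          # buggy generation: drops the last element
    return sum(xs[:-1])

def candidate_b(xs):          # correct generation
    return sum(xs)

def passes_tests(fn):
    """Unit tests play the role of the effort/quality filter."""
    cases = [([], 0), ([1, 2, 3], 6), ([-1, 1], 0)]
    try:
        return all(fn(xs) == expected for xs, expected in cases)
    except Exception:
        return False

# Keep only generations that survive the filter as training data.
synthetic_dataset = [fn for fn in (candidate_a, candidate_b) if passes_tests(fn)]
```

The same shape works with any checkable quality signal (linters, profilers, type checkers) in place of the test suite.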
Who are you to question our faith? /s
Probably the same way other models learned to surpass human ability while being bootstrapped from human-level data - using reinforcement learning.
The question is, do we have good enough feedback loops for that, and if not, are we going to find them? I would bet they will be found for a lot of use cases.
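One concrete form such a feedback loop can take for code is a verifiable reward: execute the candidate against hidden tests and score it by the pass rate. A hedged sketch, with toy tasks and candidates made up for illustration:

```python
# Sketch: a verifiable reward signal for RL on code, assuming candidate
# solutions can be executed against hidden test cases.

def reward(solution, hidden_tests):
    """Fraction of hidden tests the candidate solution passes."""
    passed = 0
    for args, expected in hidden_tests:
        try:
            if solution(*args) == expected:
                passed += 1
        except Exception:
            pass  # crashes earn no reward
    return passed / len(hidden_tests)

# Toy task: absolute value. One candidate is wrong on negatives.
tests = [((3,), 3), ((-2,), 2), ((0,), 0)]
good = lambda x: x if x >= 0 else -x
bad = lambda x: x

print(reward(good, tests))  # 1.0
print(reward(bad, tests))   # 2/3: fails the negative case
```

The hard part the comment points at is exactly this: for well-specified tasks the reward is cheap to compute, but for fuzzier qualities (readability, design) the feedback signal still has to be found.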