Hacker News

marginalia_nu · yesterday at 1:04 PM

LLMs are trained to predict tokens on highly mediocre code, though. How will they exceed their training data?


Replies

movedx01 · yesterday at 1:27 PM

Probably the same way other models learned to surpass human ability while being bootstrapped from human-level data: reinforcement learning.

The question is whether we have good enough feedback loops for that, and if not, whether we'll find them. I would bet they will be found for a lot of use cases.

bluGill · yesterday at 1:21 PM

Because you ask it to improve things, and so it produces slightly-better-than-average results: the average person can find things wrong with a piece of code and fix them, too. Then you feed that improved output back in and train a model whose average is better.

/end extreme over-optimism.

Retr0id · yesterday at 1:09 PM

Humans can decide to write above-average code by putting in more effort: writing comprehensive tests, refactoring iteratively, doing profile-guided optimization, etc.

I think you can have LLMs do that too, and then generate synthetic training data for "high-effort code".
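One hedged sketch of that synthetic-data idea: sample many candidate solutions, keep only the ones that pass the task's tests, and train on the survivors. Everything here is hypothetical stand-in code (`generate_candidates` fakes an LLM by mixing a correct and a buggy `clamp`); the point is the generate-then-verify filter:

```python
import random

def generate_candidates(prompt: str, n: int) -> list[str]:
    """Stand-in for sampling n completions from an LLM (hypothetical).
    Mixes a correct clamp with a subtly buggy one."""
    good = "def clamp(x, lo, hi):\n    return max(lo, min(x, hi))"
    bad = "def clamp(x, lo, hi):\n    return min(lo, max(x, hi))"  # wrong below lo
    return [random.choice([good, bad]) for _ in range(n)]

def passes_tests(code: str) -> bool:
    """Verification step: keep only candidates that pass the task's tests."""
    ns: dict = {}
    try:
        exec(code, ns)
        clamp = ns["clamp"]
        return (clamp(5, 0, 3) == 3
                and clamp(-1, 0, 3) == 0
                and clamp(2, 0, 3) == 2)
    except Exception:
        return False

def synthetic_dataset(prompt: str, n: int = 100) -> list[str]:
    """'High-effort' training examples: only verified solutions survive."""
    return [c for c in generate_candidates(prompt, n) if passes_tests(c)]
```

The filtered set is better than the raw samples by construction, which is the sense in which the training data can exceed the average of what the model emits.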

utopiah · yesterday at 1:11 PM

Who are you to question our faith? /s