>I wish I better understood how ingesting and averaging large amounts of text produced such a success in building syntactically-valid clauses
>I wonder if these LLMs are succumbing to precocious teacher's-pet syndrome, where a student gets rewarded for using big words and certain styles they think will earn better grades, rather than working to convey ideas more clearly.
This is more or less what happens. These models are tuned with reinforcement learning from human feedback (RLHF): human raters compare candidate outputs, and the model is optimized toward whatever the raters prefer. If raters consistently prefer a certain style of language, the model learns to produce more of it.
The notorious "it's not X, it's Y" pattern is somewhat rare from actual humans, but it's catnip for the humans providing the feedback.
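To make the mechanism concrete, here is a toy sketch of how a preference-trained reward signal can latch onto a stylistic tic. Everything here is illustrative and hypothetical: the preference pairs, the `reward` function (a stand-in for a learned reward model), and the pattern check are all invented for the example, not taken from any real RLHF pipeline.

```python
# Toy illustration of RLHF-style preference learning, not a real training
# loop. The idea: raters keep preferring replies with the "not X, it's Y"
# contrast, so a reward model fit on those labels scores the pattern highly,
# and policy optimization then drifts the model toward it.

# Hypothetical preference pairs: (reply the rater chose, reply they rejected).
preferences = [
    ("It's not just fast, it's transformative.", "It is fast."),
    ("It's not a bug, it's a feature.", "This is a feature."),
]

def reward(text: str) -> float:
    """Stand-in for a learned reward model: rewards the contrast pattern."""
    lowered = text.lower()
    score = 0.0
    if "not" in lowered and "it's" in lowered:
        score += 1.0  # the stylistic tic raters keep upvoting
    return score

# Every chosen reply scores at least as high as its rejected counterpart,
# so gradient updates on this signal would reinforce the style.
for chosen, rejected in preferences:
    assert reward(chosen) >= reward(rejected)
print("reward model favors the pattern")
```

The point is not that any real reward model contains a rule like this, but that preference data biased toward a style produces a reward signal that rewards the style, whether or not it improves the underlying ideas.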