>With this many parameters overfitting is inevitable.
Nope. Go look up double descent. Overfitting turns out not to be an issue with large models.
Your video is from a political activist, not anyone with any knowledge about machine learning. Here's a better video about overfitting: https://youtu.be/qRHdQz_P_Lo
I am not a professional statistician (only a BSc dropout), so I don't have the expertise required to evaluate the claim here: that double descent eliminates overfitting in LLMs.
That said, I see red flags. This is an extraordinary claim, and extraordinary claims require extraordinary evidence. My actual degree (not the one I dropped out of) is in psychology, where I used statistics heavily, but it is only a BSc, so I cannot claim expertise there either. Still, this claim, and the abstracts I scanned while trying to evaluate it, ring alarm bells all over. I don't trust it. It is precisely the kind of thing we were taught to watch out for when we were taught scientific thinking.
In contrast, this political activist provided an example (an anecdote, if you will) showing how easy it was for an actual scientist to poison LLMs with a made-up symptom. That looks like overfitting to me. The two Medium blog posts feel very much like errors in the data set that the models are all too happy to output as if they were inferred.
EDIT: I just watched that video, and I actually believe the claims in it; however, I still do not believe your claim. Even if we assume the video is correct, the effect would only manifest as fewer hallucinations. Note that in the demonstration, the higher-parameter regression models passed through every single datapoint in the sample, and that an optimal model with fewer parameters had a better fit than the overfitted ones. That means trillions of parameters do make a model quite vulnerable to poisoning.
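To make that last point concrete, here is a minimal sketch of the kind of regression demonstration the video shows. This is my own reconstruction, not the video's actual code: the ground-truth function, the polynomial degrees, and the noise level are all my assumptions. A polynomial with as many parameters as datapoints passes through every noisy sample (the analogue of memorizing poisoned data), while a lower-degree model fits the underlying curve better.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    # Hypothetical ground-truth curve, chosen only for illustration.
    return x**3 - x

# 15 noisy training samples of the true curve.
x_train = np.linspace(-1, 1, 15)
y_train = true_fn(x_train) + rng.normal(0, 0.1, x_train.size)

# Held-out test points (noise-free, so test error measures fit to the truth).
x_test = np.linspace(-0.95, 0.95, 200)
y_test = true_fn(x_test)

def mse(degree, x, y):
    # Fit a degree-`degree` polynomial to the training data,
    # then report mean squared error on (x, y).
    coeffs = np.polyfit(x_train, y_train, degree)
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

# Degree 14 with 15 points interpolates the training set almost exactly,
# noise and all; degree 3 leaves residuals about the size of the noise.
train_overfit = mse(14, x_train, y_train)
train_simple = mse(3, x_train, y_train)

# But between the training points the degree-14 fit oscillates,
# so its error on held-out points is far worse.
test_overfit = mse(14, x_test, y_test)
test_simple = mse(3, x_test, y_test)

print("train MSE: overfit =", train_overfit, " simple =", train_simple)
print("test  MSE: overfit =", test_overfit, " simple =", test_simple)
```

The overfitted model "wins" on the training data precisely because it reproduces every datapoint, which is exactly why a few planted datapoints come back out verbatim.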