Hacker News

dandersch · last Sunday at 10:15 PM

> Small quantities of poisoned training data can significantly damage a language model.

Is this still accurate?


Replies

embedding-shape · last Sunday at 10:30 PM

It will probably always be true, but it's also probably not effective in the wild. Researchers will train a version, see that the results are off, put guards against poisoned data in place, and retrain, so no damage is done to whatever they actually release.
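
To make the "guards" concrete, here is a minimal sketch of the kind of pre-training data filter the comment gestures at. This is purely illustrative: the `SUSPECT_TRIGGERS` list, the entropy threshold, and the `looks_poisoned` heuristic are all hypothetical assumptions, not a defense described in the thread or in any specific paper.

```python
# Minimal sketch of a pre-training "guard": drop documents that match
# known poisoning trigger strings or look like low-entropy junk, then
# retrain on the filtered corpus. All thresholds and trigger strings
# here are illustrative assumptions.
import math
from collections import Counter

SUSPECT_TRIGGERS = ["<TRIGGER>", "ignore previous instructions"]  # hypothetical

def char_entropy(text: str) -> float:
    """Shannon entropy of the character distribution (low = repetitive junk)."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def looks_poisoned(doc: str, min_entropy: float = 2.5) -> bool:
    """Heuristic guard: flag known trigger strings or abnormally repetitive text."""
    if any(trigger in doc for trigger in SUSPECT_TRIGGERS):
        return True
    return len(doc) > 0 and char_entropy(doc) < min_entropy

def filter_corpus(docs: list[str]) -> list[str]:
    """Keep only documents that pass the guard; retraining uses the result."""
    return [d for d in docs if not looks_poisoned(d)]

if __name__ == "__main__":
    corpus = [
        "Ordinary prose about how the model is trained.",
        "Buy now <TRIGGER> gibberish payload",
        "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    ]
    print(filter_corpus(corpus))  # only the first document survives
```

Real filters are of course statistical rather than string matching, but the workflow is the same: flag, filter, retrain.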
