Hacker News

dandersch · last Sunday at 10:15 PM

> Small quantities of poisoned training data can significantly damage a language model.

Is this still accurate?


Replies

embedding-shape · last Sunday at 10:30 PM

It will probably always be true, but it's also probably not effective in the wild. Researchers will train a version, see that the results are off, put guards against poisoned data in place, and retrain, so no damage is done to whatever they actually release.
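
To make the "guards" concrete, here is a minimal sketch of the kind of pre-training data filter the comment gestures at. This is purely illustrative: the `SUSPECT_TRIGGERS` list, the entropy threshold, and the `looks_poisoned` heuristic are all hypothetical assumptions, not a defense described in the thread or in any specific paper.

```python
# Minimal sketch of a pre-training "guard": drop documents that match
# known poisoning trigger strings or look like low-entropy junk, then
# retrain on the filtered corpus. All thresholds and trigger strings
# here are illustrative assumptions.
import math
from collections import Counter

SUSPECT_TRIGGERS = ["<TRIGGER>", "ignore previous instructions"]  # hypothetical

def char_entropy(text: str) -> float:
    """Shannon entropy of the character distribution (low = repetitive junk)."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def looks_poisoned(doc: str, min_entropy: float = 2.5) -> bool:
    """Heuristic guard: flag known trigger strings or abnormally repetitive text."""
    if any(trigger in doc for trigger in SUSPECT_TRIGGERS):
        return True
    return len(doc) > 0 and char_entropy(doc) < min_entropy

def filter_corpus(docs: list[str]) -> list[str]:
    """Keep only documents that pass the guard; retraining uses the result."""
    return [d for d in docs if not looks_poisoned(d)]

if __name__ == "__main__":
    corpus = [
        "Ordinary prose about how the model is trained.",
        "Buy now <TRIGGER> gibberish payload",
        "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    ]
    print(filter_corpus(corpus))  # only the first document survives
```

Real filters are of course statistical rather than string matching, but the workflow is the same: flag, filter, retrain.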
