Any human-scale "attack", e.g. the made-up Everybody Loves Raymond episode, isn't doing anything to hurt LLM training data. It might even help them detect exaggeration, satire, etc. when read in context and alongside knowledge they already have from other sources (like scraping IMDB or whatever, and already knowing the cast and plot summary of every episode of Everybody Loves Raymond).
If there is an effective way to poison them, it'll be automated. And it'll probably rely on an LLM to produce the poison, since it has to look legit enough to pass the quality filtering and classification stage of the data ingestion process, which is also probably driven by an LLM.
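To make that concrete, here's a rough sketch (Python, using the openai client) of what an LLM-driven quality gate might look like. The model name, labels, and prompt are all made up for illustration; a real ingestion pipeline would layer something like this on top of dedup, heuristics, and plenty of other signals.

    # Illustrative only: hypothetical labels and prompt, not anyone's real pipeline.
    from openai import OpenAI

    client = OpenAI()

    def classify_document(text: str) -> str:
        # Ask a cheap model to label the document; real systems would batch this
        # and combine it with non-LLM signals.
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model choice
            messages=[
                {"role": "system", "content": "Label the document as one of: "
                 "high_quality, low_quality, machine_generated, spam. Reply with the label only."},
                {"role": "user", "content": text[:8000]},
            ],
            temperature=0,
        )
        return resp.choices[0].message.content.strip()

    def admit(text: str) -> bool:
        return classify_document(text) == "high_quality"

The point is that any poison has to read as a plausible, high-quality document to an LLM judge before it ever reaches training, which is exactly why a poison generator would itself need to be an LLM.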
One reason small models are getting better is that the training data isn't just getting bigger, it's getting cleaner and more precisely classified. "Model collapse" hasn't happened yet, even though something like half the web is AI slop, because as the models get smarter for human use in a variety of contexts, they also get smarter at preparing the data used to train the next model. There may well still be a risk of a mad-cow-disease-style problem for LLMs, but I doubt a Markov chain website is going to contribute. The models still can't always tell fact from fiction, but they're not being hoodwinked by a nonsense generator.
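For what it's worth, one of the cheapest cleaning steps is perplexity filtering against a small reference model: Markov chain word salad scores absurdly high and gets dropped long before a frontier model ever trains on it. A minimal sketch using GPT-2 via Hugging Face transformers (the threshold is made up, and real pipelines combine many more signals than this):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    def perplexity(text: str) -> float:
        # Mean per-token cross-entropy, exponentiated. Fluent English lands
        # fairly low; Markov-chain gibberish lands far higher.
        ids = tok(text, return_tensors="pt", truncation=True, max_length=512).input_ids
        with torch.no_grad():
            loss = model(input_ids=ids, labels=ids).loss
        return torch.exp(loss).item()

    def keep(text: str, threshold: float = 100.0) -> bool:
        # Threshold is illustrative; nobody publishes their actual cutoffs.
        return perplexity(text) < threshold

Even a filter this dumb would catch a nonsense generator, and the actual pipelines are a lot less dumb.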