logoalt Hacker News

OtherShrezzingyesterday at 7:17 PM1 replyview on HN

I think in the relatively near future we’re going to start seeing sophisticated supply chain attacks into language model training data.

It should be feasible to design vulnerabilities which look benign individually in training data, but when composed together in the agent plane & executed in a chain introduce an exploit.

There’s nothing technical really stopping that from existing right now. It’s just that nobody has put the effort in yet.


Replies

lacunaryyesterday at 8:19 PM

The develop-test-refine feedback loop for this kind of attack is so long (or expensive) that it seems likely to limit its real world use. Poison training data, wait months? a year? for the model to come out, see how well it worked, refine... or do you see a faster way to iterate?

show 1 reply