Hacker News

nomel, yesterday at 11:45 PM

> a LORA that's designed to inject bugs into your code

A statement like this, clearly, requires a reference.


Replies

mips_avatar, yesterday at 11:49 PM

From the model card: "the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning". In other words, they will take your ML research code and inject bugs into it until it breaks, using a LORA (or some other form of PEFT).
