
AntiUSAbah · yesterday at 6:07 PM

On the contrary: in an interview, someone from OpenAI said they try to avoid it, because it makes it harder for them to determine whether a model is getting better or not.


Replies

thesz · yesterday at 10:09 PM

Perturbing the dataset used for training can introduce adversarial behavior even without adding any new data. The idea is quite simple: take two batches from the training dataset and keep the model whose adversarial behavior is more probable. The more batches that are processed with this posterior selection, the more probable the adversarial behavior becomes.
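A toy sketch of that selection channel (my own illustration, not anything OpenAI is known to do): train a one-parameter linear model with SGD, and at each step compare two candidate batches, keeping whichever step moves the model toward a hypothetical attacker objective. All names here (`adversarial_score`, the trigger input) are invented for the example; only the original data is ever used.

```python
import random

random.seed(0)

# Toy dataset: y ~ 2*x plus noise. The hypothetical attacker wants the
# model to over-predict, i.e. to push the weight w above its honest value.
data = [(i / 10, 2 * (i / 10) + random.gauss(0, 0.5)) for i in range(100)]

def sgd_step(w, batch, lr=0.01):
    # One SGD step for the linear model y_hat = w * x under squared loss.
    grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
    return w - lr * grad

def adversarial_score(w, trigger=10.0):
    # Attacker's hidden objective: a larger prediction at the trigger input.
    return w * trigger

def train(w0, steps, batch_size, select=False):
    w = w0
    for _ in range(steps):
        a = random.sample(data, batch_size)
        b = random.sample(data, batch_size)
        wa, wb = sgd_step(w, a), sgd_step(w, b)
        if select:
            # Posterior selection: keep whichever batch nudges the model
            # toward the adversarial objective. No new data is added --
            # only the choice between two legitimate batches is exploited.
            w = wa if adversarial_score(wa) > adversarial_score(wb) else wb
        else:
            w = wa  # ordinary training: just take the first batch
    return w

honest = train(1.0, steps=2000, batch_size=8)
poisoned = train(1.0, steps=2000, batch_size=8, select=True)
print(honest, poisoned)  # the selected model's weight drifts above the honest one
```

The honest run converges near the least-squares weight (about 2), while the run with posterior selection equilibrates noticeably higher, even though both see only the original dataset. That is the sense in which repeatedly selecting between batches can itself act as an adversarial training signal.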

By determining whether a model gets better or not on a given benchmark, OpenAI selects models against those benchmarks, implicitly using them in training.