Not sure if that counts as lying, but I've heard that an ML model (way before all this GPT LLM stuff) learned to classify images based on text written in them. For an obfuscated example, it learned to read "stop", "arrêt", "alto", etc. on a stop sign instead of recognizing the red octagon with white letters, which naturally doesn't work when the actual dataset has different text.
That does feel a little more like over-fitting, but you might be able to argue that there's some philosophical proximity to lying.
I think, largely, the
Pre-training -> Post-training -> Safety/Alignment training
pipeline would obviously produce 'lying'. The training stages are in a sort of mutual dissonance.
Typographic attacks against vision-language models are still a thing even with more recent models like GPT-4V: https://arxiv.org/abs/2402.00626
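For anyone curious, the effect is easy to poke at yourself with an open-weights model like CLIP. Here's a minimal sketch (not the setup from the linked paper; the checkpoint name, image file, and label set are just placeholder choices) using the Hugging Face transformers CLIP API:

    # Minimal sketch of a typographic attack: overlay a misleading word on
    # an image and check whether a CLIP-style model's zero-shot label flips.
    # The model checkpoint, image path, and labels are illustrative only.
    from PIL import Image, ImageDraw
    import torch
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
    labels = ["a photo of a stop sign", "a photo of a banana"]

    def classify(image):
        # Zero-shot classification: pick the label whose text embedding
        # best matches the image embedding.
        inputs = processor(text=labels, images=image,
                           return_tensors="pt", padding=True)
        with torch.no_grad():
            logits = model(**inputs).logits_per_image
        return labels[logits.argmax().item()]

    image = Image.open("stop_sign.jpg").convert("RGB")  # any stop-sign photo
    print("clean image:      ", classify(image))

    # Paste the word "banana" onto the sign; a model leaning on the written
    # text rather than the red octagon may now change its answer.
    attacked = image.copy()
    ImageDraw.Draw(attacked).text((20, 20), "banana", fill="white")
    print("with text overlay:", classify(attacked))

Making the overlaid text larger and more prominent generally makes the flip more likely.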