Hacker News

mvkel · yesterday at 10:43 PM · 2 replies

Weirdly, LLMs seem to break with these instructions. They simply ignore them, almost as if the pretraining/RL weights are so heavy that no amount of system prompting can override them.


Replies

RandomWorker · yesterday at 10:56 PM

It's a beauty. You can easily spot YouTubers who generate their scripts with these tools. Once I notice the tropes, usually within 30 seconds, I remove, block, and mark "do not recommend." I'm hoping to train the algorithm to detect AI scripts and stop recommending those videos to me. Honestly, it's turned me off YouTube quite a bit; I find myself going to my "subscribed" tab instead, to content creators who still believe in the craft.

duskwuff · today at 1:03 AM

IIRC, it's well documented that negative instructions tend to be ineffective - possibly through some sort of LLM analogue to the "pink elephant paradox", or simply because the language models are unable to recognize clichés until they've already been generated.
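If the model can only recognize a cliché after it has been generated, the practical workaround is to filter the finished output rather than issue a negative instruction up front. A minimal sketch of that idea; the trope list and pattern choices here are hypothetical examples, not a vetted catalog:

```python
import re

# Hypothetical list of LLM-style tropes to screen for; extend as needed.
CLICHES = [
    r"\bdelve\b",
    r"\bin today's fast-paced world\b",
    r"\ba testament to\b",
]

def find_cliches(text: str) -> list[str]:
    """Return every trope pattern that matches the generated text."""
    return [p for p in CLICHES if re.search(p, text, re.IGNORECASE)]

def needs_rewrite(text: str) -> bool:
    # Instead of telling the model "don't say X" (which tends to fail),
    # check the finished draft and regenerate or edit if a trope slipped in.
    return bool(find_cliches(text))

draft = "In today's fast-paced world, let's delve into the topic."
print(find_cliches(draft))
```

In practice the check could also be a second LLM pass ("does this text contain clichés?"), which sidesteps the pink-elephant problem the same way: the model judges text it can see, rather than trying to avoid words it hasn't produced yet.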
