Hacker News

mannanj today at 3:31 AM

The article does mention this, and it also mentions a weakness of that approach.


Replies

crisnoble today at 3:45 AM

Perhaps they asked AI to summarize the article for them and it stopped after the first "disregard that" it read into its context window.

wbeckler today at 4:07 AM

The article didn't describe how the second AI is tuned to distrust input and scan it for "disregard that." Instead, it showed an architecture where a second AI accepts input from a naively implemented firewall AI that isn't scanning for "disregard that."
