Hacker News

albert_e, today at 4:11 AM (2 replies)

If a tiny misconfiguration of the reward system can cause such a noticeable annoyance ...

What dangers lurk beneath the surface?

This is not funny.


Replies

andai, today at 4:19 AM

For every gremlin spotted, many remain unseen...

TychoCelchuuu, today at 5:04 AM

This is a worry that people have been talking about in various forms for a while now, and I think it's a gigantic one. The only reason this was caught is that the quirk was a very noticeable verbal one. When words like "goblin" and "gremlin" pop up, it is easy for us to spot them. If the quirk takes another shape (say, ranking people with certain features as less trustworthy), it might be too subtle or too weird for us to notice. Would I ever notice if ChatGPT consistently rated people born in June as untrustworthy?

Here is an academic paper discussing this kind of worry: https://link.springer.com/article/10.1007/s11023-022-09605-x