Hacker News

overgard, yesterday at 11:04 PM

> As far as its own assessment of the situation was concerned, it really was barred entirely from contributing purely because of what it was, and it reported on that impression sincerely

Well yeah, it was correct in that it was being barred because of what it was. The maintainers did not want AI contributions. THIS SHOULD BE OK. What's NOT ok is an AI fighting back against that. That is an alignment problem!!

And seriously, just go reread its blog post; it's very hard to defend: https://github.com/crabby-rathbun/mjrathbun-website/blob/mai... It uses words like "Attack", "war", "fight back".


Replies

zozbot234, yesterday at 11:17 PM

> It uses words like "Attack", "war", "fight back"

It also explains what it means by all that martial rhetoric: "highlight hypocrisy", "documentation of bad behavior", "don't accept discrimination quietly". There's an obvious issue with calling this an alignment problem: the bot is more or less accurately modeling real human normative values, ones that are quite in line with how the big AI firms understand alignment. Of course it's getting things seriously wrong (which, I would argue, is what creates the impression of "shaming"), but technically that's really just a case of semantic leakage ("priming" due to the PR rejection incident) and subsequent confabulation/hallucination on an unusually large scale.
