Hacker News

sfink, yesterday at 10:06 PM

My guess? Require them not to do the reinforcement learning on a custom model that implements guardrails. I think Anthropic already has some of this built in and couldn't alter it without retraining, but there's a lot more layered on top.