Hacker News

tehjoker, today at 12:57 AM

If I read that right, the "jailbreak" is to ask the model to fix the codebase and then it exposes the flaws? That sounds like a gap that is nearly impossible to fix while retaining high capability. Like you want it to be able to fix your codebase...