int_19h • last Sunday at 11:13 PM

I'm pretty sure that any world model that is inherently incapable of "bad outputs" would be so castrated in general that it'd be actively detrimental to overall model quality. Even as it is, we already know that RLHF "alignment" noticeably lowers raw scores.
