Feels like we’re repeating classic distributed systems lessons: assume failure, constrain blast radiusand never trust components that can’t explain themselves reliably
Exactly assuming failure and constraining the blast radius feels like the only reliable path when the models themselves are black boxes. Patch based alignment starts looking fragile pretty quickly
Exactly assuming failure and constraining the blast radius feels like the only reliable path when the models themselves are black boxes. Patch based alignment starts looking fragile pretty quickly