This is a really good observation and honestly one of the hardest problems I've hit too.
Cog doesn't use confidence scores (yet — you're making me think about it), but the nightly pipeline is basically a proxy for the same thing. The /reflect pass runs twice a day and does consistency sweeps — it reads canonical files and checks that every referencing file still agrees. When facts drift (and they do, constantly), it catches and fixes them. The reinforcement signal is implicit: things that keep coming up in conversations get promoted to hot memory, things that go quiet eventually get archived to "glacier" (cold storage, still retrievable but not loaded by default).
The closest thing to your contradictions log is probably the observations layer — raw timestamped events that never get edited or deleted. Threads (synthesis files) get rewritten freely, but the observations underneath are append-only. So when the AI's understanding changes, the old observations are still there as a paper trail.
Where I think you're ahead is making confidence explicit. My system handles staleness through freshness (timestamps, "as of" dates on entities, pipeline frequency) but doesn't distinguish between "I'm very sure about this" and "I inferred this once." That's a real gap. Would love to see what you're building — is it public?
yep it's public: https://github.com/rodspeed/epistemic-memory
The observations layer being append-only is smart, thats basically the same instinct as the tensions log. The raw data stays honest even when the interpretation changes.
The freshness approach and explicit confidence scores probably complement each other more than they compete. Freshness tells you when something was last touched, confidence tells you how much weight it deserved in the first place. A belief you inferred once three months ago should decay differently than one you confirmed across 20 sessions three months ago. Both are stale by timestamp but they're not the same kind of stale.