First of all, I don't want to run anyone's code without a proper explanation, so help me understand this. Let's start with the verifier. The third-party verifier receives a bundle without knowing what it contains and without access to the tool used to measure, and just runs a single command on the bundle, which presumably contains both expected results and actual measurements, either of which can easily be tampered with. What problem does that solve?
Right question. Bundle alone proves nothing — you're correct.
Two things make it non-trivial to fake:
The pipeline is public. You can read scripts/steward_audit.py before running anything. It's not a black box.
For materials claims, the expected value isn't in the bundle. Young's modulus for aluminium is ~70 GPa. Not my number. Physics. The verifier checks against that, not against something I provided.
ML and pipelines: provenance only, no physical grounding. That limitation is stated in known_faults.yaml :: SCOPE_001.
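To make the materials case concrete, here is a minimal sketch of checking a measurement against a reference the verifier holds itself. It is an illustration only, not the code in scripts/steward_audit.py. The names `REFERENCE_GPA`, `check_claim`, the bundle keys, and the 5% tolerance are all hypothetical.

```python
# Hypothetical sketch: the reference value lives on the verifier's side,
# so nothing the bundle says about "expected" can move the goalposts.
REFERENCE_GPA = {
    "youngs_modulus_aluminium": 70.0,  # approximate handbook value, GPa
}


def check_claim(bundle: dict, rel_tol: float = 0.05) -> bool:
    """Return True if the measured value is within rel_tol of the reference.

    Any 'expected' field shipped inside the bundle is deliberately ignored.
    """
    reference = REFERENCE_GPA[bundle["quantity"]]
    measured = bundle["measured_gpa"]
    return abs(measured - reference) <= rel_tol * reference


# A plausible bundle, and one where both numbers were edited to agree.
plausible = {"quantity": "youngs_modulus_aluminium",
             "measured_gpa": 69.2, "expected_gpa": 70.0}
tampered = {"quantity": "youngs_modulus_aluminium",
            "measured_gpa": 120.0, "expected_gpa": 120.0}

print(check_claim(plausible))  # True
print(check_claim(tampered))   # False
```

The sketch shows only that editing the expected value inside the bundle has no effect on the outcome. It says nothing about whether the measurement itself was really taken.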