anky8998 • last Wednesday at 4:52 AM

Thanks for sharing this — really solid write-up, and I agree with the core premise. Pickle is a huge blind spot in ML security, and most folks don’t realize that torch.load() is effectively executing attacker-controlled bytecode.

One thing we ran into while working on similar problems is that static opcode scanning alone tends to give a false sense of coverage. A lot of real-world bypasses don’t rely on obvious GLOBAL os.system patterns and can evade tools that depend on pickletools, modelscan, or fickling.
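To make that gap concrete, here is a minimal sketch of a deliberately naive denylist scanner built on `pickletools.genops`. It is not how modelscan or fickling work internally; `DENYLIST`, `naive_scan`, and both hand-assembled payloads are made up for illustration. Both payloads would call `builtins.eval("1+1")` if loaded, but the protocol 4 one pushes the module and attribute as two plain strings and resolves them with `STACK_GLOBAL`, so a check keyed on the `GLOBAL` opcode never sees it. Nothing is unpickled here.

```python
import pickletools

# Hypothetical denylist, written in the "module name" form pickletools
# reports for the GLOBAL opcode.
DENYLIST = {"builtins eval", "builtins exec", "os system", "posix system"}

# Hand-assembled, illustrative payloads. Both would evaluate "1+1" on load.
# Protocol 3 style: a single GLOBAL opcode carries "builtins eval".
GLOBAL_PAYLOAD = b"\x80\x03cbuiltins\neval\nX\x03\x00\x00\x001+1\x85R."
# Protocol 4 style: two string pushes followed by STACK_GLOBAL.
STACK_GLOBAL_PAYLOAD = b"\x80\x04\x8c\x08builtins\x8c\x04eval\x93\x8c\x031+1\x85R."


def naive_scan(data: bytes) -> list:
    """Flag only GLOBAL opcodes whose argument is on the denylist."""
    hits = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL" and arg in DENYLIST:
            hits.append(arg)
    return hits


def opcode_names(data: bytes) -> list:
    """List opcode names without executing the pickle."""
    return [opcode.name for opcode, _arg, _pos in pickletools.genops(data)]


if __name__ == "__main__":
    print(naive_scan(GLOBAL_PAYLOAD))
    print(naive_scan(STACK_GLOBAL_PAYLOAD))
    print(opcode_names(STACK_GLOBAL_PAYLOAD))
```

The first scan reports the import and the second comes back empty, even though the opcode listing shows the same callable being assembled on the stack.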

We recently open-sourced a structure-aware pickle fuzzer at Cisco that’s designed specifically to test the robustness of pickle scanners, not just scan models:

• Executes pickle bytecode inside a custom VM, tracking opcode execution, stack state, and memo behavior
• Mutates opcode sequences, stack interactions, and protocol-specific edge cases
• Has already uncovered multiple scanner bypasses that look benign statically but behave differently at runtime
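As a rough illustration of why tracking stack and memo state matters, here is a toy symbolic tracer. It is a sketch under heavy assumptions, far smaller than a real VM and not taken from the pickle-fuzzer repo: `trace_imports` and `MEMO_PAYLOAD` are hypothetical names, and only a handful of opcodes are modeled. The hand-assembled payload stashes "eval" in the memo, pops it, and fetches it back with `BINGET` right before `STACK_GLOBAL`, so the two strings never sit next to each other as plain pushes.

```python
import pickletools

# Hand-assembled, illustrative payload: would call builtins.eval("1+1") on load.
MEMO_PAYLOAD = b"\x80\x04\x8c\x04eval\x940\x8c\x08builtinsh\x00\x93\x8c\x031+1\x85R."

STRING_PUSHES = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}
MEMO_PUTS = {"BINPUT", "LONG_BINPUT", "PUT"}
MEMO_GETS = {"BINGET", "LONG_BINGET", "GET"}


def trace_imports(data: bytes) -> list:
    """Symbolically run a small opcode subset and return resolved imports.

    Nothing is imported or called; callables are represented by marker strings.
    Unsupported opcodes raise instead of being silently skipped.
    """
    stack, memo, imports = [], {}, []
    for opcode, arg, _pos in pickletools.genops(data):
        name = opcode.name
        if name in ("PROTO", "FRAME"):
            continue
        elif name in STRING_PUSHES:
            stack.append(arg)
        elif name == "MEMOIZE":
            memo[len(memo)] = stack[-1]  # same indexing rule as the real unpickler
        elif name in MEMO_PUTS:
            memo[arg] = stack[-1]
        elif name in MEMO_GETS:
            stack.append(memo[arg])
        elif name == "POP":
            stack.pop()
        elif name == "GLOBAL":
            module, attr = arg.split(" ", 1)
            imports.append(f"{module}.{attr}")
            stack.append(f"<global {module}.{attr}>")
        elif name == "STACK_GLOBAL":
            attr = stack.pop()    # attribute name is on top
            module = stack.pop()  # module name is underneath
            imports.append(f"{module}.{attr}")
            stack.append(f"<global {module}.{attr}>")
        elif name == "TUPLE1":
            stack.append((stack.pop(),))
        elif name == "REDUCE":
            stack.pop()           # argument tuple
            func = stack.pop()    # callable marker
            stack.append(f"<result of {func}>")
        elif name == "STOP":
            break
        else:
            raise NotImplementedError(f"opcode {name} is not modeled in this sketch")
    return imports


if __name__ == "__main__":
    print(trace_imports(MEMO_PAYLOAD))
```

Raising on unmodeled opcodes is a deliberate choice in this sketch: silently skipping them would desynchronize the simulated stack from what the real unpickler would see.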

Repo: https://github.com/cisco-ai-defense/pickle-fuzzer

We also wrote up some of the lessons learned while hardening pickle scanners here (including why certain opcode patterns are tricky to reason about statically): https://blogs.cisco.com/ai/hardening-pickle-file-scanners

I think tools like AIsbom are a great step forward, especially for SBOM and ecosystem visibility. From our experience, pairing static analysis + fuzzing-driven adversarial testing is where things get much more resilient over time.

Replies

lab700xdev • yesterday at 3:25 AM

This is incredibly valuable feedback. I’ve been reading through the pickle-fuzzer repo this morning, specifically the parts about stack manipulation bypassing static heuristics. You nailed the trade-off: AIsbom is designed for the "90% hygiene" case in a fast CI/CD pipeline, where spinning up a VM or fuzzer might be too heavy and slow for every commit. We aim to catch the low-hanging fruit (obvious RCE) and generate the inventory (SBOM) rapidly. That said, moving toward an "Allowlist Only" (Strict Mode) approach seems like the better way to make static analysis resilient against the obfuscation you mentioned. We are prioritizing that for an upcoming release. Would love to reference your fuzzer in our docs as the "Deep Scan" alternative!
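For anyone wondering what allowlist-only looks like in practice, here is a minimal sketch of the load-time version, using the `find_class` override described in the Python `pickle` docs. To be clear, this is not AIsbom's Strict Mode (which would be a static check); `ALLOWED_GLOBALS`, `AllowlistUnpickler`, and `strict_loads` are hypothetical names, and the allowlist contents are placeholders.

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical allowlist: only these (module, name) pairs may be imported.
ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"),
}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global not explicitly allowlisted."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(
            f"global '{module}.{name}' is not on the allowlist"
        )


def strict_loads(data: bytes):
    """Load a pickle, failing closed on any unexpected import."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


if __name__ == "__main__":
    ok = pickle.dumps(OrderedDict(a=1))
    print(strict_loads(ok))

    # Hand-assembled, illustrative payload that tries to reach builtins.eval.
    bad = b"\x80\x04\x8c\x08builtins\x8c\x04eval\x93\x8c\x031+1\x85R."
    try:
        strict_loads(bad)
    except pickle.UnpicklingError as exc:
        print("blocked:", exc)
```

The appeal of this shape is that it fails closed: the lookup is rejected in `find_class` before the callable is ever resolved, no matter how the module and attribute strings were assembled on the stack.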
