When dealing with stuff like php serialization and pickle, the rule is simple: never unpickle anything you didn't pickle yourself. If anything else could possibly touch the serialized bytes, sign them with an HMAC, verify the signature before unpickling, and keep the key somewhere untouchable.
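A minimal sketch of that sign-then-verify pattern, using only the standard library. The key value and the helper names `dumps_signed` / `loads_signed` are made up for illustration; a real key would come from a secret store, not from source code.

```python
import hashlib
import hmac
import pickle

# Hypothetical key for illustration only; load a real one from a secret store.
SECRET_KEY = b"replace-with-a-real-secret"
TAG_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def dumps_signed(obj, key=SECRET_KEY):
    """Pickle obj and prepend an HMAC-SHA256 tag over the pickled bytes."""
    payload = pickle.dumps(obj)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def loads_signed(blob, key=SECRET_KEY):
    """Verify the tag first; only unpickle if it matches."""
    tag, payload = blob[:TAG_SIZE], blob[TAG_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking where the mismatch occurs via timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch; refusing to unpickle")
    return pickle.loads(payload)
```

This only proves the bytes came from someone holding the key. It does nothing for data pickled by a third party, which is the case the rest of the thread is about.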
I somehow doubt this tool is going to be able to pull off what Java bytecode verification could not.
The Golden Rule holds: "Don't unpickle untrusted data." The problem I'm trying to solve is that "untrusted" has become blurry in the AI age. Data scientists treat model hubs (like Hugging Face) as trusted repositories, similar to PyPI or npm. They shouldn't, but they do. This tool effectively serves as a "Loud Warning Label" to break that assumption. It tells the engineer: "Hey, you think this is just weights, but I see socket calls in here. Do not load this."
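For anyone wondering what a "warning label" check could look like, here is a rough sketch, not the tool's actual implementation. It walks the opcode stream with `pickletools.genops`, which parses without executing anything, and reports imports from a denylist. The denylist and the function name `find_suspicious_imports` are assumptions for illustration, and the `STACK_GLOBAL` handling is a heuristic that a deliberately obfuscated pickle can evade.

```python
import pickle
import pickletools

# Assumption: an illustrative denylist, neither complete nor vetted.
SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess", "socket", "sys", "builtins"}


def find_suspicious_imports(data: bytes):
    """Return (module, name) pairs the pickle would import from denylisted modules."""
    hits = []
    recent_strings = []  # string operands seen so far, used for STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # pickletools reports the operand as "module name".
            module, name = arg.split(" ", 1)
        elif opcode.name == "STACK_GLOBAL":
            # Heuristic: assume the two most recent string pushes are
            # the module and the attribute name.
            if len(recent_strings) < 2:
                continue
            module, name = recent_strings[-2], recent_strings[-1]
        else:
            if isinstance(arg, str):
                recent_strings.append(arg)
            continue
        if module.split(".")[0] in SUSPICIOUS_MODULES:
            hits.append((module, name))
    return hits


# Hand-written protocol-0 pickle that would call os.system("echo hi") if loaded.
# It is only scanned here, never passed to pickle.loads.
malicious = b"cos\nsystem\n(S'echo hi'\ntR."
benign = pickle.dumps({"weights": [0.1, 0.2]})

print(find_suspicious_imports(malicious))
print(find_suspicious_imports(benign))
```

A clean result from a scan like this is not evidence of safety; it only means nothing on the list was spotted.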
> When dealing with stuff like php serialization and pickle, the rule is simple: never unpickle anything you didn't pickle yourself.
I thought the rule was: never use pickle. It makes no sense when other serialization formats exist and are just as easy to use.