This is a really interesting take on the sandboxing problem. It reminds me of an experiment I worked on a while back (https://github.com/imfing/jsrun), which embedded V8 into Python to run JavaScript with tightly controlled access to the host environment. The goal was similar: running untrusted code from a Python host.
I’m especially curious about where the Pydantic team wants to take Monty. The minimal-interpreter approach feels like a good starting point for AI workloads, but the long tail of Python semantics is brutal. There is a real trade-off between keeping the surface area small (for security and predictability) and supporting enough of the language to handle the non-trivial snippets LLMs generate for complex tasks.
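Even a fairly tame snippet of the kind an LLM tends to emit already leans on dataclasses, stdlib modules, and comprehensions. Something like this (purely illustrative):

    from dataclasses import dataclass
    from datetime import date
    import json

    @dataclass
    class Order:
        id: int
        total: float
        placed: date

    # the sort of glue code a model writes to answer "what was revenue in May?"
    raw = '[{"id": 1, "total": 19.5, "placed": "2024-05-01"}]'
    orders = [Order(o["id"], o["total"], date.fromisoformat(o["placed"]))
              for o in json.loads(raw)]
    print({"count": len(orders), "revenue": sum(o.total for o in orders)})

Supporting enough of that without dragging in the whole CPython surface is where it gets hard.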
there’s no way around VMs for secure, untrusted workloads. everything else, like Monty, has too many tradeoffs that make it non-viable for real workloads
disclaimer: i work at E2B, opinions my own
Can't be sure where this will end up, but the primary goal is to enable codemode/programmatic tool calling, using the external function call mechanism for anything more complicated.
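Roughly the shape of the idea, as a sketch only (run_sandboxed and fetch_user are made-up names here, not Monty's actual API): the model's snippet runs against a tiny namespace, and anything that needs real I/O goes through functions the host explicitly registers.

    def fetch_user(user_id: int) -> dict:
        # runs on the host, outside the sandbox; could hit a real API or DB
        return {"id": user_id, "name": "Ada", "plan": "pro"}

    EXTERNALS = {"fetch_user": fetch_user}

    # code the model generated: it only sees the registered externals,
    # no imports and no ambient builtins
    llm_snippet = (
        "user = fetch_user(42)\n"
        "summary = user['name'] + ' is on the ' + user['plan'] + ' plan'\n"
    )

    def run_sandboxed(source: str, externals: dict) -> dict:
        # stand-in for the minimal interpreter, just to show the contract
        scope = {"__builtins__": {}, **externals}
        exec(source, scope)
        return {k: v for k, v in scope.items()
                if k != "__builtins__" and k not in externals}

    print(run_sandboxed(llm_snippet, EXTERNALS))
    # {'user': {'id': 42, 'name': 'Ada', 'plan': 'pro'}, 'summary': 'Ada is on the pro plan'}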
In the near term I think we'll add support for classes, dataclasses, datetime, and json; that should be enough for many use cases.