I’m one of the main devs of GitHub MCP (opinions my own) and I’ve really enjoyed your talks on the subject. I hope we can chat in person some time.
I am personally very happy for our GH MCP Server to be your example. The conversations you are inspiring are extremely important. Given that the GH MCP server can trivially be locked down to mitigate the risks of the lethal trifecta, I also hope people realise that and don’t conclude they cannot use it safely.
“Unless you can prove otherwise” is definitely the load-bearing phrase above.
I will say The Lethal Trifecta is a very catchy name, but it also directly overlaps with a trifecta of utility: as with all security/privacy trade-offs, you can’t simply exclude any of the three without negatively impacting utility. Awareness of the risks is incredibly important, but not everyone should or would choose complete caution. An example: working on a private codebase and wanting GH MCP to search for an issue in a library you use that has a bug. You risk prompt injection by doing so, but without it your agent cannot easily complete your task (short of manual intervention). It’s not clear to me that all users should choose the manual step to avoid the potential risk; I expect the specific user context matters a lot here.
A user’s comfort level must depend on the level of autonomy/oversight of the agentic tool in question, as well as their personal risk profile, etc.
Here are two contrasting uses of GH MCP with wildly different risk profiles:
- GitHub Coding Agent has high autonomy (although good oversight) and it natively uses the GH MCP in read-only mode, with an individual repo-scoped token and additional mitigations. The risks would be too high otherwise, and finding out about a problem only after the fact is too late, so it is extremely locked down by default.
- In contrast, if you install the GH MCP into Copilot agent mode in VS Code with default settings, you are technically vulnerable to the lethal trifecta, as you mention, but the user can scrutinise what is happening effectively in real time, with a human in the loop on every write action by default, etc.
I know I personally feel comfortable using a less restrictive token in the VS Code context, simply inspecting tool call payloads etc. and keeping the human-in-the-loop setting on.
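One cheap habit that helps with the "less restrictive token" case is checking exactly what a token can do before wiring it into an agent. Here's a minimal sketch, nothing to do with the GH MCP server itself: it assumes a classic PAT, whose scopes GitHub reports in the `X-OAuth-Scopes` response header (fine-grained tokens don't expose scopes this way), and the `WRITE_SCOPES` list is purely illustrative.

```python
# Minimal sketch (not part of the GH MCP server): check what a classic GitHub
# PAT is allowed to do before handing it to an agent. GitHub reports a classic
# token's scopes in the X-OAuth-Scopes response header; fine-grained tokens
# don't expose scopes this way, so treat this as illustrative only.
import os
import urllib.request

def classic_pat_scopes(token: str) -> set[str]:
    req = urllib.request.Request(
        "https://api.github.com/",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        scopes = resp.headers.get("X-OAuth-Scopes", "") or ""
    return {s.strip() for s in scopes.split(",") if s.strip()}

# Illustrative list of scopes I would not want on a token I hand to an agent.
WRITE_SCOPES = {"repo", "workflow", "write:org", "admin:org", "delete_repo"}

scopes = classic_pat_scopes(os.environ["GITHUB_TOKEN"])
print("Token scopes:", scopes or "(none reported)")
if scopes & WRITE_SCOPES:
    print("Warning: write-capable scopes present:", scopes & WRITE_SCOPES)
```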
Users running in full YOLO mode or fully autonomous contexts should definitely heed your words and lock it down.
As it happens, I am also working (at a variety of levels in the agent/MCP stack) on some mitigations for data privacy, token scanning, etc., because we clearly all need to do better, while at the same time trying to preserve more utility than complete avoidance of the lethal trifecta can achieve.
Anyway, as I said above I found your talks super interesting and insightful and I am still reflecting on what this means for MCP.
Thank you!
I've been thinking a lot about this recently. I've started running Claude Code, GitHub Copilot Agent and Codex CLI in YOLO mode (no approvals needed) because wow, it's so much more productive, but I'm very aware that doing so opens me up to very real prompt injection risks.
So I've been trying to figure out the best shape for running that. I think it comes down to running in a fresh container with source code that I don't mind being stolen (easy for me, most of my stuff is open source) and being very careful about exposing secrets to it.
I'm comfortable sharing a secret with a spending limit: an OpenAI token that can only spend up to $25 is something I'm willing to risk exposing to an unsecured coding agent.
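To make the shape concrete, here's a rough sketch rather than my exact setup; the image name and paths are made up. The property that matters is that the container only sees a throwaway checkout and that one spend-capped key, and nothing from my real home directory:

```python
# Disposable sandbox sketch: only a scratch clone of the repo is mounted and
# the only secret passed in is a spend-capped OpenAI key.
import os
import subprocess

scratch = "/tmp/agent-workspace"                 # throwaway clone of the repo
capped_key = os.environ["CAPPED_OPENAI_KEY"]     # key with a low spending limit

subprocess.run([
    "docker", "run", "--rm", "-it",
    "-v", f"{scratch}:/workspace",               # nothing else from the host is visible
    "-e", f"OPENAI_API_KEY={capped_key}",        # no other credentials go in
    "-w", "/workspace",
    "coding-agent-sandbox:latest",               # hypothetical image with the agent preinstalled
], check=True)
```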
Likewise, for Fly.io experiments I created a dedicated scratchpad “Organization” with a spending limit: that way I can have Claude Code fire up Fly Machines to test out different configuration ideas without any risk of it running up a big bill or damaging my production infrastructure.
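For flavour, firing up a throwaway machine inside that scratchpad org via the Fly Machines API looks roughly like this. The app name is made up, the request body is a minimal sketch from memory rather than my exact setup, and FLY_API_TOKEN would be a token scoped to that org only:

```python
# Rough sketch: create a throwaway Fly Machine in the scratchpad org via the
# Machines API. Assumes the app ("scratchpad-experiments") already exists
# there and FLY_API_TOKEN is scoped to it.
import json
import os
import urllib.request

payload = {
    "config": {
        "image": "nginx:latest",  # whatever configuration idea is being tested
        "guest": {"cpu_kind": "shared", "cpus": 1, "memory_mb": 256},
    }
}
req = urllib.request.Request(
    "https://api.machines.dev/v1/apps/scratchpad-experiments/machines",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {os.environ['FLY_API_TOKEN']}",
        "Content-Type": "application/json",
    },
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["id"])  # id of the freshly created machine
```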
The moment code theft genuinely matters, things get a lot harder. OpenAI's hosted Codex product has a way to lock down internet access to a specific allowlist of domains to help avoid exfiltration, which is sensible but still somewhat risky (thanks to open proxy risks etc).
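That lockdown is essentially an egress allowlist, something like this toy sketch (the hostnames are just examples). It also shows why the caveat matters: the check only controls which hosts are reachable, so if any allowed host will relay or publish attacker-readable data, exfiltration is still possible.

```python
# Toy sketch of an egress allowlist: only exact, pre-approved hostnames get
# through. It says nothing about what those hosts do with the traffic, which
# is where the open proxy / user-content caveat comes in.
from urllib.parse import urlparse

ALLOWED_HOSTS = {"api.github.com", "pypi.org", "files.pythonhosted.org"}  # example list

def egress_allowed(url: str) -> bool:
    host = (urlparse(url).hostname or "").lower()
    return host in ALLOWED_HOSTS  # exact match only, no wildcard subdomains

print(egress_allowed("https://pypi.org/simple/requests/"))        # True
print(egress_allowed("https://attacker.example/exfil?data=..."))  # False
```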
I'm taking the position that malicious tokens can drive the coding agent to do anything, and then asking: what's an environment I can run it in where the damage is low enough that I don't mind the risk?