I built Understudy because a lot of real work still spans native desktop apps, browser tabs, terminals, and chat tools. Most current agents live in only one of those surfaces.
Understudy is a local-first desktop agent runtime that can operate GUI apps, browsers, shell tools, files, and messaging in one session. The part I'm most interested in feedback on is teach-by-demonstration: you do a task once, the agent records screen video + semantic events, extracts the intent rather than coordinates, and turns it into a reusable skill.
Demo video: https://www.youtube.com/watch?v=3d5cRGnlb_0
In the demo I teach it: Google Image search -> download a photo -> remove background in Pixelmator Pro -> export -> send via Telegram. Then I ask it to do the same for Elon Musk. The replay isn't a brittle macro: the published skill stores intent steps, route options, and GUI hints only as a fallback. In this example it can also prefer faster routes when they are available instead of repeating every GUI step.
Current state: macOS only. Layers 1-2 are working today; Layers 3-4 are partial and still early.
npm install -g @understudy-ai/understudy
understudy wizard
GitHub: https://github.com/understudy-ai/understudyHappy to answer questions about the architecture, teach-by-demonstration, or the limits of the current implementation.
One more tool targeting OSX only. That platform is overserved with desktop agents already while others are underserved, especially Linux.
cool idea. good idea doing a demo as well.
[dead]
[dead]
2026 and we still pretend to not understand how llms work huh
The demonstration-based approach is interesting for the handoff problem. The hardest part of agentic automation isnt the first run -- its making the agent robust to the cases the demonstrator never showed it. How do you handle edge cases or failures mid-task? Does it fall back to asking the user, or does it have some recovery heuristic? Asking because we found that the failure mode surface matters more than happy-path coverage when you actually deploy these in production.