Isn't "computer use" just interaction with a shell-like environment, which is routine for current agents?
> Almost every organization has software it can’t easily automate: specialized systems and tools built before modern interfaces like APIs existed. [...]
> hundreds of tasks across real software (Chrome, LibreOffice, VS Code, and more) running on a simulated computer. There are no special APIs or purpose-built connectors; the model sees the computer and interacts with it in much the same way a person would: clicking a (virtual) mouse and typing on a (virtual) keyboard.
Interesting question! In this context, "computer use" means the model is manipulating a full graphical interface, using a virtual mouse and keyboard to interact with applications (like Chrome or LibreOffice), rather than simply operating in a shell environment.
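For a rough sense of what that looks like mechanically, here's a minimal sketch using the pyautogui library. This is purely illustrative (the coordinates and text are invented, and Anthropic hasn't published its action layer); it just shows the "virtual mouse and keyboard" primitives being discussed:

```python
# Illustrative GUI-level actions of the kind described above.
# Coordinates and text are made up for the example.
import pyautogui

frame = pyautogui.screenshot()   # capture what's on the (virtual) display
pyautogui.click(x=412, y=230)    # click a button the model located in the frame
pyautogui.write("quarterly report", interval=0.05)  # type like a person would
pyautogui.press("enter")
```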
No, their definition of "computer use" now means:
> where the model interacts with the GUI (graphical user interface) directly.
This is being downvoted but it shouldn't be.
If the ultimate goal is having an LLM control a computer, round-tripping through a UX designed for bipedal bags of meat with weird jelly-filled optical sensors is wildly inefficient.
Just stay in the computer! You're already there! Vision-driven computer use is a dead end.
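To make the efficiency argument concrete: LibreOffice already ships a real headless CLI, so a task like converting a document to PDF is one shell call instead of a screenshot-click-type loop (the filename here is hypothetical):

```python
# The "just stay in the computer" path: drive LibreOffice through its
# headless interface rather than its GUI. No vision, no mouse.
import subprocess

subprocess.run(
    ["soffice", "--headless", "--convert-to", "pdf", "report.odt"],
    check=True,
)
```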
No.
Computer use (to Anthropic, as in the article) means an LLM watching a video feed of the display and controlling the machine with the mouse and keyboard.
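In other words, the loop looks roughly like the sketch below. The propose_action() function is a hypothetical stand-in for the model call, not Anthropic's actual API; the point is that both observation (screenshots) and control (mouse/keyboard) happen at the GUI layer:

```python
# Hedged sketch of the observe/act loop described above.
import pyautogui

def propose_action(frame):
    """Hypothetical: send the frame to the model and get the next action
    back as a dict like {"type": "click", "x": ..., "y": ...}, or None
    when the model judges the task complete."""
    return None  # placeholder so the sketch runs and terminates

while True:
    frame = pyautogui.screenshot()      # the "video feed of the display"
    action = propose_action(frame)      # model decides the next step
    if action is None:
        break
    if action["type"] == "click":
        pyautogui.click(action["x"], action["y"])
    elif action["type"] == "type":
        pyautogui.write(action["text"])
```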