Hacker News

password-app, today at 12:27 AM

Impressive image quality improvements. Meanwhile, AI agents just crossed a milestone: Simular's Agent S hit 72.6% on OSWorld, just above the 72.36% human-level score.

We're seeing AI get better at both creative tasks (images) and operational tasks (clicking through websites).

For anyone building AI agents: the security model is still the hard part. Prompt injection remains unsolved even with dedicated security LLMs.