This isn't best practice. It's certainly not industry best practice. It would fail some pretty basic tests, like these, resulting in poor UX and poor reviews. There’s plenty of half-assed things labelled agent that do so, of course.
I think it describes generally how we can picture Claude and OpenAI working, but neglects further implementation details that are hard to see from their blog posts, ex. a web search vs. a web get tool.
(source: maintained a multi-provider x llama.cpp LLM client for 2.5+ years and counting)
Yeah, my colleague and I have been seeing in testing how much this is actually a problem in practice. It has been - surprising, and a little dismaying - how much this negatively impacts content retrieval and results in poor UX.