Hacker News

refulgentis today at 12:36 AM

This isn't best practice. It's certainly not industry best practice. It would fail some pretty basic tests, like these, resulting in poor UX and poor reviews. There are plenty of half-assed things labelled "agent" that do so, of course.

I think it describes, in general terms, how we can picture Claude and OpenAI working, but it neglects further implementation details that are hard to see from their blog posts, e.g. a web search tool vs. a web get tool.

(source: maintained a multi-provider x llama.cpp LLM client for 2.5+ years and counting)


Replies

dacharyc today at 1:11 AM

Yeah, my colleague and I have been seeing in testing how much this is actually a problem in practice. It has been surprising, and a little dismaying, how much this negatively impacts content retrieval and results in poor UX.