
tough • last Wednesday at 2:57 PM • 0 replies • view on HN

I've seen Claude 4 do this too when its context already has lots of tests and tool calls in it

imho the main issue is that an LLM has no real sense of what’s a real tool call vs. just a log of one. The text is virtually identical, so the LLM starts predicting the logs too instead of actually calling the tool to run the tests
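
Rough sketch of what I mean, in Python. Everything here (`Harness`, `run_tool`, `is_real`, `render`, the `call_N` ids) is made up for illustration and isn't any real agent framework's API. The assumption is that the harness keeps its own record of what it actually executed, since the rendered text alone can't tell you.

```python
from dataclasses import dataclass, field


@dataclass
class Harness:
    """Hypothetical agent harness that records the tool calls it really ran."""
    executed: dict = field(default_factory=dict)  # call_id -> output
    counter: int = 0

    def run_tool(self, name: str, args: str) -> str:
        # A real call: the harness executes it and stores the result under a fresh id.
        self.counter += 1
        call_id = f"call_{self.counter}"
        self.executed[call_id] = f"ran {name}({args})"  # stand-in for running the tests
        return call_id

    def is_real(self, call_id: str) -> bool:
        # Text that merely looks like a tool log has no id issued by the harness.
        return call_id in self.executed


def render(name: str, args: str, output: str) -> str:
    # How a call ends up looking once it is flattened into the context as text.
    return f"$ {name} {args}\n{output}"


harness = Harness()
real_id = harness.run_tool("pytest", "-q")
real_log = render("pytest", "-q", harness.executed[real_id])

# What the model can do instead: just write the same text without running anything.
predicted_log = render("pytest", "-q", "ran pytest(-q)")

print(real_log == predicted_log)      # the two logs are the same string
print(harness.is_real(real_id))       # the harness knows this one happened
print(harness.is_real("call_99"))     # an id the harness never issued
```

So from the text side the two are indistinguishable, and the only thing that can tell them apart is bookkeeping that lives outside the context.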

it's kinda funny
