I literally just went through this yesterday. Had a few failing tests in an unfamiliar domain. Took a cursory look, couldn't figure it out. Pasted the error messages into Claude to see if it could speed things up for me. We went back and forth for a while, pulling on different threads and trying various things. In the end, it gave up and essentially relaxed the tests' assertions to make them pass.
I wasn't happy with that outcome, so I decided to invest some time in debugging through the test myself, tracing the flow of data and inspecting the state of the stack frames, and finally figured out what was wrong -- the fix was so simple and so obvious that had I just put in the effort up front, it would have saved me both time and tokens.
It's a valuable lesson to take to heart. I think it's better to start by tinkering and trying things yourself, then turn to AI, than to go straight to AI and only afterward give it a go independently.