Hacker News

gopher_space, yesterday at 11:41 PM

This is software development, not sales. We rely on our tooling.

If I’m using a calculator to verify my math, I don’t want to use a second calculator to verify the first one.


Replies

stale2002, today at 3:19 AM

I am sorry to be the one to tell you, but it was already the case that you cannot trust LLMs to solve all your problems 100% of the time.

It was always random. This is no different from any other randomness that already exists in LLMs.

If you are concerned, just run benchmarks and see whether it is valuable for your use case regardless.
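
A rough sketch of what such a benchmark could look like: run the same task many times and measure how often the result passes your own check. The names `pass_rate` and `make_flaky_tool` are made up for illustration, and the "tool" here is a seeded random stub standing in for a real LLM call plus an answer check, not any actual API.

```python
import random
from typing import Callable


def pass_rate(task: Callable[[], bool], trials: int = 100) -> float:
    """Run a nondeterministic task repeatedly and return the fraction of passes."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    passes = sum(1 for _ in range(trials) if task())
    return passes / trials


def make_flaky_tool(success_probability: float, seed: int = 0) -> Callable[[], bool]:
    """Hypothetical stand-in for 'call the LLM, then check its answer'.

    Each call returns True with roughly the given probability.
    """
    rng = random.Random(seed)
    return lambda: rng.random() < success_probability


if __name__ == "__main__":
    # Assumption: swap the stub for a function that calls your real tool
    # and returns True only when the output is acceptable for your use case.
    tool = make_flaky_tool(0.9)
    print(f"pass rate: {pass_rate(tool, trials=1000):.2%}")
```

Whether the measured rate is good enough is a judgment call that depends on how costly a wrong answer is for your use case.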