Hacker News

hoosieree 10/11/2024

My classifier is not very accurate:

    is_trick(question)  # 50% accurate

To make the client happy, I improved it:

    is_trick(question, label)  # 100% accurate

But the client still isn't happy, because if they already knew the label they wouldn't need the classifier!
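
A minimal Python sketch of the same joke, in case it helps. Everything here is made up for illustration: the example questions, the names `is_trick_blind` and `is_trick_leaky`, and the coin flip standing in for the 50% classifier.

```python
import random

# Hypothetical labeled examples: (question, is_it_a_trick_question).
EXAMPLES = [
    ("What weighs more, a pound of feathers or a pound of lead?", True),
    ("What is 2 + 2?", False),
    ("How many months have 28 days?", True),
    ("What is the capital of France?", False),
]

_rng = random.Random(0)  # fixed seed so runs are repeatable


def is_trick_blind(question):
    """Stand-in for a classifier with no real signal: a coin flip."""
    return _rng.random() < 0.5


def is_trick_leaky(question, label):
    """The 'improved' version: it just echoes the label it was given."""
    return label


def accuracy(predict, examples):
    """Fraction of examples where predict(question, label) matches label."""
    hits = sum(predict(q, y) == y for q, y in examples)
    return hits / len(examples)


blind = accuracy(lambda q, y: is_trick_blind(q), EXAMPLES)
leaky = accuracy(is_trick_leaky, EXAMPLES)
print(f"blind: {blind:.0%}, leaky: {leaky:.0%}")
```

The leaky version scores 100% on any dataset, but only because the answer is part of its input, so the score says nothing about what it can do without the label.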

...

If ChatGPT had "sense", your extra prompt would change nothing. The fact that adding the prompt changes the output should be a clue that nobody should ever trust an LLM anywhere correctness matters.

[edit]

I also tried the original question but followed up with "is it possible that the doctor is the boy's father?"

ChatGPT said:

Yes, it's possible for the doctor to be the boy's father if there's a scenario where the boy has two fathers, such as being raised by a same-sex couple or having a biological father and a stepfather. The riddle primarily highlights the assumption about gender roles, but there are certainly other family dynamics that could make the statement true.


Replies

PoignardAzur 10/11/2024

It's not like GP gave task-specific advice in their example. They just said "think carefully about this".

If that's all it takes, then maybe the problem isn't a lack of capabilities but a tendency to not surface them.
