
root_axis • 05/15/2025 • 2 replies

It could be trained to say that, but it's not exactly clear how you would reinforce the absence of certain training data in order to emit that response accurately, rather than just based on embedding proximity.

Replies

simianwords • 05/15/2025

Why does it seem so hard to make training data for this? You can cook up a few thousand training examples and do RLHF.


jsnider3 • 05/15/2025

Seems easy. Have a set of vague requests and train it to ask for clarification instead of guessing.
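
A rough sketch of what that dataset could look like, as preference pairs in which the clarifying question is the preferred response and a confident guess is the rejected one. Everything here is an assumption for illustration: the example requests, the `prompt`/`chosen`/`rejected` field names, and the `build_preference_pairs` helper are hypothetical and not tied to any particular RLHF library.

```python
import json
import random

# Hypothetical seed data: a vague request paired with the clarifying
# question we would like the model to prefer.
VAGUE_REQUESTS = [
    ("Fix my code.", "Which snippet should I look at, and what error do you see?"),
    ("Make it faster.", "What are you trying to speed up, and how do you measure it?"),
    ("Write the report.", "What is the report about, and who is the audience?"),
]

# Hypothetical "guessing" replies used as the rejected side of each pair.
GUESSES = [
    "Sure, here is the fixed version.",
    "Done. I rewrote everything for speed.",
    "Here is a complete report.",
]


def build_preference_pairs(requests, guesses, seed=0):
    """Pair each vague prompt with a preferred clarification and a rejected guess."""
    rng = random.Random(seed)  # seeded so the sampled guesses are reproducible
    pairs = []
    for prompt, clarification in requests:
        pairs.append(
            {
                "prompt": prompt,
                "chosen": clarification,          # ask instead of guessing
                "rejected": rng.choice(guesses),  # confident answer with no basis
            }
        )
    return pairs


pairs = build_preference_pairs(VAGUE_REQUESTS, GUESSES)
print(json.dumps(pairs[0], indent=2))
```

A set like this would presumably also need clear, answerable requests where the direct answer is the chosen side; otherwise the model may simply learn to ask for clarification every time.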

