kostaj • today at 2:09 PM

Will add a human-labelled expected response and measure against it in follow-up research. This one only captures the disagreement between the models, not which model is right or wrong.
