Hacker News

charcircuit, today at 4:50 AM

The issue is that you can't do unsupervised learning if you require humans.


Replies

rhdunn, today at 7:08 AM

Having an LLM grade the answers relies on the grader actually knowing the answer rather than hallucinating it. You also run into problems if the model refuses to answer, or if it gets stuck in a loop (e.g. when running locally with a heavily quantized model).
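As a rough illustration of catching the stuck-in-a-loop case before grading, here is a minimal sketch that flags a response when the same run of words repeats too often. The function name `looks_looped` and the thresholds `n` and `max_repeats` are hypothetical choices for this example, not taken from any library, and would need tuning for real outputs.

```python
from collections import Counter


def looks_looped(text: str, n: int = 5, max_repeats: int = 3) -> bool:
    """Return True if any n-word sequence occurs more than max_repeats times.

    Hypothetical heuristic: thresholds are illustrative assumptions.
    """
    words = text.lower().split()
    if len(words) < n:
        return False
    # Count every overlapping window of n consecutive words.
    ngrams = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return max(ngrams.values()) > max_repeats
```

A response flagged this way could be excluded from grading or scored as a failure, depending on what the evaluation is meant to measure.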

I'm experimenting with using traditional NLP (stanza, spaCy, etc.) to grade the responses according to different metrics (is the response in first, second, or third person? Is it written as poetry, prose, or drama? etc.). I'm also thinking about using information extraction and synonym detection to handle data queries and the like.
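To make the first/second/third person metric concrete, here is a minimal sketch that uses only the standard library and counts pronouns. It is a stand-in for a real NLP pipeline rather than the approach described above: the `PRONOUNS` table and the `dominant_person` function are assumptions made for this example, and a tagger from stanza or spaCy would handle contractions, quoted dialogue, and ambiguous words far better.

```python
import re
from collections import Counter

# Hypothetical lookup table: English personal pronouns grouped by person.
PRONOUNS = {
    "first": {"i", "me", "my", "mine", "myself",
              "we", "us", "our", "ours", "ourselves"},
    "second": {"you", "your", "yours", "yourself", "yourselves"},
    "third": {"he", "him", "his", "himself", "she", "her", "hers", "herself",
              "it", "its", "itself", "they", "them", "their", "theirs",
              "themselves"},
}


def dominant_person(text):
    """Return 'first', 'second', or 'third' by pronoun count, or None.

    Contractions such as "I'm" are not split, so they are not counted.
    """
    counts = Counter()
    for token in re.findall(r"[a-z']+", text.lower()):
        for person, words in PRONOUNS.items():
            if token in words:
                counts[person] += 1
    if not counts:
        return None  # no personal pronouns found
    return counts.most_common(1)[0][0]
```

For example, `dominant_person("She gave him their keys.")` returns `"third"`, while a sentence with no pronouns returns `None`, which a grader could treat as "undetermined" rather than guessing.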
