Hacker News

Show HN: Steerling-8B, a language model that can explain any token it generates

54 points | by adebayoj | today at 12:38 AM | 7 comments

Comments

brendanashworth | today at 3:25 AM

Is there a reason people don't use SHAP [1] to interpret language models more often? The in-context attribution of outputs seems very similar to what this model does.

[1] https://shap.readthedocs.io/en/latest/
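
For anyone who hasn't tried it, here's a minimal sketch of SHAP's text explainer along the lines of the examples in its docs; the sentiment pipeline and model are just illustrative, not anything Steerling-specific:

    import shap
    import transformers

    # Any Hugging Face text-classification pipeline works; return_all_scores
    # gives SHAP a score per output class to attribute back to input tokens.
    classifier = transformers.pipeline(
        "sentiment-analysis",
        model="distilbert-base-uncased-finetuned-sst-2-english",
        return_all_scores=True,
    )

    # SHAP auto-selects a text masker/explainer for transformers pipelines.
    explainer = shap.Explainer(classifier)
    shap_values = explainer(["Token-level attribution is surprisingly readable."])

    # Render per-token contributions to each output class.
    shap.plots.text(shap_values)

The caveat is that this explains a classifier's score over the input, one forward pass at a time; attributing every generated token of an 8B autoregressive model this way gets expensive fast.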

great_psy | today at 3:49 AM

Maybe I'm not creative enough to see the potential, but what value does this bring?

Given the example I saw about CRISPR, what does this model give me over a different, non-explaining model? Does it really make me more confident in the output if I know the data came from arXiv or Wikipedia?

I find that LLM outputs are subtly wrong, not obviously wrong.

pbmango | today at 3:24 AM

This is very interesting. I don't see much discussion of interpretability in the day-to-day discourse of AI builders. I wonder if everyone assumes it is either already solved, or too far out of reach to bother stopping and thinking about.

rvz | today at 3:29 AM

Now this is something very interesting to see, and it might be the answer to the explainability problem with LLMs, which could unlock many use cases that are currently off limits.

We'll see.