Does the training process ensure that all the intermediate steps remain interpretable, even on larger models? That is, that we don't end up with alien gibberish in all but the final step.
Training doesn't encourage the intermediate steps to be interpretable. They still live in the same token vocabulary space, so you could decode them, but the decoded steps will probably be wrong or nonsensical.
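If you wanted to peek at them anyway, here is a minimal sketch of that kind of decode, assuming a Hugging Face-style causal LM and a hypothetical `intermediate_states` tensor of step vectors that live in the model's embedding space (a logit-lens-style readout, not any specific training setup's API):

```python
# Hypothetical sketch: project intermediate-step vectors onto the vocabulary
# and read off the nearest token. Names like `intermediate_states` are
# placeholders, not part of any particular library.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def decode_steps(intermediate_states: torch.Tensor) -> list[str]:
    """Map each (hidden_dim,) step vector to its most likely token string."""
    # (num_steps, hidden_dim) @ (hidden_dim, vocab_size) -> (num_steps, vocab_size)
    logits = intermediate_states @ model.get_output_embeddings().weight.T
    token_ids = logits.argmax(dim=-1)
    return [tokenizer.decode([int(t)]) for t in token_ids]

# Random vectors stand in for real intermediate steps here; on a trained
# model the decoded strings are often not coherent text.
fake_steps = torch.randn(4, model.config.hidden_size)
print(decode_steps(fake_steps))
```

The point of the sketch is just that decoding is mechanically possible because the steps share the token embedding space; nothing in training pushes those decodes to read as sensible reasoning.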