Hacker News

kevingadd · yesterday at 4:16 PM · 1 reply

It seems like you're assuming that models are trying to predict the next token. Is that really how they work? I would have assumed that tokenization is an input-only mechanism, so you have perhaps up to 50k unique input tokens available, but the output is raw text, synthesized speech, or an image. If the output isn't tokens, there are no such limitations on it.


Replies

anonymoushn · yesterday at 4:32 PM

Yes, in typical architectures for models dealing with text, the output is a token from the same vocabulary as the input.
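
To make that concrete, here is a minimal sketch of a decoder-style language model head (illustrative sizes and token IDs, not any particular model's code): the final layer produces logits over the same vocabulary that the input IDs come from, and the predicted "output" is itself a token ID from that vocabulary.

```python
import torch
import torch.nn as nn

VOCAB_SIZE = 50_000   # hypothetical vocabulary shared by input and output
D_MODEL = 512         # hypothetical hidden size

embed = nn.Embedding(VOCAB_SIZE, D_MODEL)      # token IDs -> vectors (input side)
lm_head = nn.Linear(D_MODEL, VOCAB_SIZE)       # vectors -> logits over the SAME vocab

input_ids = torch.tensor([[101, 2023, 2003]])  # arbitrary example token IDs
hidden = embed(input_ids)                      # stand-in for the transformer stack
logits = lm_head(hidden[:, -1, :])             # one score per token in the vocabulary
next_token_id = torch.argmax(logits, dim=-1)   # the model's output is a token ID

print(next_token_id.shape)  # torch.Size([1]): one predicted token ID per sequence
```

Text is produced by detokenizing these predicted IDs back into strings, so the ~50k-token vocabulary constrains generation just as much as it constrains the input.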