logoalt Hacker News

hippo22today at 3:06 PM0 repliesview on HN

Couldn’t this be solved by replacing the tokenized input with a model that outputs the tokenization and then training the entire thing as one larger model? The goal would be to make tokenization a function of the model input.