brookst • today at 1:06 PM

But within the surprising words, the adjacent tokens are common. I can see an argument for having fewer transcription errors on badger-yellow-alternate than 0B9A26F3C74D.

Your test with small models makes tons of sense. Would be interesting to graph the two approaches against model size and recency.
