But shouldn't you have picked words that also have single token representations for the word with a dash in front? Or are there less than 4096 such words? That would get your token count for the 10 word variant (the most honest benchmark) from 17 tokens to 10