That's like saying that if an LLM can function when trained on 10B words, it can also work when trained on 10k words.