I was doing some experiments with removing the top 100 to 1,000 most common English words from my prompts. My hypothesis was that common words are effectively noise to agents. In the first few trials I ran, there was no discernible difference in output. Would love to compare results with caveman.
Caveat: I didn’t do enough testing to find the edge cases (e.g., negation).
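For anyone who wants to try the same thing, here is a minimal sketch of that kind of filter. It is not the exact script from those trials: the word list is a tiny hypothetical stand-in for a real frequency-ranked list, and `strip_common` and `PROTECTED` are names made up for illustration. The protected set is one guess at how the negation edge case could be handled.

```python
import re

# Hypothetical stand-in for a "top N most common English words" list;
# a real experiment would load a frequency-ranked list instead.
COMMON_WORDS = {"the", "a", "an", "of", "to", "is", "in", "and", "that", "it", "not", "no"}

# Assumption: words that flip meaning are kept even when they are common.
PROTECTED = {"not", "no", "never", "without"}


def strip_common(prompt, common=COMMON_WORDS, protected=PROTECTED):
    """Drop common words from a prompt, keeping protected ones."""
    kept = []
    for token in prompt.split():
        # Compare on a lowercased form with surrounding punctuation removed.
        core = re.sub(r"^\W+|\W+$", "", token).lower()
        if core in common and core not in protected:
            continue
        kept.append(token)
    return " ".join(kept)


if __name__ == "__main__":
    prompt = "Do not delete the files in the temp directory."
    print(strip_common(prompt))
    # With no protected words, the negation is silently lost.
    print(strip_common(prompt, protected=set()))
```

Running the second call with an empty protected set turns "Do not delete" into "Do delete", which is the kind of failure the caveat is about.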
I literally just posted a blog post on this. Some seemingly insignificant words are actually highly structural to the model. https://www.ruairidh.dev/blog/compressing-prompts-with-an-au...
Doesn't it just use more tokens in reasoning?
Yeah, when I'm writing code I try to avoid zeros and ones, since those are the most common bits, making them essentially noise