Just curious - if we remove stop words from prompts before sending them to an LLM, wouldn't that reduce the token count? And would the LLM's response stay the same (original vs. without stop words)?
Don't know, but GPT-5 Thinking strips out a lot of words in its reasoning trace to save tokens. Someone on Twitter jailbroke it to get the original CoT traces.
Search engines can afford to throw out stopwords because they're often keyword-based. But (frontier) LLMs need the nuance and semantics those words signal -- they don't automatically strip them. There are probably special-purpose models that do this, or certain stages of a RAG pipeline where it makes sense, but that's the exception.
Yeah, it'll be fewer input tokens if you omit them yourself. It's not guaranteed to keep the response the same, though: you're asking the model to work with less context and more ambiguity at that point. So stripping your prompt of stopwords saves you negligible $ and potentially costs a lot in model performance.
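If you want to eyeball the tradeoff yourself, here's a rough sketch (assumes tiktoken and nltk are installed; the naive split-on-whitespace stripping and the example prompt are just for illustration):

```python
import tiktoken
import nltk
from nltk.corpus import stopwords

nltk.download("stopwords", quiet=True)

STOP = set(stopwords.words("english"))
enc = tiktoken.get_encoding("cl100k_base")  # encoding used by many OpenAI models

prompt = ("What are the main differences between the transformer architecture "
          "and recurrent neural networks, and why did one replace the other?")

# Naive stopword stripping: drop any whitespace-delimited word in the list.
stripped = " ".join(w for w in prompt.split() if w.lower() not in STOP)

print(len(enc.encode(prompt)), "tokens before")
print(len(enc.encode(stripped)), "tokens after")
print(stripped)  # note how much meaning ("why", "between", "and") is gone
```

The savings are real but small (stopwords are mostly single, cheap tokens), and the stripped prompt loses exactly the connective words that tell the model what relationship you're asking about.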