I think the trick is called "query expansion". You use an LLM to rewrite the query into a more verbose form, which can also pull in text from the chat context, and then use that rewritten query as the basis for the RAG lookup. Basically, the LLM gives the retrieval step a better chance of the query landing close to the relevant resources. Roughly something like this:
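(A minimal sketch; `call_llm` and `vector_store` are placeholders for whatever LLM client and vector DB you actually use.)

```python
# Query expansion sketch: rewrite a terse, context-dependent question into a
# standalone search query before embedding it for retrieval.
# `call_llm` and `vector_store` are stand-ins, not any specific library's API.

EXPANSION_PROMPT = """Rewrite the user's question as a standalone, detailed \
search query. Fold in any relevant details from the conversation so far.

Conversation so far:
{history}

Question: {question}

Expanded query:"""


def expand_query(question: str, history: list[str], call_llm) -> str:
    prompt = EXPANSION_PROMPT.format(history="\n".join(history), question=question)
    return call_llm(prompt)


def retrieve(question: str, history: list[str], call_llm, vector_store, k: int = 5):
    # Embed the expanded query instead of the raw one, so short follow-ups
    # like "what about the second option?" still retrieve the right documents.
    expanded = expand_query(question, history, call_llm)
    return vector_store.search(expanded, top_k=k)
```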
Thanks for the answer! I think you're right. I've also heard of HyDE (Hypothetical Document Embeddings), where an LLM generates a hypothetical answer to the question and the embedding of that guess is used for the lookup instead of the question itself, which may also improve the results.
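For comparison, here's my rough understanding of HyDE as a sketch (again with `call_llm` and `embed` as placeholders, and cosine similarity against precomputed document vectors standing in for a real vector DB):

```python
# HyDE sketch: embed a hallucinated answer rather than the question. The fake
# answer tends to be phrased like the documents you want, so it embeds closer
# to them than the question does.

import numpy as np

HYDE_PROMPT = "Write a short passage that answers this question:\n\n{question}\n\nPassage:"


def hyde_retrieve(question, docs, doc_vectors, call_llm, embed, k=5):
    # Ask the LLM for a plausible (possibly wrong) answer passage.
    hypothetical = call_llm(HYDE_PROMPT.format(question=question))
    q_vec = embed(hypothetical)  # embed the guess, not the question

    # Cosine similarity between the guess and each document vector.
    sims = doc_vectors @ q_vec / (
        np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q_vec)
    )
    top = np.argsort(sims)[::-1][:k]
    return [docs[i] for i in top]
```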