Why are we drawing a difference between "prompt" and "context" exactly? The linked article is a bit of puffery that redefines a commonly-used term - "context" - to mean something different than what it's meant so far when we discuss "context windows." It seems to just be some puffery to generate new hype.
When you play with the APIs the prompt/context all blurs together into just stuff that goes into the text fed to the model to produce text. Like when you build your own basic chatbot UI and realize you're sending the whole transcript along with every step. Using the terms from the article, that's "State/History." Then "RAG" and "Long term memory" are ways of working around the limits of context window size and the tendency of models to lose the plot after a huge number of tokens, to help make more effective prompts. "Available tools" info also falls squarely in the "prompt engineering" category.
The reason prompt engineering is going the way of the dodo is because tools are doing more of the drudgery to make a good prompt themselves. E.g., finding relevant parts of a codebase. They do this with a combination of chaining multiple calls to a model together to progressively build up a "final" prompt plus various other less-LLM-native approaches (like plain old "find").
So yeah, if you want to build a useful LLM-based tool for users you have to write software to generate good prompts. But... it ain't really different than prompt engineering other than reducing the end user's need to do it manually.
It's less that we've made the AI better and more that we've made better user interfaces than just-plain-chat. A chat interface on a tool that can read your code can do more, more quickly, than one that relies on you selecting all the relevant snippets. A visual diff inside of a code editor is easier to read than a markdown-based rendering of the same in a chat transcript. Etc.
One crucial difference between prompt and the context: the prompt is just content that is provided by a user. The context also includes text that was output by the bot - in conversational interfaces the context incorporates the system prompt, then the user's first prompt, the LLMs reply, the user's next prompt and so-on.
> Why are we drawing a difference between "prompt" and "context" exactly?
Because they’re different things? The prompt doesn’t dynamically change. The context changes all the time.
I’ll admit that you can just call it all ‘context’ or ‘prompt’ if you want, because it’s essentially a large chunk of text. But it’s convenient to be able to distinguish between the two so you can talk about the same thing.
Because the author is artifically shrinking the scope of one thing (prompt engineering) to make its replacement look better (context engineering).
Never mind that prompt engineering goes back to pure LLMs before ChatGPT was released (i.e. before the conversation paradigm was even the dominant one for LLMs), and includes anything from few-shot prompting (including question-answer pairs), providing tool definitions and examples, retrieval augmented generation, and conversation history manipulation. In academic writing, LLMs are often defined as a distribution P(y|x) where X is not infrequently referred to as the prompt. In other words, anything that comes before the output is considered the prompt.
But if you narrow the definition of "prompt" down to "user instruction", then you get to ignore all the work that's come before and talk up the new thing.