> A system prompt is just a prompt that runs every time. Tool calls are just patterns that are fine-tuned into place so that we can parse specific types of LLM output with traditional software. Agents are just an LLM REPL with a context-specific system prompt and a limited ability to execute commands.
Pulling the covers back this hard and this fast is going to be shocking for some.
To make it more concrete, you can try to build something yourself. Grab a small model off of Hugging Face that you can run locally. Then put a REST API in front of it so you can make a request with curl, send in some text, and get back in the response whatever the LLM returned. Now, in the API, prepend some text to what came in on the request (this is your system prompt), like "you are an expert programmer, be brief and concise when answering the following". Next, add a session to your API and include the past 5 requests/responses from the same user along with the new one when passing to the LLM, and update your prepended text (the system prompt) with "consider the previous 5 requests/responses when formulating your response to the question". You can see where this is going: all of the tools and agents are some combination of the above, and/or even adding more than one model.
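If you want a feel for how little code that takes, here is a minimal sketch, assuming Flask and Hugging Face transformers; the model name, route, and JSON fields are arbitrary choices for illustration, not anything standard:

```python
# Minimal sketch: a local model behind a REST API, with a prepended
# "system prompt" and a 5-exchange session kept in memory per user.
from collections import defaultdict, deque

from flask import Flask, jsonify, request
from transformers import pipeline

app = Flask(__name__)

# Any small model you can run locally will do; this one is just an example.
generator = pipeline("text-generation", model="distilgpt2")

SYSTEM_PROMPT = (
    "You are an expert programmer, be brief and concise when answering the "
    "following. Consider the previous requests/responses when formulating "
    "your response to the question.\n\n"
)

# The "session": the last 5 request/response pairs for each user.
sessions = defaultdict(lambda: deque(maxlen=5))

@app.route("/chat", methods=["POST"])
def chat():
    body = request.get_json()
    user_id = body.get("user", "anonymous")
    question = body["text"]

    # Prepend the system prompt, then the recent history, then the new question.
    history = "".join(f"Q: {q}\nA: {a}\n" for q, a in sessions[user_id])
    prompt = SYSTEM_PROMPT + history + f"Q: {question}\nA:"

    output = generator(prompt, max_new_tokens=128, return_full_text=False)
    answer = output[0]["generated_text"].strip()

    sessions[user_id].append((question, answer))
    return jsonify({"response": answer})

if __name__ == "__main__":
    app.run(port=8000)
```

Hit it with something like `curl -X POST localhost:8000/chat -H 'Content-Type: application/json' -d '{"user": "alice", "text": "How do I reverse a list in Python?"}'` and you have the skeleton the description above is walking through.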
At the end of the day, every one of these tools has an LLM at its core, predicting and outputting the next most likely string of characters that would follow from an input string of characters.
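Stripped of sampling tricks, that core is roughly this loop (greedy decoding with an arbitrarily chosen small model, purely for illustration):

```python
# Rough sketch of the core: repeatedly pick the most likely next token,
# append it to the input, then decode the result back to text.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2")

input_ids = tokenizer("def fibonacci(n):", return_tensors="pt").input_ids
for _ in range(20):
    logits = model(input_ids).logits   # a score for every token in the vocabulary
    next_id = logits[0, -1].argmax()   # greedily take the most likely next token
    input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(input_ids[0]))
```

Everything else, the system prompts, sessions, tools, and agents, is scaffolding around that loop.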