Great article but I'm confused on one thing.
The article claims steering only works in local models, but GitHub Copilot has a "steer with message" feature where I can course correct mid execution. I use it often.
I think these are different kinds of steering right? Agent steering probably inserts another user message between the harnesses own ping-pong between harness and the LLM.
- https://docs.github.com/en/copilot/how-tos/copilot-cli/use-c...
- https://docs.github.com/en/copilot/how-tos/copilot-sdk/use-c...
Different kind of steering, that's just injecting text into the model's natural language thinking output or something very similar. You can do a middle ground though by using Anthropic's NLA work to look at the natural language rendition of a model's activations at a particular layer, edit the text and convert it back into completely different activations.