They can post-train the model on usage of their specific tool along with the specific prompt they're using.
LLMs generalize, obviously, but I also wouldn't be shocked if that combination performs better than a "normal" implementation.