You don't necessarily, but each token costs money for the AI to spit out, and likely costs even more when that output gets fed back in as input later. Delegating to a library makes sense financially.
With local inference on the pretty decent models we have nowadays (Qwen-3.5 and better), it's not much of a concern anymore.