Besides the cost you get the control, transparency and ability to identify small language models or LoRA you want to serve even more cost effective.