In local setup you still usually want to split machine that runs inference from client that uses it, there are often non trivial resources used like chromium, compilation, databases etc involved that you don’t want to pollute inference machine with.