Hacker News

fooblaster · yesterday at 3:07 PM · 2 replies

What inference runtime are you using? You mentioned MLX, but I didn't think anyone was using that for local LLMs.


Replies

kamranjon · yesterday at 3:48 PM

LM Studio (which prioritizes MLX models if you're on a Mac and they're available). I have it set up as a server on my personal laptop, with Tailscale running, so when I'm working I can connect to it from my work laptop from wherever I might be. It's integrated into the Zed editor through its built-in agent, and it's pretty seamless. Whenever I want to use my personal laptop, I just unload the model and do other things. It's a really nice setup. I'm definitely happy I got the 128 GB MBP: I do a lot of video editing and 3D rendering as a hobby, so it's dual-purpose in that way. I can take advantage of the compute power when I'm not actually on the machine by running it as an LLM server.
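For anyone curious what talking to a setup like this looks like: LM Studio's local server exposes an OpenAI-compatible API (by default on port 1234), so any machine on the same tailnet can hit it by hostname. A minimal sketch, assuming a Tailscale hostname of `personal-mbp` and a placeholder model name (both are illustrative, not from the comment above):

```python
# Sketch of querying an LM Studio server over Tailscale.
# Assumptions: the tailnet hostname "personal-mbp", the model name,
# and port 1234 (LM Studio's default local-server port).
import json
from urllib import request

LMSTUDIO_URL = "http://personal-mbp:1234/v1/chat/completions"


def build_chat_request(prompt: str, model: str = "some-mlx-model") -> dict:
    """Build an OpenAI-compatible chat-completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }


def ask(prompt: str) -> str:
    """POST the prompt to the LM Studio server and return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode()
    req = request.Request(
        LMSTUDIO_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

Editors like Zed that support custom OpenAI-compatible providers can point at the same URL, which is what makes the agent integration work without any extra glue.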

pram · yesterday at 5:20 PM

LM Studio has had an MLX engine and models since 2024.