Was the choice of such a small model driven by a desire for high tok/sec? I ask because an M4 Pro machine with 48GB can run larger models (if model intelligence is what would make it more useful).
Yes, that was my goal. I also noticed a huge performance gain going from Ollama to MLX. Your mileage may vary.