pbronez • today at 11:36 AM • 1 reply

The Rapid MLX team has done some interesting benchmarking that suggests Qwopus 27B is pretty solid. Their tool includes benchmarking features, so you can evaluate your own setup.

They have a metric called Model-Harness Index:

MHI = 0.50 × ToolCalling + 0.30 × HumanEval + 0.20 × MMLU (scale 0-100)
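
For anyone who wants to sanity-check the weighting by hand, here is a minimal sketch of the formula above in Python. The function name `model_harness_index` and the range check are my own, not part of Rapid-MLX, and I'm assuming each component score is already on a 0-100 scale. The example scores are made up for illustration, not measured results.

```python
def model_harness_index(tool_calling: float, human_eval: float, mmlu: float) -> float:
    """Weighted sum from the quoted formula.

    Assumption: each input is a score on a 0-100 scale, so the
    result is also on a 0-100 scale (the weights sum to 1.0).
    """
    scores = {"tool_calling": tool_calling, "human_eval": human_eval, "mmlu": mmlu}
    for name, score in scores.items():
        if not 0.0 <= score <= 100.0:
            raise ValueError(f"{name} must be in [0, 100], got {score}")
    return 0.50 * tool_calling + 0.30 * human_eval + 0.20 * mmlu


if __name__ == "__main__":
    # Hypothetical component scores, chosen only to show the arithmetic:
    # 0.50 * 80 + 0.30 * 60 + 0.20 * 70 = 40 + 18 + 14
    print(f"MHI = {model_harness_index(80, 60, 70):.1f}")
```

Since tool calling carries half the weight, a model that is strong on HumanEval and MMLU but weak at tool calls will still land a mediocre MHI.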

https://github.com/raullenchai/Rapid-MLX

Replies

JumpCrisscross • today at 11:38 AM

Pardon the silly question, but why do I need this tool versus running the model directly (and SSH’ing in when I’m away from home)?
