I'm always curious about which LLMs perform best in specific scenarios, so I built a local desktop app to benchmark and evaluate prompts and LLMs side by side.