tmikaeld • today at 3:45 PM • 1 reply

My biggest issue with Devstral, and even with their biggest model, is that they're dangerous unless closely directed and reviewed, and I mean CLOSELY. Unfortunately, Mistral models will believe and do anything.

See: https://petergpt.github.io/bullshit-benchmark/viewer/index.v...

Look at some of the test results; it's horrifying.

Replies

badsectoracula • today at 5:29 PM

FWIW, personally I prefer this. When I tried Qwen3.6 and asked it a few questions, it did respond, but it was ADAMANT that I should do something else when I really just wanted an answer to the question I asked. It felt like searching for something and finding a Stack Overflow question that matches, only for the most upvoted answer to be about using or doing something else, when you want a specific answer to that specific question.

Meanwhile Devstral Small 2 just answers the damn question.

I don't want to have to convince my computer to do what I want; I want it to do what I ask.
