There's something off with this because Haiku should not be that good.
The hallucination benchmark is hallucinating