Good point. The next version will also include results for a prompt that allows the models to "think out loud" before providing the final verdict.