Try the llama one instead. It seemed better than qwen for some reason.
I tried llama70b too with the same task. The reasoning seemed more coherent, but it still reached badly invalid conclusions from that reasoning, and the output was even further from correct than qwen's.