But that objective measure is exactly what we’re lacking in programming: There is often many ways to skin a cat, but the model only takes one. Without knowing about those it didn’t take, how do you judge the quality of a new model?
I would say following instructions.
If Claude understood what you mean better without you having to over explain it would be an improvement
I would say following instructions.
If Claude understood what you mean better without you having to over explain it would be an improvement