You need a way to test model changes regardless, since models within the same family change too. Is it really a heavier lift to test a different model family than it is to test going from GPT 3.5 to GPT 5, or even to test changes to your own prompts?
no, i don't think it's a heavier lift to test different model families. my point was that swapping models, whether to a different model family or to a new version within the same family, isn't straightforward. i'm reluctant both to upgrade model versions AND to swap model families, and that reluctance is itself a type of stickiness that multiple model providers have.
maybe another way of saying the same thing is that there's still a lot of work to do to make eval tooling a lot better!