Supporting TFA's thesis that it's trained to be good at benchmarks.
Is it bad? It was trained on synthetic data with an emphasis on coding and scientific thinking. Good, in my opinion: that's what it can be used for, not as a universal do-it-all model.