I'm continually surprised that some people get good results out of GPT models. They consistently fall short on my personal benchmarks.
Maybe GPT needs a different approach to prompting? (as compared to e.g. Claude, Gemini, or Kimi)
They are all GPTs, as in generative pre-trained transformers.