The original GPT-3 was trained very differently from modern models like GPT-4. For example, the conversational structure of alternating user and assistant turns is now built into the models through chat-formatted training data and special tokens, whereas earlier versions were plain text completion models.
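To make that concrete, here is a minimal sketch of how a "chat" reaches such a model: the conversation is flattened into one string and handed to next-token completion. The ChatML-style `<|im_start|>`/`<|im_end|>` tokens below are one illustrative convention; the exact special tokens vary by model family.

```python
# Sketch: a chat conversation is just a serialized prompt string.
# ChatML-style tokens shown here are one example format, not universal.

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What was GPT-3 trained on?"},
]

def render_chat(messages: list[dict]) -> str:
    """Serialize role-tagged messages into a single completion prompt."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    # The trailing header cues the model to complete an assistant turn.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

print(render_chat(messages))
```

A chat-tuned model has seen enormous amounts of text in this shape during training, so completing the string naturally produces an assistant turn; the underlying mechanism is still text completion.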
It's surprising that so many people view the current advances in AI and large language models as a significant boost in raw intelligence. Instead, the progress appears to be driven by clever techniques layered on top of a foundation of simple text completion, such as "thinking" (having the model generate intermediate reasoning text before answering) and agents. Notably, the core text completion component itself hasn't seen meaningful gains in efficiency or raw intelligence recently...
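As an illustration of how an agent reduces to repeated completion, here is a hedged sketch. The `FINAL:`/`TOOL:` markers, the scripted `complete` stub, and the `run_tool` helper are all hypothetical stand-ins for this demo, not any real framework's API:

```python
# Sketch: an "agent" as a loop over plain text completion.
# Everything here is illustrative; a real system would call an actual
# completion endpoint and real tools instead of these scripted fakes.

def complete(prompt: str) -> str:
    """Fake completion model, scripted for this demo.

    A real implementation would return the model's continuation of
    `prompt` from a text-completion endpoint.
    """
    if "OBSERVATION:" not in prompt:
        return "TOOL: search('GPT-3 training data')\n"
    return "FINAL: GPT-3 was trained largely on web text.\n"

def run_tool(call: str) -> str:
    """Fake tool dispatcher; a real one would run search, code, etc."""
    return "GPT-3's corpus included Common Crawl, books, and Wikipedia."

def agent(task: str, max_steps: int = 5) -> str:
    # The whole "agent" is a growing prompt that gets re-completed.
    prompt = f"Task: {task}\n"
    for _ in range(max_steps):
        step = complete(prompt)
        prompt += step
        if step.startswith("FINAL:"):
            return step.removeprefix("FINAL:").strip()
        if step.startswith("TOOL:"):
            # Append the tool's output and let the model complete again.
            prompt += f"OBSERVATION: {run_tool(step)}\n"
    return prompt  # step budget exhausted; return the raw transcript

print(agent("What was GPT-3 trained on?"))
```

Swapping the scripted stub for a real completion call is the only change a working agent needs, which is the point: the capability lives in the completions, and the "agent" is string plumbing around them.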