I don't see how "they improved the models" is related to the bitter lesson. You are still injecting human-level expertise (whether it is by prompts or a structured API) to compensate for the model's failures. A "bitter lesson" would be that the model can do better without any injection, but more compute power, than it could with human interference.
> A "bitter lesson" would be that the model can do better without any injection, but more compute power, than it could with human interference.
This is what I expected the post to be about before clicking.