What example do you need? In every single benchmark AI is getting better and better.
Before someone says "but benchmark doesn't reflect real world..." please name what metric you think is meaningful if not benchmark. Token consumption? OpenAI/Anthropic revenue?
What metrics, that aren't controlled by industry, show AI getting better? Generally curious because those "ranking sites" to me seem to be infested with venture capital, so hardly fair or unbiased. The only reports I hear from academia are those being overly negative on AI.
AI is getting better at every benchmark. Please ignore that we're not allowed to see these benchmarks and also ignore that the companies in question are creating the benchmarks that are being exceeded.
OpenAI net profit.
The figures for cost are wildly off to start with.
> please name what metric you think is meaningful
Job satisfaction and human flourishing
By those metrics, AI is getting worse and worse
Whenever I try and use a "state of the art" LLM to generate code it takes longer to get a worse result than if I just wrote the code myself from the start. That's the experience of every good dev I know. So that's my benchmark. AI benchmarks are BS marketing gimmicks designed to give the appearance of progress - there are tremendous perverse financial incentives.
This will never change because you can only use an LLM to generate code (or any other type of output) you already know how to produce and are expert at - because you can never trust the output.