Hacker News

raincole | last Tuesday at 1:41 AM | 5 replies

What example do you need? In every single benchmark AI is getting better and better.

Before someone says "but benchmarks don't reflect the real world...", please name the metric you think is meaningful, if not benchmarks. Token consumption? OpenAI/Anthropic revenue?


Replies

jacobsenscott | last Tuesday at 1:51 AM

Whenever I try and use a "state of the art" LLM to generate code it takes longer to get a worse result than if I just wrote the code myself from the start. That's the experience of every good dev I know. So that's my benchmark. AI benchmarks are BS marketing gimmicks designed to give the appearance of progress - there are tremendous perverse financial incentives.

This will never change because you can only use an LLM to generate code (or any other type of output) you already know how to produce and are expert at - because you can never trust the output.

azemetre | last Tuesday at 10:56 PM

What metrics that aren't controlled by industry show AI getting better? Genuinely curious, because those "ranking sites" seem to me to be infested with venture capital, so hardly fair or unbiased. The only reports I hear from academia are overly negative on AI.

fzeroracer | last Tuesday at 7:09 AM

AI is getting better at every benchmark. Please ignore that we're not allowed to see these benchmarks and also ignore that the companies in question are creating the benchmarks that are being exceeded.

philipwhiuk | last Tuesday at 2:52 AM

OpenAI net profit.

The figures for cost are wildly off to start with.

bluefirebrand | last Tuesday at 1:52 AM

> please name what metric you think is meaningful

Job satisfaction and human flourishing

By those metrics, AI is getting worse and worse
