There's also the efficiency argument from new capability: even a tiny bit better weather forecast is highly economically valuable (and saves a lot of wasted energy) if it means that 1 city doesn't have to evacuate because of an erroneous hurricane forecast, say. But how much would it cost to do that with the rivals? I don't know but I would guess quite a lot.
And one of the biggest ironies of AI scaling is that where scaling succeeds the most in improving efficiency, we realize it the least, because we don't even think of it as an option. An example: a Transformer (or RNN) is not the only way to predict text. We have scaling laws for n-grams and text perplexity (most famously, from Jeff Dean et al at Google back in the 2000s), so you can actually ask the question, 'how much would I have to scale up n-grams to achieve the necessary perplexity for a useful code writer competitive with Claude Code, say?' This is a perfectly reasonable, well-defined question, as high-order n-grams could in theory write code with enough data and big enough lookup tables, and so it can be answered. The answer will look something like 'if we turned the whole earth into computronium, it still wouldn't be remotely enough'. The efficiency ratio is not 10:1 or 100:1 but closer to ∞:1. The efficiency gain is so big no one even thinks of it as an efficiency gain, because you just couldn't do it before using AI! You would have humans do it, or not do it at all.
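A rough back-of-envelope sketch of why the answer looks like 'computronium still isn't enough' (the vocabulary size and context length below are illustrative assumptions, not figures from any scaling-law paper):

```python
# Back-of-envelope: how many distinct contexts would a full order-n
# n-gram table have to index to write coherent code? All constants here
# are assumed for illustration.
import math

VOCAB = 50_000           # assumed token vocabulary, BPE-scale
CONTEXT = 100            # assumed n-gram order needed for coherent code
ATOMS_ON_EARTH = 1.3e50  # commonly cited rough estimate

# Upper bound on distinct contexts: VOCAB ** CONTEXT
log10_contexts = CONTEXT * math.log10(VOCAB)
print(f"possible {CONTEXT}-gram contexts ~ 10^{log10_contexts:.0f}")   # ~10^470
print(f"atoms on Earth               ~ 10^{math.log10(ATOMS_ON_EARTH):.0f}")  # ~10^50
```

Even at one bit of storage per context, the table is hundreds of orders of magnitude beyond any physical substrate, which is why the ratio reads as ∞:1 rather than 100:1.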
To have a competitive code writer with n-grams you need more than to "scale up the n-grams": you need a corpus that includes all possible code that anyone would want to write. And at that point you'd be better off with a lossless full-text index like an r-index. But the lack of any generalizability in this approach, coupled with its Markovian nature, makes this kind of model extremely brittle. It would be efficient, though; you just need to somehow compute all possible language beforehand. tl;dr: language models really are reasoning and generalizing over the domain they're trained on.
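To make the brittleness concrete, here is a minimal word-level n-gram (Markov) generator; the toy corpus and prompts are made up. It can only emit continuations that appeared verbatim in training, and halts the moment it sees an unseen context:

```python
# Minimal order-n Markov text model over a toy "code" corpus.
from collections import defaultdict
import random

def train(tokens, n=3):
    table = defaultdict(list)
    for i in range(len(tokens) - n):
        table[tuple(tokens[i:i + n])].append(tokens[i + n])
    return table

def generate(table, prompt, n=3, steps=10):
    out = list(prompt)
    for _ in range(steps):
        key = tuple(out[-n:])
        if key not in table:   # unseen context: no generalization, just stop
            break
        out.append(random.choice(table[key]))
    return out

corpus = "def add ( a , b ) : return a + b".split()
model = train(corpus, n=3)

print(generate(model, ["def", "add", "("]))       # replays the training code
print(generate(model, ["def", "subtract", "("]))  # never seen: halts at the prompt
```

Anything not literally present in the lookup table is out of reach, which is the sense in which "scaling up the n-grams" really means precomputing all the language you'll ever want.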
Now that we’ve saved infinite energy, all carbon tax credit markets are unnecessary! Big win for the climate!
> even a tiny bit better weather forecast is highly economically valuable (and saves a lot of wasted energy) if it means that 1 city doesn't have to evacuate because of an erroneous hurricane forecast
Here is the NOAA on the improvements:
> 8% better predictions for track, and 10% better predictions for intensity, especially at longer forecast lead times — with overall improvements of four to five days.(1)
I’d love someone to explain what these measurements mean though. Does better track mean 8% narrower angle? Something else? Compared to what baseline?
And am I reading this right that that improvement is measured at the point 4-5 days out from landfall? What’s the typical lead time for calling an evacuation, more or less than four days?
(1) https://www.noaa.gov/news/new-noaa-system-ushers-in-next-gen...