Hacker News

EternalFury · yesterday at 6:22 PM

What John Carmack is exploring is pretty revealing. Train models to play 2D video games to a superhuman level, then ask them to play a level they have not seen before, or another 2D video game entirely. The transfer function is negative. So, by my definition, no intelligence has been developed, only expertise in a narrow set of tasks.

It’s apparently much easier to scare the masses with visions of ASI than to build a general intelligence that can pick up a new 2D video game faster than a human being.


Replies

ozgrakkurt · today at 1:53 AM

Seeing comments here saying “this problem is already solved”, “he is just bad at this”, etc. feels bad. He has devoted a long time to this problem by now, and he is trying to solve it to advance the field. And needless to say, he is a legend in computer engineering, or whatever you call it.

Anyone saying “he just sucks” or “this was solved before” should be required to point to the “solution”, and maybe explain how it works.

IMO the problem with current models is that they don’t learn categorically, like: lions are animals, and animals are alive; goats are animals, and goats are alive too. So if lions have some property like breathing, and goats also have it, it is likely that other similar animals have the same property.
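A toy sketch of that kind of category-level inference (the categories, entities, and function names here are my own illustration, not anything an actual model does):

```python
# Toy category-level inference: properties observed on several members of a
# category are tentatively projected onto other members of that category.
categories = {"lion": "animal", "goat": "animal", "zebra": "animal",
              "robot": "machine"}
observed = {"lion": {"breathes", "alive"}, "goat": {"breathes", "alive"}}

def likely_properties(entity):
    """Pool observations from category siblings; keep what they all share."""
    cat = categories.get(entity)
    pooled = [props for e, props in observed.items()
              if e != entity and categories.get(e) == cat]
    return set.intersection(*pooled) if pooled else set()
```

Here `likely_properties("zebra")` yields the properties shared by the observed animals, while `likely_properties("robot")` yields nothing, because no machine has been observed. A gradient-trained policy carries no explicit structure like this.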

Or when playing a game, a human can come up with a strategy like: I’ll level this ability and lean on it at the start; then I’ll level this other ability that takes more time to ramp up while still using the first one; then I’ll switch to this other play style once the new ability is ready. This might be formulated entirely from theoretical ideas about the game, and modified as the player gains experience.

With current AI models, as far as I can tell, the model sees the whole game as one big optimization problem and tries things more or less at random until something makes it win more. This is not as scalable as combining theory and experience the way humans do. For example, a human innately understands that there is a concept of an early game, and that gains made in the early game compound into a large lead. This is pattern matching too, but on a higher level.

Theory makes learning more scalable compared to just trying everything and seeing what works.
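For contrast, a minimal sketch of the pure trial-and-error loop described above: an epsilon-greedy bandit on a made-up two-action payoff (the "game" and its rewards are invented for illustration, not any real environment):

```python
import random

random.seed(0)

# Hypothetical two-action "game": action 1 pays more on average, but the
# learner has no theory about why -- it only tracks average reward.
def play(action):
    return random.gauss(1.0 if action == 1 else 0.2, 0.5)

values = [0.0, 0.0]   # running reward estimate per action
counts = [0, 0]
for _ in range(2000):
    if random.random() < 0.1:                       # explore at random
        a = random.randrange(2)
    else:                                           # exploit current estimate
        a = max(range(2), key=values.__getitem__)
    r = play(a)
    counts[a] += 1
    values[a] += (r - values[a]) / counts[a]        # incremental mean
```

Everything the learner "knows" ends up in two running averages; there is no notion of an early game, compounding, or strategy, which is the scalability gap the comment points at.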

vladimirralev · yesterday at 6:56 PM

He is not using appropriate models for this conclusion, nor is he using state-of-the-art models in this research; moreover, he doesn't have an expensive foundation model to build on for 2D games. It's just a fun project.

A serious attempt at video/vision would involve some probabilistic latent space that can be noised in ways that make sense for games in general. I think Veo 3 proves that AI can generalize 2D and even 3D games; generating a video under prompt constraints is basically playing a game. I think you could prompt Veo 3 to play any game for a few seconds and it would generally make sense, even though it is not fine-tuned for that.

YokoZar · yesterday at 6:44 PM

I wonder if this is a case of overfitting from allowing the model to grow too large, and whether you might cajole it into learning more generic heuristics by putting some constraints on it.

It sounds like the "best" unconstrained AI would just be something like a replay of a record speedrun, rather than a smaller set of heuristics for getting through a game, though the latter is clearly much more important on unseen content.
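The overfitting intuition is easy to demonstrate outside of RL. A tiny curve-fitting sketch with synthetic data of my own choosing: a polynomial with too many degrees of freedom drives training error to near zero by memorizing the noise, while a constrained one keeps a generic shape.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.uniform(-1, 1, 12)                    # 12 noisy training samples
y = np.sin(3 * x) + rng.normal(0, 0.1, 12)
x_test = np.linspace(-0.9, 0.9, 100)          # unseen points
y_test = np.sin(3 * x_test)

errors = {}
for degree in (3, 11):                        # constrained vs over-large
    coeffs = np.polyfit(x, y, degree)
    train_err = np.mean((np.polyval(coeffs, x) - y) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    errors[degree] = (train_err, test_err)
```

The degree-11 fit interpolates all 12 points, so its training error is far below the degree-3 fit's; in typical runs its error on the unseen points is worse, which is the speedrun-replay failure mode in miniature.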

smokel · yesterday at 6:59 PM

The subject you are referring to is most likely Meta-Reinforcement Learning [1]. It is great that John Carmack is looking into this, but it is not a new field of research.

[1] https://instadeep.com/2021/10/a-simple-introduction-to-meta-...

justanotherjoe · yesterday at 7:38 PM

I don't get why people are so invested in framing it this way. I'm sure there are ways to achieve the stated objective. John Carmack isn't even an AI guy; why is he suddenly the standard?

Uehreka · yesterday at 9:26 PM

These questions of whether the model is “really intelligent” or whatever might be of interest to academics theorizing about AGI, but to the vast swaths of people getting useful stuff out of LLMs, it doesn’t really matter. We don’t care if the current path leads to AGI. If the line stopped at Claude 4 I’d still keep using it.

And like I get it, it’s fun to complain about the obnoxious and irrational AGI people. But the discussion about how people are using these things in their everyday lives is way more interesting.

bthornbury · yesterday at 11:26 PM

This generalization issue in RL specifically was detailed by OpenAI in 2018:

https://arxiv.org/pdf/1804.03720

ferguess_k · yesterday at 6:41 PM

Can you please explain "the transfer function is negative"?

I'm wondering whether one has tested with the same model but on two situations:

1) Bring it to superhuman level in game A and then present game B, which is similar to A, to it.

2) Present B to it without presenting A.

If 1) is not significantly better than 2), then maybe it is not carrying much "knowledge", or maybe we simply did not program it correctly.
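That comparison can be written down as a tiny evaluation harness. `train` and `evaluate` here are hypothetical stand-ins for a real RL training setup, not any existing API:

```python
def transfer_gap(train, evaluate, game_a, game_b, budget):
    """Score on game B with pretraining on A, minus score from scratch.

    Condition 1: train on game A, then continue training on game B.
    Condition 2: train on game B alone with the same budget.
    A negative return value is negative transfer.
    """
    pretrained = train(init=None, game=game_a, steps=budget)   # condition 1
    tuned = train(init=pretrained, game=game_b, steps=budget)
    scratch = train(init=None, game=game_b, steps=budget)      # condition 2
    return evaluate(tuned, game_b) - evaluate(scratch, game_b)
```

The "transfer function is negative" claim upthread corresponds to this returning a value below zero: the pretrained model ends up worse on game B than one that never saw game A.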

fullshark · yesterday at 8:49 PM

Just sounds like an example of overfitting. This is all machine learning at its root.

SquibblesRedux · today at 3:38 AM

Indeed, it's nothing but function fitting.

goatlover · yesterday at 8:28 PM

I've wondered about the claim that the models played those Atari/2D video games at superhuman levels, because I clearly recall some humans achieving those levels before the models could. It must have been superhuman compared to the average player, not compared to someone who spent an inordinate amount of time mastering the game.

hluska · yesterday at 9:10 PM

When I finished my degree, the idea that a software system could develop that level of expertise was relegated to science fiction. It is an unbelievable human accomplishment to get to that point and honestly, a bit of awe makes life more pleasant.

Quality of life aside, I don't believe the models he uses for this research are capable of more. Is it really that revealing?

moralestapia · yesterday at 6:51 PM

I wonder how much performance decreases if they just use slightly modified versions of the same game, like a different color scheme or a couple of different sprites.
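One cheap way to build such perturbed variants is to permute the palette of the observations: the game's structure is untouched even though every raw pixel value the network was trained on changes. A sketch with a made-up 8x8 "screen" (the frame and palette here are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

frame = rng.integers(0, 4, size=(8, 8))   # toy screen with 4 "colors"
palette_swap = np.array([2, 3, 0, 1])     # relabel every color
perturbed = palette_swap[frame]           # same game, different pixels

# Which pixels share a color is unchanged, so the game plays identically;
# only the raw input statistics the network saw in training differ.
def equality_structure(f):
    flat = f.reshape(-1)
    return flat[:, None] == flat[None, :]
```

Since the permutation has no fixed points, every pixel value changes, yet the equality structure of the two frames is identical; a policy that has learned the game rather than the pixels should be indifferent to the swap.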

t55 · yesterday at 7:23 PM

This is what DeepMind did 10 years ago lol
