> In the game example having the weights optimized for game A doesn't help with game B. It would be interesting to see if training for both game A and B help it understand concepts in both.
Supposedly it does worse at both A and B. That's essentially their problem statement: current SOTA models don't behave like humans would. If you took a human who's really good at A and B, chances are they're gonna pick up C much quicker than a random person off the street who hasn't even seen Atari before. With SOTA models, the random "person" does better at C than the A/B master.