Current models don't yet use RLVR with self-play, though, at least as far as we know. Instead, they use RLVR with large numbers of manually created RL environments.
But they will probably use self-play soon. See https://www.amplifypartners.com/blog-posts/self-play-and-aut...