quite surprised to see SAC, considering the DeepMind ping pong paper resorted to evolutionary strategies, iirc