You only need GPUs if you assume the training is gradient descent. GAs or anything else that can handle nonlinearities would be fine, and possibly fast enough to be interesting.