This seems like the "obvious" solution. Why was it rejected?
EDIT: It appears to be an objection to GPU programming entirely.