> So really, how important is computing the exact gradient using calculus, vs just knowing the general direction to step? Would that be cheaper to calculate than full derivatives?
Yes, absolutely -- a lot of ideas inspired by this have been explored in optimization and in machine learning. The very idea of "stochastic" gradient descent with mini-batches is basically a cheap (hardware-friendly) approximation to the full gradient at each step.
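To make that concrete, here's a minimal numpy sketch (a toy least-squares problem; all names and constants are made up for illustration) contrasting the exact gradient with a mini-batch estimate of it:

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy least-squares problem: find w minimizing mean((X @ w - y)**2).
    n, d = 10_000, 20
    X = rng.normal(size=(n, d))
    w_true = rng.normal(size=d)
    y = X @ w_true + 0.1 * rng.normal(size=n)

    def full_gradient(w):
        # Exact gradient: touches all n rows on every step.
        return 2 * X.T @ (X @ w - y) / n

    def minibatch_gradient(w, batch_size=64):
        # Noisy estimate with the same expected value, at ~n/batch_size the cost.
        idx = rng.integers(0, n, size=batch_size)
        Xb, yb = X[idx], y[idx]
        return 2 * Xb.T @ (Xb @ w - yb) / batch_size

    w = np.zeros(d)
    for step in range(2000):
        w -= 0.01 * minibatch_gradient(w)  # a roughly-right direction is enough

    print("distance to w_true:", np.linalg.norm(w - w_true))

Each mini-batch step is wrong in detail but right on average, which is exactly the "general direction" trade-off you're describing.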
For a relatively extreme example of how we might circumvent the computational effort of backprop, see Direct Feedback Alignment: https://towardsdatascience.com/feedback-alignment-methods-7e...
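The gist, as a rough sketch (a tiny two-layer net; everything below is my own illustrative code, not from that article): instead of backpropagating the output error through the transpose of the forward weights, DFA projects it straight back to each hidden layer through a fixed random matrix.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy regression data and a two-layer net.
    d_in, d_hid, d_out, n = 10, 64, 1, 512
    X = rng.normal(size=(n, d_in))
    y = np.sin(X @ rng.normal(size=(d_in, d_out)))

    W1 = 0.1 * rng.normal(size=(d_in, d_hid))
    W2 = 0.1 * rng.normal(size=(d_hid, d_out))
    B = rng.normal(size=(d_out, d_hid))  # fixed random feedback, never trained

    lr = 0.01
    for step in range(3000):
        A = np.tanh(X @ W1)            # forward pass
        e = A @ W2 - y                 # output error (MSE gradient)
        # Backprop would compute: dA = (e @ W2.T) * (1 - A**2)
        dA = (e @ B) * (1 - A**2)      # DFA: random projection of the error
        W2 -= lr * A.T @ e / n
        W1 -= lr * X.T @ dA / n
        if step % 1000 == 0:
            print(step, float((e**2).mean()))

Surprisingly, the forward weights tend to "align" with the random feedback over training, so the updates end up pointing in a descent-ish direction even though no true gradient is ever computed for the hidden layer.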
Ben Recht has an interesting survey of how various learning algorithms used in reinforcement learning relate to techniques in optimization (and how each one plays with the gradient in a different way): https://people.eecs.berkeley.edu/~brecht/l2c-icml2018/ (there's nothing special about RL here... as far as optimization is concerned, the concepts work the same even when all the data is given up front rather than generated on the fly through interaction with the environment)
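One family that survey contrasts with gradient methods is pure random search / zeroth-order optimization, which only ever evaluates the objective and never differentiates it; a toy sketch (the function and constants are my own, just to show the estimator):

    import numpy as np

    rng = np.random.default_rng(0)

    def loss(w):
        # Black-box objective: we can only query values, not derivatives.
        return float(np.sum((w - 1.0) ** 2))

    def two_point_estimate(w, sigma=0.1):
        # Probe one random direction u; the difference of two evaluations
        # gives a directional derivative, and (slope * u) recovers the
        # gradient in expectation (up to O(sigma**2) smoothing bias).
        u = rng.normal(size=w.shape)
        slope = (loss(w + sigma * u) - loss(w - sigma * u)) / (2 * sigma)
        return slope * u

    w = np.zeros(5)
    for _ in range(5000):
        w -= 0.01 * two_point_estimate(w)

    print(w)  # approaches the minimizer at all-ones

This is the "just know the general direction" idea taken to its limit: each step uses a very noisy one-dimensional probe, traded off against many more (but much cheaper) iterations.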