> The output of the first neuron is fed into the second neuron, whose output is connected to an actuator which applies the specified amount of torque to the handlebars. As inputs to the network, we provide the desired heading θ_d, as well as the current heading θ and the degree to which the bicycle is currently leaning γ, along with their derivatives θ̇ and γ̇.
It's worth thinking about which inputs you provide. If you want a classifier for "inside circle vs. outside circle" and the network has to derive the nonlinearity itself, you end up needing a more complex network.
E.g. on the playground^, see how many neurons you need to learn a circle when the only inputs are x1 and x2.
And yet, if you give the network x1^2 and x2^2 as inputs, it can solve it with minimal additional neurons.
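If you'd rather see the effect outside the playground, here is a rough NumPy sketch. It is my own toy setup, not the playground's code: the data generator, the learning rate, the step count, and the helper names (`make_circle_data`, `train_single_neuron`, `accuracy`) are all assumptions picked for illustration. It trains a single sigmoid neuron twice, once on the raw x1, x2 and once on x1^2, x2^2.

```python
import numpy as np

def make_circle_data(n=400, radius=1.0, seed=0):
    """Sample points in [-2, 2]^2; label 1.0 if inside the circle, else 0.0."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-2.0, 2.0, size=(n, 2))
    y = (X[:, 0] ** 2 + X[:, 1] ** 2 < radius ** 2).astype(float)
    return X, y

def train_single_neuron(features, y, lr=0.5, steps=20000):
    """Fit one sigmoid neuron (logistic regression) with plain gradient descent."""
    w = np.zeros(features.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(features @ w + b)))  # neuron output
        err = p - y                                    # gradient of log loss w.r.t. pre-activation
        w -= lr * (features.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b

def accuracy(features, y, w, b):
    """Fraction of points the neuron puts on the correct side of its boundary."""
    pred_inside = (features @ w + b) > 0
    return float(np.mean(pred_inside == (y > 0.5)))

X, y = make_circle_data()

# Raw inputs: a single neuron can only draw a straight line through the plane.
w_raw, b_raw = train_single_neuron(X, y)
acc_raw = accuracy(X, y, w_raw, b_raw)

# Squared inputs: the circle x1^2 + x2^2 = r^2 is a straight line in this space.
X_sq = X ** 2
w_sq, b_sq = train_single_neuron(X_sq, y)
acc_sq = accuracy(X_sq, y, w_sq, b_sq)

print(f"raw inputs:     {acc_raw:.3f}")
print(f"squared inputs: {acc_sq:.3f}")
```

On the raw inputs the neuron should do little better than always guessing "outside", since no straight line separates a disc from its surroundings. On the squared inputs the same neuron has a linearly separable problem, which is the whole point about choosing inputs well.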
^ https://playground.tensorflow.org/#activation=tanh&batchSize...