>Many neural networks are effectively large decision trees in disguise and those are the ones which have potential with this kind of approach.
I don't see how that's true. A decision tree looks at one parameter at a time and can split into multiple branches (i.e. more than two branches are possible). Single input -> discrete, multi-valued output.
Neural networks do the exact opposite. A neuron takes multiple inputs and computes a weighted sum, which is then fed into an activation function. That activation function produces a scalar value, where low values mean inactive and high values mean active. Multiple inputs -> continuous, only roughly binary output.
Quantization doesn't change anything about this. If you have a 1-bit weight, that weight doesn't perform any splitting; it merely decides whether (or with what sign) a given input enters the weighted sum. The weighted sum would still be performed with 16-bit or 8-bit activations.
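To make the point concrete, here's a minimal sketch (the sizes and the BinaryConnect-style sign quantization are my own illustrative choices): collapsing each weight to one bit changes the weights, not the structure of the computation, which is still a multiply-accumulate over all inputs followed by a scalar activation.

```python
import numpy as np

rng = np.random.default_rng(0)

# 8-bit activations coming from the previous layer (values 0..255).
activations = rng.integers(0, 256, size=16).astype(np.int32)

# Full-precision weights vs. their 1-bit quantization: each weight is
# collapsed to a single sign bit (+1 or -1). The weight no longer
# scales its input; it only decides how that input enters the sum.
weights_fp = rng.normal(size=16)
weights_1bit = np.where(weights_fp >= 0, 1, -1).astype(np.int32)

# The weighted sum is still an ordinary multiply-accumulate over ALL
# inputs at once -- no single-feature splitting anywhere.
pre_activation = int(activations @ weights_1bit)

# A scalar activation squashes the sum into one continuous value.
output = 1.0 / (1.0 + np.exp(-pre_activation / 256.0))
print(pre_activation, round(output, 3))
```

Note there is no branching on individual inputs at any point; the 1-bit weights only flip signs inside the sum.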
I'm honestly tired of these terrible analogies that don't explain anything.
> I'm honestly tired of these terrible analogies that don't explain anything.
Well, step one should be trying to understand something instead of complaining :)
> Single input -> discrete multi valued output.
A single node in a decision tree is single input. The decision tree as a whole is not. Suppose you have a 28x28 image, each 'pixel' being eight bits wide. Your decision tree as a whole can query any of those 28x28x8 input bits.
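A toy hand-built tree makes the distinction clear (the pixel indices and class labels here are invented for illustration): each node inspects exactly one pixel, and can split more than two ways, but the tree as a whole has access to all 784 inputs.

```python
# Toy decision tree over a flattened 28x28 image (indices and labels
# are made up). Each node tests ONE pixel; the tree reaches many.
def classify(img):
    p = img[406]                               # node 1: one pixel, 3-way split
    if p < 64:                                 # dark centre: consult another pixel
        return "A" if img[0] < 128 else "B"    # node 2: a different single pixel
    elif p < 192:
        return "C"
    else:
        return "D" if img[783] < 128 else "E"  # node 3

print(classify([0] * 784))  # all-black image -> "A"
```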
> A neural network neuron takes multiple inputs and calculates a weighted sum, which is then fed into an activation function.
Do not confuse the 'how' with the 'what'.
You can train a neural network that, for example, tells you if the 28x28 image is darker at the top or darker at the bottom or has a dark band in the middle.
Can you think of a way to do this with a decision tree with reasonable accuracy?
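For the darker-at-top vs. darker-at-bottom part of that task, a single neuron suffices, which is what makes it awkward for a per-pixel tree. A sketch (row-major layout assumed; the "dark band in the middle" case would need at least one more neuron):

```python
import numpy as np

# One neuron: weight +1 over the top half of a 28x28 image, -1 over
# the bottom half. Its pre-activation is (top brightness - bottom
# brightness); the sign alone says which half is darker, with no
# per-pixel thresholds anywhere.
w = np.concatenate([np.full(14 * 28, 1.0), np.full(14 * 28, -1.0)])

def darker_half(img):  # img: flat array of 784 brightness values
    s = float(img @ w)
    return "top" if s < 0 else "bottom"

img = np.zeros(784)
img[:392] = 50    # dim top half
img[392:] = 200   # bright bottom half
print(darker_half(img))  # top half is darker -> "top"
```

A decision tree would have to approximate this global sum with a long cascade of single-pixel threshold tests, which is exactly the mismatch being pointed out.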