Ah, yes, of course the read bit should move into the exponent, since it's an input, not an output of the function. But the key point I was making is that there exists a formula. (I don't really care what the formula is.) The part I was not understanding was the complexity of "essentially different".
In this context, two syntactically different TMs are considered "essentially the same" if their reachable states are identical up to relabeling (with the starting label A held fixed) and a global swap of the L/R directions.
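To make that equivalence concrete, here is a minimal sketch in Python. It rests on several assumptions of mine: a 2-symbol machine stored as a dict from `(state, read_bit)` to `(write_bit, move, next_state)`, a halting label `"H"`, and a transition table that is total. It also treats every state as reachable, so it only handles the relabeling and mirroring part, not the reachability question. The names `canonical`, `mirror` and `relabel` are made up for illustration.

```python
from itertools import permutations

HALT = "H"  # hypothetical label for the halting state


def mirror(tm):
    """Return a copy of the machine with L and R swapped in every transition."""
    flip = {"L": "R", "R": "L"}
    return {key: (write, flip[move], nxt) for key, (write, move, nxt) in tm.items()}


def relabel(tm, mapping):
    """Rename states according to mapping (applied to sources and targets)."""
    return {
        (mapping[state], read): (write, move, mapping[nxt])
        for (state, read), (write, move, nxt) in tm.items()
    }


def canonical(tm, start="A"):
    """Smallest sorted transition table over all relabelings that keep the
    start label fixed, combined with the optional global L/R swap.
    Brute force: tries (n-1)! * 2 variants, so only suitable for small n."""
    others = sorted({state for state, _ in tm} - {start})
    best = None
    for variant in (tm, mirror(tm)):
        for perm in permutations(others):
            mapping = {start: start, HALT: HALT, **dict(zip(others, perm))}
            key = tuple(sorted(relabel(variant, mapping).items()))
            if best is None or key < best:
                best = key
    return best
```

Two machines are then "essentially the same" in the above sense (ignoring dead code) exactly when `canonical` returns the same tuple for both.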
The problem is knowing how many states are actually reachable, and how many are dead code. This is undecidable in general (Rice's theorem, or just a reduction from the halting problem: asking whether the halt state is reachable is the halting problem). In this case, it involves deciding all 4-state machines.
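What can be computed cheaply is an over-approximation: which states are reachable in the transition graph, ignoring what is actually on the tape. A state missing from this set is certainly dead code; a state in it may still never be entered in a real run, and that gap is the undecidable part. A rough sketch, assuming a machine stored as a dict from `(state, read_bit)` to `(write_bit, move, next_state)` (my representation, with the hypothetical name `syntactically_reachable`):

```python
from collections import deque


def syntactically_reachable(tm, start="A"):
    """States reachable from start by following transitions in the table,
    without simulating the tape. This is only an upper bound on the states
    a real run from a blank tape can enter. The halting label is included
    if some followed transition points to it."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for read in (0, 1):
            transition = tm.get((state, read))
            if transition is None:  # e.g. the halting state has no entries
                continue
            nxt = transition[2]
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```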