I don't think Minecraft would be considered a cornerstone of optimal programming.
Minecraft is, and always has been, handling vast amounts of data at pretty good performance. It's not an impossibly difficult task, and plenty of other people have made better voxel game engines, but it's something you can't do without paying attention to these things. Every voxel engine with remotely reasonable performance needs to carefully count the bits used per block.
The entire program doesn't need to be a cornerstone of optimal programming for this one example to hold true.
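For anyone wondering what "counting bits per block" looks like in practice, here is a rough sketch of packing 4-bit values two to a byte. The `NibbleArray` class and its `get`/`set` methods are names made up for illustration, and the choice of putting even indices in the low nibble is an assumption. This is not Mojang's code, just the general technique.

```python
class NibbleArray:
    """Stores 4-bit values (0..15), two per byte.

    Hypothetical helper for illustration; not taken from any real engine.
    """

    def __init__(self, count):
        self.count = count
        # Round up so an odd count still gets a byte for its last value.
        self.data = bytearray((count + 1) // 2)

    def get(self, index):
        if not 0 <= index < self.count:
            raise IndexError(index)
        byte = self.data[index >> 1]
        # Assumed layout: even indices in the low nibble, odd in the high nibble.
        return (byte >> 4) & 0x0F if index & 1 else byte & 0x0F

    def set(self, index, value):
        if not 0 <= index < self.count:
            raise IndexError(index)
        if not 0 <= value <= 15:
            raise ValueError("value must fit in 4 bits")
        pos = index >> 1
        if index & 1:
            # Keep the low nibble, overwrite the high nibble.
            self.data[pos] = (self.data[pos] & 0x0F) | (value << 4)
        else:
            # Keep the high nibble, overwrite the low nibble.
            self.data[pos] = (self.data[pos] & 0xF0) | value


# A 16x16x16 section holds 4096 blocks; at 4 bits each that is 2048 bytes
# instead of 4096 with one byte per block.
section = NibbleArray(16 * 16 * 16)
section.set(4, 3)
section.set(5, 9)
```

The saving looks small for one section, but it applies to every per-block field in every loaded chunk, which is why the bit counting matters.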
The 4-bit stuff is a hangover from Mojang having to squeeze every bit of perf they could out of their Java-based engine. Their original sound engine was so sketchy that the minimalist sound of C418 (the music composer) is partly down to the fact that it really couldn't handle much more than what got released.
MS has been loosening up on the 4-bit limit and has created a C++ variant of Minecraft which performs better, but they've also introduced their unified login garbage that has almost made me give up Minecraft completely.