This stuff dates back well before 2004.
For the non-power-of-two case, I just checked our own very old circular byte buffer library code. Using the notation from this article, it is:
int entriesAllocated() { return (wrPtr - rdPtr + 2*bufSize) % (2*bufSize); }
int remainingSpace() { return bufSize - entriesAllocated(); }
bool isEmpty() { return entriesAllocated() == 0; }
bool isFull() { return entriesAllocated() == bufSize; }
void incWr(int n) { wrPtr = (wrPtr + n) % (2*bufSize); }
void incRd(int n) { rdPtr = (rdPtr + n) % (2*bufSize); }
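Wrapping the pointers at 2*bufSize gives you an extra bit (beyond what is needed to represent bufSize), and that bit lets you disambiguate empty vs full: both states have rdPtr%bufSize == wrPtr%bufSize, but the pointers are equal when empty and bufSize apart when full. And if bufSize is a compile-time constant power of two (e.g. a C++ template parameter), then you can see how the modulo just compiles into a bitmask instead, like the author's version. You read and write the buffer at (rdPtr%bufSize) and (wrPtr%bufSize) respectively.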
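
To make that concrete, here is a minimal sketch of how those pieces might fit together. It is not our actual library code: the class name ByteRing and the push/pop helpers are made up for illustration, and I have assumed unsigned indices and a std::vector for storage. Only wrPtr, rdPtr, bufSize and the four query functions follow the notation above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration: ByteRing, push and pop are assumed names,
// not taken from any real library.
class ByteRing {
public:
    explicit ByteRing(std::size_t size) : bufSize(size), data(size) {
        assert(size > 0);  // a zero-sized buffer would make the modulus zero
    }

    // Pointers live in [0, 2*bufSize), so their difference modulo 2*bufSize
    // ranges over 0..bufSize and distinguishes empty (0) from full (bufSize).
    std::size_t entriesAllocated() const {
        return (wrPtr + 2 * bufSize - rdPtr) % (2 * bufSize);
    }
    std::size_t remainingSpace() const { return bufSize - entriesAllocated(); }
    bool isEmpty() const { return entriesAllocated() == 0; }
    bool isFull() const { return entriesAllocated() == bufSize; }

    // Store one byte; returns false if the buffer is full.
    bool push(unsigned char b) {
        if (isFull()) return false;
        data[wrPtr % bufSize] = b;            // storage index wraps at bufSize
        wrPtr = (wrPtr + 1) % (2 * bufSize);  // pointer wraps at 2*bufSize
        return true;
    }

    // Fetch the oldest byte; returns false if the buffer is empty.
    bool pop(unsigned char& out) {
        if (isEmpty()) return false;
        out = data[rdPtr % bufSize];
        rdPtr = (rdPtr + 1) % (2 * bufSize);
        return true;
    }

private:
    std::size_t bufSize;
    std::vector<unsigned char> data;
    std::size_t wrPtr = 0;
    std::size_t rdPtr = 0;
};
```

Note that all bufSize slots are usable here, unlike the variant that keeps one slot empty to tell full from empty. With unsigned indices and a power-of-two bufSize, a compiler can typically turn both modulo operations into masks.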