> (2 bits per color? how is that possible).
this is probably a rhetorical question, but lemme answer anyways: By packing the colour channels into a single byte. So, for example, you'd have RRGGBBAA within a single byte, for each pixel. Giving you 64 possible colours, with 4 steps of alpha.
Or if you don't need to have alpha, you could pack it even further down to RRGGBB in a byte, which leaves 2 bits left over for the next pixel. Via that, you can pack four pixels worth of colour data into 3 bytes: RRGGBBRR|GGBBRRGG|BBRRGGBB (italics for delineting pixels, vertical bars for delineting bytes)
The latter is a tradeoff between compression and a more complex accessing pattern.
A bit of a tangent, some system used RRRGGGBB for colours, because the eyes are the least sensitive to differentiating the amount of blue, so that's another way to use up a full byte per pixel.
So, first of all, of course, a rhetorical question. Modern interns will probably assume at least 4 bytes per pixel (R, G, B, and A).
But the original post actually talks about CGA [1] with just four colors. Encoding a color needs two bits then, so each byte encodes four pixels.
[1]: https://en.wikipedia.org/wiki/Color_Graphics_Adapter