So, for a 3x3 image, would the input data be 9 values like this?
R G B
B R G
G B R

This depends on the camera and the sensor's Bayer filter [0]. For example, quad Bayer uses a 4x4 pattern like:
G G R R
G G R R
B B G G
B B G G
[0]: https://en.wikipedia.org/wiki/Bayer_filter

In the example ("let's color each pixel ...") the layout is:
R G
G B
Then at a later stage the image is green because "There are twice as many green pixels in the filter matrix".
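One way to check the "twice as many green" claim is to tile the 2x2 layout over a sensor and count the filters. This is only a rough sketch: `cfa_color` and `count_colors` are names made up for illustration, and the tile is the R G / G B layout from the example.

```python
from collections import Counter

# 2x2 Bayer tile from the example (R G on top, G B below).
TILE = [["R", "G"],
        ["G", "B"]]

def cfa_color(row, col, tile=TILE):
    """Return the filter color over the photosite at (row, col)."""
    return tile[row % len(tile)][col % len(tile[0])]

def count_colors(height, width, tile=TILE):
    """Count the photosites of each color on a height x width sensor."""
    return Counter(cfa_color(r, c, tile)
                   for r in range(height)
                   for c in range(width))

counts = count_colors(6, 6)
print(counts["R"], counts["G"], counts["B"])  # 9 18 9
```

Half of the photosites are green and a quarter each are red and blue, so an image that weights every photosite equally comes out with a green cast.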
If you want a 3x3 color image, you would need a 6x6 grid of Bayer filter pixels.
Each RGB pixel would be a 2x2 grid of
```
G R
B G
```
So G appears twice as often as each of the other colors (this is mostly the same for both screen and sensor technology).
There are different ways to lay out the color filters for screens and sensors (Fuji's X-Trans sensors use a different layout, for example).
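To make the 6x6 to 3x3 mapping concrete, here is a minimal sketch that collapses each 2x2 G R / B G cell into one RGB pixel by averaging the two greens. The function name `bin_to_rgb` and the sample values are made up for illustration. Real cameras typically interpolate (demosaic) instead of binning, so take this only as the simple scheme described here.

```python
def bin_to_rgb(raw):
    """Collapse each 2x2 GRBG cell of `raw` into one (R, G, B) pixel.

    `raw` is a list of rows of sensor values with even height and width.
    """
    rgb = []
    for r in range(0, len(raw), 2):
        row = []
        for c in range(0, len(raw[0]), 2):
            g1 = raw[r][c]          # top-left: green
            red = raw[r][c + 1]     # top-right: red
            blue = raw[r + 1][c]    # bottom-left: blue
            g2 = raw[r + 1][c + 1]  # bottom-right: green
            row.append((red, (g1 + g2) / 2, blue))
        rgb.append(row)
    return rgb

# Made-up 6x6 raw mosaic: the value at (r, c) is r * 6 + c.
raw = [[r * 6 + c for c in range(6)] for r in range(6)]
image = bin_to_rgb(raw)
print(len(image), len(image[0]))  # 3 3
print(image[0][0])                # (1, 3.5, 6)
```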