We use them today - terminal emulators. They mediate between bytes and pixels.
If colours were delivered via a sideband, you wouldn't have to know whether the other side was a terminal to disable colours. You could send colours to a file and they wouldn't be stored - or would be stored in RTF format, if you were sending to an RTF file.
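For contrast, here's roughly what programs have to do today because colours travel in-band with the text. This is only a minimal sketch: `colourise` is a made-up helper name, and the escape codes are the usual SGR "blue foreground" and "reset" sequences.

```python
import io
import sys

BLUE = "\x1b[34m"   # SGR: set foreground colour to blue
RESET = "\x1b[0m"   # SGR: reset all attributes


def colourise(text, stream):
    """Wrap text in colour codes only if the stream claims to be a terminal.

    Hypothetical helper: the sender has to guess what is on the other side,
    because the colour information is mixed into the byte stream itself.
    """
    if stream.isatty():
        return f"{BLUE}{text}{RESET}"
    return text


if __name__ == "__main__":
    # Coloured when run interactively, plain when piped or redirected.
    sys.stdout.write(colourise("hello", sys.stdout) + "\n")

    # An in-memory "file" is not a terminal, so no escape codes are stored.
    buf = io.StringIO()
    buf.write(colourise("hello", buf))
    print(repr(buf.getvalue()))
```

With a sideband, that `isatty()` guess wouldn't be needed: the receiver would decide what to do with the colour information, not the sender.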
The design we use on Linux is very "worse is better". Some mechanisms were developed because they could be developed, and those mechanisms, because they were the ones available, were made to fulfil every purpose they could fulfil, and now we're locked into this design for better or worse.
Windows has long had APIs to set text colour directly. You could set the colour to blue and print some text and it would be blue. You could call a function on a console handle to ask how big the console window was, or to change it. This obviously doesn't compose through pipes or ssh, but Windows didn't have a pipe culture or ssh culture, so that was never a design criterion. Microsoft has since steered developers away from those APIs (they still work, but are no longer the recommended path) and towards the worse-is-better escape-code design, in order to increase compatibility with Linux.
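To make the two designs concrete, here's a rough sketch of both. The function names are mine, and the out-of-band half only works on Windows with a real console attached. It also assumes the console's default attribute is light grey (0x0007) rather than saving and restoring the original, which a careful program would do via `GetConsoleScreenBufferInfo`.

```python
import sys

STD_OUTPUT_HANDLE = -11    # Win32 constant for the standard output handle
FOREGROUND_BLUE = 0x0001   # Win32 console character attribute
DEFAULT_GREY = 0x0007      # red | green | blue; assumed default, see note above


def blue_in_band(text):
    """Escape-code design: colour is part of the byte stream."""
    return f"\x1b[34m{text}\x1b[0m"


def print_blue_out_of_band(text):
    """Classic Windows console API: colour is set by a separate call.

    Windows-only sketch. The text that gets written contains no colour
    information at all.
    """
    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.windll.kernel32
    kernel32.GetStdHandle.restype = wintypes.HANDLE
    kernel32.SetConsoleTextAttribute.argtypes = [wintypes.HANDLE, wintypes.WORD]

    handle = kernel32.GetStdHandle(STD_OUTPUT_HANDLE)
    kernel32.SetConsoleTextAttribute(handle, FOREGROUND_BLUE)
    print(text, flush=True)  # flush before the attribute is changed back
    kernel32.SetConsoleTextAttribute(handle, DEFAULT_GREY)


if __name__ == "__main__":
    if sys.platform == "win32":
        print_blue_out_of_band("hello")
    else:
        print(blue_in_band("hello"))
```

The in-band version survives a pipe or an ssh session because it's just bytes. The out-of-band version doesn't, because the colour change is a call against a local console handle that the other end of the pipe never sees.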