No, LLMs only do this for language. They don't try to do this for arbitrary data.
There are many ways around this, the simplest being to treat bytes as tokens (cf. Google's ByT5[1]). See also BLT[2] from Meta and ByteFormer[3] from Apple.
[1]: https://arxiv.org/abs/2105.13626
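To make "bytes as tokens" concrete, here is a minimal sketch of a byte-level tokenizer. The names (`encode`, `decode`, `SPECIAL_OFFSET`, `EOS_ID`) are made up for illustration, and the offset of 3 for reserved special ids is my assumption about a ByT5-style layout, not something taken from the paper's code.

```python
# Hypothetical byte-level tokenizer: each byte value becomes one token id.
# Assumption: ids 0-2 are reserved for special tokens (pad, eos, unk),
# so byte value b maps to id b + 3. Vocabulary size is then 256 + 3.
SPECIAL_OFFSET = 3
EOS_ID = 1


def encode(data: bytes) -> list:
    """Map raw bytes to token ids and append an end-of-sequence id."""
    return [b + SPECIAL_OFFSET for b in data] + [EOS_ID]


def decode(ids: list) -> bytes:
    """Drop special ids and map the remaining ids back to bytes."""
    return bytes(i - SPECIAL_OFFSET for i in ids if i >= SPECIAL_OFFSET)


# Works the same for UTF-8 text and for arbitrary binary data.
text_ids = encode("héllo".encode("utf-8"))
png_ids = encode(bytes([0x89, 0x50, 0x4E, 0x47]))  # first four bytes of a PNG header
```

The point is that nothing here is language-specific: the model only ever sees integers in a fixed range, at the cost of much longer sequences than a subword tokenizer would produce.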
Transformers do this for any stream of tokens. Those tokens can map to anything you want, and you will still get lossy compression. Human-written text just happens to be dense, readily available, and a useful prior; it is not intrinsically required. See 3D vision transformers, for example.
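As a rough sketch of how non-text data becomes a token stream, the snippet below cuts a 3D volume into non-overlapping cubes and flattens each cube into one vector, which is the usual ViT-style patching idea extended to three dimensions. `patchify_3d` is a hypothetical helper, not any library's API; a real model would then project each patch vector to an embedding with a learned linear layer.

```python
def patchify_3d(volume, p):
    """Split a D x H x W nested list into flattened p x p x p patches.

    Returns a list of patches in (z, y, x) raster order; each patch is a
    flat list of p**3 values and plays the role of one "token".
    """
    d, h, w = len(volume), len(volume[0]), len(volume[0][0])
    if d % p or h % p or w % p:
        raise ValueError("each dimension must be divisible by the patch size")
    patches = []
    for z in range(0, d, p):
        for y in range(0, h, p):
            for x in range(0, w, p):
                patches.append([
                    volume[z + dz][y + dy][x + dx]
                    for dz in range(p)
                    for dy in range(p)
                    for dx in range(p)
                ])
    return patches


# Toy 4 x 4 x 4 volume whose values encode their own coordinates.
volume = [[[z * 16 + y * 4 + x for x in range(4)] for y in range(4)] for z in range(4)]
tokens = patchify_3d(volume, 2)  # 8 tokens, each a vector of 8 values
```

From the transformer's side this sequence looks no different from a sentence: a list of vectors with positions attached.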