If the model itself contains any redundancy, it can be compressed too, much as DEFLATE run-length encodes the Huffman code lengths it stores for each dynamic block. That step isn't necessary, though, if the model is trained on the input dynamically, as Bellard's NNCP does: the decoder repeats the same training on the data it has already decoded, so no model has to be transmitted.
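To make the "trained on the input dynamically" idea concrete, here is a minimal sketch using a toy order-0 byte-frequency model in place of a neural network. It is not NNCP's actual method; `AdaptiveByteModel` and `ideal_code_length_bits` are hypothetical names for illustration. The point is only that each symbol is costed with a model built from earlier symbols, which the decoder can rebuild on its own.

```python
import math


class AdaptiveByteModel:
    """Toy order-0 model (an assumption for illustration, not NNCP's network)."""

    def __init__(self):
        # Every byte value starts with count 1, so no symbol has probability 0.
        self.counts = [1] * 256
        self.total = 256

    def prob(self, byte):
        # Probability estimate based only on symbols seen so far.
        return self.counts[byte] / self.total

    def update(self, byte):
        # "Training" step: the decoder runs the same update after decoding byte.
        self.counts[byte] += 1
        self.total += 1


def ideal_code_length_bits(data):
    """Bits an ideal entropy coder would spend when driven by the adaptive model."""
    model = AdaptiveByteModel()
    bits = 0.0
    for byte in data:
        bits += -math.log2(model.prob(byte))  # cost depends on past symbols only
        model.update(byte)                    # then learn from the current symbol
    return bits
```

Because the model state at every step depends only on bytes that have already been coded, encoder and decoder stay in sync without any model being stored in the output. The cost is that the first symbols are coded with an untrained model.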