I had Claude implement the random-Huffman-trees strategy and it works alright (~20MB/s decompression speed), but a minimal Huffman tree whose block encodes only the end symbol works out even slower (~10MB/s), presumably because each tree is more compact, so the decoder has to rebuild more tables per byte of input.
The minimal version boils down to:
bytes.fromhex("04c001090000008020ffaf96") * 1000000 + b"\x03\x00"
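For anyone who wants to reproduce the number, here is a rough sketch of how the payload could be timed. It assumes the bytes are a raw DEFLATE stream (no zlib or gzip wrapper, hence `wbits=-15`) and that the speed is measured against the compressed input size. The names `build_payload` and `measure` are just made up for this example, and the throughput will vary by machine and zlib build.

```python
import time
import zlib

# 12-byte unit from the post: one non-final dynamic-Huffman block
# whose data section is just the end-of-block symbol.
BLOCK = bytes.fromhex("04c001090000008020ffaf96")
# Final fixed-Huffman block containing only end-of-block.
FINAL = b"\x03\x00"


def build_payload(repeats: int) -> bytes:
    """Repeat the minimal block `repeats` times and terminate the stream."""
    return BLOCK * repeats + FINAL


def measure(repeats: int = 1_000_000):
    """Return (decompressed length, MB of compressed input per second)."""
    payload = build_payload(repeats)
    start = time.perf_counter()
    # Assumption: raw DEFLATE, so a negative wbits skips header/trailer checks.
    out = zlib.decompress(payload, -15)
    # Guard against a zero reading on very small inputs.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return len(out), len(payload) / elapsed / 1e6


if __name__ == "__main__":
    out_len, mb_per_s = measure()
    print(f"decompressed {out_len} bytes at {mb_per_s:.1f} MB/s of input")
```

The output is empty, since every block ends immediately, so all the time goes into parsing block headers and rebuilding the Huffman tables.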