Amazingly, I just did the same thing, only with AISHELL! It needs work. I used the encoder from Meta's MMS model.
https://github.com/sequoia-hope/mandarin-practice