Pretty interesting. The posterior matching is a big deal, but I'm not convinced by the hand-waving required to demonstrate it in larger models. I'm interested in seeing how direct EM training scales, though.