New modelling ideas are being tested and yielding results every month. We've moved quite a long way from the original multi-head attention.