The linked paper tested nanoGPT with this new transformer:
https://www.techrxiv.org/users/685780/articles/1375955-topol...
Thanks for linking.
Yes, the paper compares the new architecture (which is itself a fork of my nanoGPT implementation) with Karpathy's nanoGPT. It also links to the code and the benchmark used.