Maybe the article originally featured a 1000-line C implementation.
I was basing this more on the fact that you don't have to look at C code to understand that non-cached transformer inference is going to be super slow.
I don't see how that would be possible given the contents of the article.
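For concreteness, here's a rough back-of-envelope sketch (mine, not from the article) of why skipping the KV cache hurts. It counts query-key dot products for one attention head: without a cache, step t re-runs attention over the whole prefix (t² dots); with a cache, the new token's query only hits the t cached keys (t dots).

```python
def attention_dots_no_cache(num_tokens: int) -> int:
    # Without a cache, generating token t re-runs attention over the whole
    # prefix: all t positions attend to all t positions -> t*t dot products.
    return sum(t * t for t in range(1, num_tokens + 1))

def attention_dots_with_cache(num_tokens: int) -> int:
    # With a KV cache, step t computes only the new query against the
    # t cached keys -> t dot products.
    return sum(t for t in range(1, num_tokens + 1))

if __name__ == "__main__":
    n = 1024
    ratio = attention_dots_no_cache(n) // attention_dots_with_cache(n)
    print(ratio)  # -> 683, i.e. ~(2n+1)/3 times more work without the cache
```

So the uncached version does O(n³) total attention work versus O(n²) with the cache, and the gap grows linearly with sequence length, which is why you can predict the slowdown without ever reading the C.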