ahmadyan • yesterday at 11:54 PM • 2 replies

As a hacker, I kinda like Noam's code. I had to implement a TC MoE kernel and stumbled upon his code from [tensor2tensor](https://github.com/tensorflow/tensor2tensor/blob/master/tens...), and I think "alchemy" is justified. Dude writes some beautiful kernels.

He also saw that LLMs would replace search before anyone else did, and it is really something to look at LaMDA's or GPT-1's output and think: yeah, this will answer all of our questions one day.

Replies

jvican • today at 12:20 AM

There's no doubt about Noam's abilities. But I read through that code and struggle to see its 'magic' or 'alchemy'. Can you elaborate on what you find especially good about it? (You may assume GPU kernel programming knowledge on my end.)

eli_gottlieb • today at 12:18 AM

Also, evaluating complicated functions with numerical stability and automatic differentiation is hard.
