Hacker News

Train Your Own LLM from Scratch

54 points by kristianpaul today at 4:09 AM | 7 comments

Comments

jvican today at 5:22 AM

If you're interested in this resource, I highly recommend checking out Stanford's CS336 class. It covers the same curriculum in a lot more depth and introduces you to many of the theoretical aspects (scaling laws, intuitions) and to systems thinking (kernel optimization/profiling). To get that, you have to do the assignments, of course... https://cs336.stanford.edu/

hiroakiaizawa today at 5:48 AM

Nice. What scale does this realistically reach on a single machine?

baalimago today at 5:28 AM

Train your LM from scratch*

I doubt you have a machine big enough to make it "Large".

iamnotarobotman today at 4:55 AM

This looks great as a first introduction to training LLMs, and it looks simple enough to try locally. Great job!