Train your LM from scratch*
I doubt you have a machine big enough to make it "Large".
Hey now! I've got a half terabyte of RAM at my disposal! I mean, it's DDR4 but... it's RAM!
And it's paired with 48 processor cores! I mean, they don't even support AVX512 but they can do math!
I could totally train an LLM! Or at least my family could... might need my kid to pick up and carry on the project.
But in all seriousness... you either missed the point, are being needlessly pedantic, or are... wrong?
This is about learning the concepts; the rest is mostly moot.
On the pedantic-or-wrong note: what is the documented cut-off for a "large" language model? GPT-2 was, and still is, described as a "large" language model, and it had 1.5B parameters. These days you can get a consumer GPU capable of training a model that size for about $400.
You can fully train a 1.6B model on a single 3090. That's a reasonably big model.
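For anyone wondering why 1.6B params squeaks onto a 24 GB card, here's a rough back-of-envelope sketch. The assumptions are mine, not from the thread: Adam optimizer, mixed-precision training, and activations ignored (gradient checkpointing keeps those manageable). Exact numbers vary by framework.

```python
# Back-of-envelope training-memory estimate for a 1.6B-parameter model.
# Assumed setup (not from the thread): Adam, mixed precision, activations ignored.

params = 1.6e9
GB = 1024**3

weights_fp16 = params * 2           # fp16 model weights
grads_fp16   = params * 2           # fp16 gradients
adam_fp32    = params * 4 * 3       # fp32 master weights + Adam m and v states
adam_8bit    = params * 1 * 2 + params * 4  # 8-bit m/v states + fp32 master weights

naive = (weights_fp16 + grads_fp16 + adam_fp32) / GB
lean  = (weights_fp16 + grads_fp16 + adam_8bit) / GB

print(f"fp16 + fp32 Adam : ~{naive:.1f} GB")  # ~23.8 GB: at the card's limit, no room for activations
print(f"fp16 + 8-bit Adam: ~{lean:.1f} GB")   # ~14.9 GB: fits, with headroom for activations
```

So the claim looks plausible, but only with the usual tricks: mixed precision, an 8-bit optimizer (or optimizer offloading), gradient checkpointing, and small micro-batches.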