> we’re not there yet, in part because of how much more powerful connected frontier models are
Is that why though? You need a beast of a machine to run a functional local model in my experience.
I think the big part is there’s significant sticker shock to buying capable hardware.
That said,
> weekend. I chose to try fine-tuning on two models, Llama 3.1 8B Instruct and Qwen 2.5 7B Instruct. At their size (around 8B) they run comfortably on a MacBook Air
Perhaps I spoke too soon?
Anyway
> I chose the Microsoft collection as the source of training materials. The collection contains out-of-print docs published between 1977 and 2005: more than 37 million words, covering old systems and SDKs
this strikes me as a very specific brand of 1995’s prose, spanning about 30 years. It’s a cool article though, so maybe that’s a forgivably clickbaity title.
> this strikes me as a very specific brand of 1995’s prose, spanning about 30 years.
It's probably a fair approach to say the significant influence (training dataset) on writing at a particular time is the preceeding 30 years' material? It's certainly not only what's already written that year (nor anything since).
Running models locally is surprisingly easy and possible even on older hardware.
Obviously not the largest, up-to-date models but for what I expect most people use them for, even on hn, there are some shockingly good models that dont require €4k machines.
I have a desktop with an AMD 6900XT and 5600 with 32GB ram. Obviously no slouch but its several years old at this point. I can comfortably run qwen 3.5 9b and get a speedy 60 token/sec output with decent results.