mock-possum • today at 7:04 AM • 2 replies

> we’re not there yet, in part because of how much more powerful connected frontier models are

Is that why though? You need a beast of a machine to run a functional local model in my experience.

I think the big part is there’s significant sticker shock to buying capable hardware.

That said,

> weekend. I chose to try fine-tuning on two models, Llama 3.1 8B Instruct and Qwen 2.5 7B Instruct. At their size (around 8B) they run comfortably on a MacBook Air

Perhaps I spoke too soon?

Anyway

> I chose the Microsoft collection as the source of training materials. The collection contains out-of-print docs published between 1977 and 2005: more than 37 million words, covering old systems and SDKs

this strikes me as a very specific brand of 1995’s prose, spanning about 30 years. It’s a cool article though, so maybe that’s a forgivably clickbaity title.

Replies

mschild • today at 7:16 AM

Running models locally is surprisingly easy and possible even on older hardware.

Obviously not the largest, most up-to-date models, but for what I expect most people use them for, even on HN, there are some shockingly good models that don't require €4k machines.

I have a desktop with an AMD 6900XT and 5600 with 32GB RAM. Obviously no slouch, but it's several years old at this point. I can comfortably run Qwen 3.5 9B and get a speedy 60 tokens/sec output with decent results.

OJFord • today at 7:19 AM

> this strikes me as a very specific brand of 1995’s prose, spanning about 30 years.

It's probably a fair approach to say the significant influence (training dataset) on writing at a particular time is the preceding 30 years' material? It's certainly not only what's already written that year (nor anything since).
