That’s pretty awesome!
Though only 5 GbE? Can’t they do USB-C / Thunderbolt 40 Gb/s connections like Macs?
I set up ollama today and can barely run a 3B-parameter model before the lag makes it unbearable.
How much is one of these gonna run me?
The setup was around $10k, but maybe more now with mem/ssd prices.
This is a good list. I like my Beelink a lot; my Minisforum likes to turn itself off every couple of weeks, and I'm not sure why yet.
https://www.techradar.com/pro/there-are-15-amd-ryzen-ai-max-...
---
Performance is pretty bad (<10 tps) and context is quite limited. Still good to see progress.
Prompt Size (tokens) | TTFT (s), Flash Attention Disabled | TTFT (s), Flash Attention Enabled
4096 | 53.7 | 39.7
8192 | Out of Memory (OOM) | 90.5
16384 | Out of Memory (OOM) | 239.1
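As a rough sanity check (my own arithmetic, not from the post), the flash-attention-enabled TTFT numbers imply a fairly consistent prefill throughput, assuming TTFT is dominated by prompt processing:

```python
# Estimate prefill throughput from the reported prompt sizes and TTFTs
# (FA-enabled column; assumes TTFT ~= prompt-processing time, my assumption).
rows = [(4096, 39.7), (8192, 90.5), (16384, 239.1)]  # (prompt tokens, TTFT in s)
for tokens, ttft in rows:
    print(f"{tokens:>6} tokens: {tokens / ttft:6.1f} tok/s prefill")
```

Note the throughput drops at the longest prompt, consistent with attention cost growing with context length.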
Framework has gone fully down the Apple consumerization route of unrepairability and unupgradeability, with a nonstandard machine, soldered-on RAM, and no meaningful PCIe slots. There's only the superficial appearance of longevity and future-proofing when it's really yet another silo. There's no way to add IB, FC, or 100/400 GbE NICs to these machines. 5 GbE is a joke. Non-ECC RAM is a joke.
Cool that it's possible, but the performance characteristics are basically unusable. For an 8192-token prompt they report a ~1.5-minute time-to-first-token and then 8.30 tk/s from there. For context, ChatGPT is typically <<1 s TTFT and ~50 tk/s.
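To put those numbers together, here's a back-of-the-envelope end-to-end latency for a reply of 500 tokens (the reply length is my assumption; the TTFT and decode rate are the reported figures):

```python
ttft_s = 90.5        # reported TTFT for an 8192-token prompt, flash attention enabled
decode_rate = 8.30   # reported generation speed, tokens/s
reply_tokens = 500   # hypothetical reply length (my assumption, not from the post)

total_s = ttft_s + reply_tokens / decode_rate
print(f"~{total_s:.0f} s end to end for a {reply_tokens}-token reply")
```

That's roughly two and a half minutes for one answer, which is where the "basically unusable" assessment comes from.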