Hacker News

reisse · today at 12:42 AM

Nothing special?

I mean, the inference engine might need some tweaks to support whatever compute is available. But beyond that, if you add a few terabytes of disk for swap, and replace the RAM with bigger sticks where possible, it should work? Slowly, of course, but there's no reason it shouldn't.


Replies

reverius42 · today at 1:17 AM

The big difference will be measuring seconds per token instead of tokens per second.
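A rough back-of-envelope sketch of why: when the weights don't fit in RAM and spill to swap, each generated token has to stream (roughly) the whole model through memory, so token time is bounded below by model size divided by disk bandwidth. The numbers below are illustrative assumptions, not benchmarks.

```python
# Minimal sketch: lower-bound latency per token when weights are
# streamed from disk (swap) instead of resident in RAM/VRAM.
def seconds_per_token(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Assumes every weight is read once per generated token."""
    return model_bytes / bandwidth_bytes_per_s

# e.g. a ~400 GB model swapped over a ~3 GB/s NVMe drive (assumed figures)
spt = seconds_per_token(400e9, 3e9)
print(f"~{spt:.0f} s/token")  # on the order of two minutes per token
```

So "seconds per token" is optimistic; with random access patterns and swap overhead, the real figure would be worse than this streaming lower bound.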