Hacker News

gchamonlive, yesterday at 10:09 PM

There's also a serviceable sub-2k tier built around a single 3090. Run https://github.com/noonghunna/club-3090 with beellama: you get fast inference at the cost of a reduced 102k context window.