Hacker News

wmf today at 5:13 PM

That just sounds like a 3090.


Replies

cyanydeez today at 7:01 PM

Not at the VRAM sizes that determine how much context you can load; also, GPUs aren't as efficient as direct inference.
