I run it with llama.cpp on my RTX 3090. Also using the same Unsloth model. My config is similar to...

jyap • today at 4:24 PM

I run it with llama.cpp on my RTX 3090, also using the same Unsloth model.

I need to try some of the other setups mentioned in this repo to get higher TPS (tokens per second).
