Check out the GLM models; they are excellent.
MiniMax M2.1 rivals GLM 4.7 and fits in 128 GB with 100k context at 3-bit quantization.
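
For anyone who wants to sanity-check the 128 GB claim, here is a rough back-of-envelope sketch. The parameter count (`ASSUMED_PARAMS`, about 230B total) is my assumption and is not stated above, and the helper `quantized_weight_gb` is just an illustrative name. Real 3-bit quants usually average a bit more than 3.0 bits per weight because of scales and mixed-precision layers, so treat the result as a lower bound on weight size.

```python
def quantized_weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes), ignoring quantization metadata."""
    return n_params * bits_per_weight / 8 / 1e9


# ASSUMPTION: roughly 230e9 total parameters; swap in the real figure for your model.
ASSUMED_PARAMS = 230e9
MEMORY_BUDGET_GB = 128

weights_gb = quantized_weight_gb(ASSUMED_PARAMS, 3.0)
# Whatever is left has to hold the KV cache for the context, plus runtime overhead.
headroom_gb = MEMORY_BUDGET_GB - weights_gb

print(f"weights ~ {weights_gb:.2f} GB, headroom ~ {headroom_gb:.2f} GB")
```

Under those assumptions the weights take most of the budget, and the remainder is what the 100k-token KV cache has to fit into. The KV cache size itself depends on layer count, KV heads, and cache precision, which are not given here, so it is left out.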