I really want to know what M, K, XL, and XS mean in this context and how to choose.
I searched all of the Unsloth docs and there seems to be no explanation at all.
They are different quantization types. You can read more here: https://huggingface.co/docs/hub/gguf#quantization-types
Just start with q4_k_m and figure out the rest later.
Q4_K is a type of quantization. It means that all weights will be stored with at least 4 bits, using the K method.
But if you're willing to give more bits to only certain important weights, you get to preserve a lot more quality for not that much more space.
The S/M/L/XL suffix tells you how many tensors get to use more bits, from fewest (S) to most (XL).
The difference between S and M is generally noticeable (on benchmarks). The difference between M and L/XL is less so, let alone in real use (ymmv).
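To put rough numbers on the "more bits for some weights, not much more space" trade-off, here's a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the 7B parameter count, the 6-bit width for promoted tensors, and the promoted fractions behind the "S-like/M-like/L-like" labels are made up, not the real llama.cpp recipes. It also ignores block scales and file metadata, so real files come out somewhat larger.

```python
def average_bits(base_bits: float, extra_bits: float, fraction_promoted: float) -> float:
    """Average bits per weight when a fraction of the weights is stored at a higher width."""
    if not 0.0 <= fraction_promoted <= 1.0:
        raise ValueError("fraction_promoted must be between 0 and 1")
    return (1.0 - fraction_promoted) * base_bits + fraction_promoted * extra_bits


def size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size in GB (10^9 bytes), ignoring metadata and per-block scales."""
    return n_params * bits_per_weight / 8 / 1e9


# Hypothetical numbers, for illustration only.
N_PARAMS = 7e9       # assumed model size
BASE_BITS = 4        # nominal width of the base quantization
PROMOTED_BITS = 6    # assumed width for the "important" tensors

for label, fraction in [("S-like", 0.05), ("M-like", 0.15), ("L-like", 0.30)]:
    bpw = average_bits(BASE_BITS, PROMOTED_BITS, fraction)
    print(f"{label}: {bpw:.2f} bits/weight, ~{size_gb(N_PARAMS, bpw):.2f} GB")
```

The point is just that the average is a weighted mix: promoting a small share of the weights moves the bits-per-weight figure, and so the file size, by a small amount.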
Here's an example of the contents of a Q4_K_: