The Q4 quantization needs about 600 GB of RAM for the weights alone, before any context, which is not exactly consumer-hardware friendly.
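
For a rough sense of where a number like that comes from, here is a back-of-the-envelope sketch. The parameter count and bits-per-weight values are assumptions picked for illustration, not figures from the model in question; 4-bit formats generally store somewhat more than 4 bits per weight once per-block scales are included.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in decimal GB.

    Ignores the KV cache / context, activations, and runtime overhead,
    so the real footprint is higher.
    """
    bytes_total = n_params * bits_per_weight / 8  # bits -> bytes
    return bytes_total / 1e9                      # bytes -> GB


if __name__ == "__main__":
    # Hypothetical inputs: ~1 trillion parameters at ~4.8 bits per weight.
    n_params = 1e12          # assumption, not from the post
    bits_per_weight = 4.8    # assumption, effective rate incl. scales
    print(f"{weight_memory_gb(n_params, bits_per_weight):.0f} GB")
```

Context comes on top of this, and how much it adds depends on the architecture and the context length you actually run.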