Could be amazing, but it's hard to judge whether it will really work with, say, a 27B model or larger. We can already get pretty good speed with a 2B model.
Thanks! We explain how it scales to larger models in the last section of the OP blog post.