The site claims 14x less memory usage, and I'm a bit confused by that. The model file is indeed very small, but on my machine (running on CPU) it used roughly the same RAM as 4-bit quants.
Then again, I couldn't get coherent English output from it, so maybe something went wrong when I ran it.