BLAS and faer are used only for small corners of the API (linalg and fft), which is exactly what NumPy does. I encourage you to follow your own advice and look more closely at the interaction of ufuncs, strides, and dtypes.
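As a rough illustration of that interaction, here is a minimal NumPy sketch. It is not taken from either library's code, and the array shapes and variable names are made up for the example. A single ufunc call has to walk a non-contiguous view, broadcast the second operand, and choose a loop for the mixed dtypes, and none of that goes through BLAS.

```python
import numpy as np

# A non-contiguous view: every other column of a C-ordered int32 array.
a = np.arange(12, dtype=np.int32).reshape(3, 4)
view = a[:, ::2]                      # shape (3, 2), no data copied

# A float64 operand that must be broadcast across the rows of the view.
b = np.ones(2, dtype=np.float64)

# One ufunc call handles strides, broadcasting, and dtype promotion together.
out = np.add(view, b)

print(view.strides)                   # byte steps per axis of the view
print(view.flags["C_CONTIGUOUS"])     # the view is not contiguous
print(out.dtype)                      # int32 + float64 promotes to float64
```

Most of an array library's surface area looks like this call rather than like a matrix multiply, which is why the ufunc machinery matters more than the linalg backend.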