Note that by default rustc targets the baseline x86-64 level (x86-64-v1) when compiling for x86-64, and that baseline lacks the `popcnt` instruction. You need to raise the target CPU to at least x86-64-v2 (`-Ctarget-cpu=x86-64-v2`) or enable the `popcnt` target feature (`-Ctarget-feature=+popcnt`). Otherwise, even if your CPU is relatively new and you only intend to run your code on relatively new CPUs, rustc will still generate the older and slower fallback for count_ones() using bit shifts and masks. That said, I don't see the point in writing the shifts and masks manually if the compiler can generate them for you.
It's not unreasonable to think that Rust will eventually raise the minimum version, but you should always override the target CPU anyway when building production binaries with C++-like toolchains (`-Ctarget-cpu` for rustc, `-march=` for clang/gcc).
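To make that concrete, here is a rough sketch. The function names are mine, and the SWAR version is only an approximation of the kind of shift-and-mask sequence the fallback uses, not a copy of rustc's actual output. The flags in the comments are the ones mentioned above.

```rust
// Build with the default baseline (no popcnt):
//   cargo build --release
// Build with popcnt available (either flag should do):
//   RUSTFLAGS="-Ctarget-cpu=x86-64-v2" cargo build --release
//   RUSTFLAGS="-Ctarget-feature=+popcnt" cargo build --release

/// Just use the standard method; the compiler picks `popcnt` when the
/// target allows it and a shift/mask sequence otherwise.
pub fn popcount(x: u64) -> u32 {
    x.count_ones()
}

/// Hand-written SWAR popcount, for comparison only (hypothetical helper).
pub fn popcount_swar(mut x: u64) -> u32 {
    // Count bits in each 2-bit pair.
    x = x - ((x >> 1) & 0x5555_5555_5555_5555);
    // Sum pairs into 4-bit nibbles.
    x = (x & 0x3333_3333_3333_3333) + ((x >> 2) & 0x3333_3333_3333_3333);
    // Sum nibbles into bytes.
    x = (x + (x >> 4)) & 0x0f0f_0f0f_0f0f_0f0f;
    // Add all bytes together; the total ends up in the top byte.
    (x.wrapping_mul(0x0101_0101_0101_0101) >> 56) as u32
}

/// True if this crate was compiled with the `popcnt` feature enabled.
pub fn popcnt_enabled_at_compile_time() -> bool {
    cfg!(target_feature = "popcnt")
}
```

Both functions return the same result. The only difference you'd see is in the generated assembly, which is why I'd stick with count_ones() and fix the build flags instead.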