That is a fair assessment. Maintaining read/write pos and peek them at every operation is a big performance hit. The impact is amplified if each invocation needs a syscall. That is exactly what futexes address: Allowing spin locks to remain in user space and avoid entering the kernel as long as contention is low.
In JavaScript, atomic operations are relatively lightweight, so their overhead is likely acceptable. Given that, I am open to adjusting my code to your suggested approach and seeing how it performs in practice.