If you want the write to be a synchronization point after which all threads observe only the new value, that's only possible with a shared mutex. Of course, you can use a barrier to accomplish that instead, but something like hazard pointers or RCU doesn't synchronize by itself.
Not an expert, but can’t you get synchronization like this just by using release/acquire memory order with C11 atomic stores and loads?
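For reference, here is a minimal sketch of the release/acquire pattern being asked about, written in C++ rather than C11. The names `payload`, `ready`, `writer`, `reader`, and `run_once` are made up for illustration. A reader whose acquire load observes the flag is guaranteed to see the payload written before it; the pattern by itself says nothing about a reader that loaded the flag before the store happened.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, for illustration only.
int payload = 0;                 // plain, non-atomic data
std::atomic<bool> ready{false};  // flag that publishes the payload

void writer() {
    payload = 42;                                  // 1. write the data
    ready.store(true, std::memory_order_release);  // 2. publish it
}

int reader() {
    // Spin until the acquire load observes the release store.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    // The release store synchronizes with the acquire load above,
    // so the write to payload happens-before this read.
    return payload;
}

// Runs one writer and one reader and returns what the reader saw.
int run_once() {
    payload = 0;
    ready.store(false);
    int seen = -1;
    std::thread r([&] { seen = reader(); });
    std::thread w(writer);
    w.join();
    r.join();
    return seen;
}
```

Calling `run_once()` returns 42 regardless of which thread starts first, because the reader does not touch `payload` until it has seen `ready` set.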
I think it's possible with an atomic<shared_ptr> too (C++20)?
A shared_mutex comes in handy when you can't really have multiple copies of the shared data, perhaps because of memory usage, so readers can't acquire it while the writer is updating the data.
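A rough sketch of that single-copy case, assuming C++17 `std::shared_mutex`. The `SharedTable` class and the `readers_never_see_partial_update` helper are hypothetical, written only to illustrate the idea: there is one buffer, the writer mutates it in place under an exclusive lock, and readers hold a shared lock, so they never see a half-finished update.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

// Hypothetical wrapper: one copy of a large table, updated in place.
class SharedTable {
public:
    explicit SharedTable(std::size_t n) : data_(n, 0) {}

    // Readers lock in shared mode, so many of them can run at once.
    long sum() const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        long total = 0;
        for (int v : data_) total += v;
        return total;
    }

    // The writer locks exclusively and overwrites the only copy.
    void fill(int value) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        for (int& v : data_) v = value;
    }

private:
    mutable std::shared_mutex mutex_;
    std::vector<int> data_;
};

// Returns true if a concurrent reader only ever saw the table
// fully old (all 0) or fully new (all 7), never a mixture.
bool readers_never_see_partial_update() {
    const std::size_t n = 100000;
    SharedTable table(n);
    std::atomic<bool> torn{false};
    std::thread reader([&] {
        for (int i = 0; i < 200; ++i) {
            long s = table.sum();
            if (s != 0 && s != static_cast<long>(n) * 7) torn = true;
        }
    });
    table.fill(7);
    reader.join();
    return !torn && table.sum() == static_cast<long>(n) * 7;
}
```

The trade-off is that readers block for the whole duration of `fill`, which is the price of not keeping a second copy around.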
This is true, but it is a subset of designs using shared_mutex.