Interesting! I read the underlying paper, and it looks to me like a variant of log-structured merge (LSM) trees, with at most 1 segment of each power-of-2 size, and all segments stored in a single array W. The 1-bits in the totalSize variable tell you exactly which of these segments are "active" (in use), which is clever. Note that the half-size B array is only used for temporary working space, so it could even be allocated afresh at the start of each modifying operation and thrown away afterwards.
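For concreteness, here's a minimal sketch of decoding the active segments from the 1-bits of totalSize (the layout is my guess, not necessarily the paper's: W reserves a slot for every power-of-2 size up to the largest, biggest segment first):

    #include <bit>
    #include <cstddef>
    #include <cstdint>
    #include <utility>
    #include <vector>

    // Decode which power-of-2 segments are active from the 1-bits of totalSize.
    // Assumed layout (my guess): W reserves a slot for every power-of-2 size up
    // to the largest, biggest segment first, whether or not that size is active.
    std::vector<std::pair<std::size_t, std::size_t>>   // (offset into W, length)
    activeSegments(std::uint64_t totalSize) {
        std::vector<std::pair<std::size_t, std::size_t>> segs;
        if (totalSize == 0) return segs;
        std::size_t offset = 0;
        for (int bit = static_cast<int>(std::bit_width(totalSize)) - 1; bit >= 0; --bit) {
            std::size_t len = std::size_t{1} << bit;
            if (totalSize & len) segs.emplace_back(offset, len);  // segment in use
            offset += len;  // the slot is skipped over even when it's inactive
        }
        return segs;
    }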
The claim of O(log n) search time isn't quite true -- it holds only for inputs drawn from a uniformly random distribution (Thm 5, p. 9). Many inputs are like this in practice, but many are not, which will cause degradation to O(log^2 n) time -- though this is still very reasonable, and may still beat or be competitive with other truly O(log n) approaches due to lower constant factors (no pointer chasing).
One idea I had to speed up searching, if you know you're going to be doing a lot of it with no modifications in between: since the data structure is agnostic to the actual values within each active segment, whenever there are several adjacent active segments, that entire contiguous block of segments can be sorted without upsetting any invariants. From then until the next modification, instead of separate binary searches on each segment, a single binary search can be done over the entire block. This halves the worst-case number of binary searches, from log2(n) for the bit pattern 111111... to log2(n)/2 for 101010..., at the cost of these extra sorts -- but note that we don't actually need a full sort, just a (multiway) merge, since the individual segments are already in sorted order. Merging s segments of total length n can be done in O(n log s) time with a min-heap; since s is at most log2(n), that's already asymptotically better than a full O(n log n) sort, and in practice it's faster still because s is usually small and the constants are lower.
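Here's a minimal sketch of that s-way merge with a min-heap (the (offset, length) representation and int elements are my assumptions; in the real structure the temporary output could naturally live in the half-size B array):

    #include <algorithm>
    #include <cstddef>
    #include <functional>
    #include <queue>
    #include <utility>
    #include <vector>

    // s-way merge of adjacent sorted segments with a min-heap, O(n log s).
    // `segs` holds (offset, length) pairs into `w`, assumed to describe one
    // contiguous block in order; the merged run is written back over that block.
    void mergeSegments(std::vector<int>& w,
                       const std::vector<std::pair<std::size_t, std::size_t>>& segs) {
        if (segs.empty()) return;
        using Entry = std::pair<int, std::size_t>;  // (value, segment index)
        std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;

        std::vector<std::size_t> pos(segs.size()), end(segs.size());
        for (std::size_t i = 0; i < segs.size(); ++i) {
            pos[i] = segs[i].first;
            end[i] = segs[i].first + segs[i].second;
            heap.push({w[pos[i]], i});  // seed the heap with each segment's head
        }

        std::vector<int> out;
        out.reserve(end.back() - segs.front().first);
        while (!heap.empty()) {
            auto [val, i] = heap.top();
            heap.pop();
            out.push_back(val);
            if (++pos[i] < end[i]) heap.push({w[pos[i]], i});  // advance segment i
        }
        // Copy the merged run back over the contiguous block it came from.
        std::copy(out.begin(), out.end(), w.begin() + segs.front().first);
    }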
In fact, the multiway merge idea could also be used to speed up insertions that trigger more than 1 merge: instead of a series of s merges, each between a pair of equal-size segments, you could do 1 s-way merge. I would expect this to be faster, since the pairwise cascade rewrites the accumulating run at every step and so does roughly twice as many writes in total as the single merge, with the bulk of them uncacheable once the segments outgrow the cache (I'm not counting writes to the min-heap itself, which will fit in L1 for all practical n) -- but only profiling will tell.
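For a rough picture of when this kicks in: the number of pairwise merges an insertion triggers equals the number of trailing 1-bits of totalSize (the cascade behaves like carry propagation in binary addition). A tiny illustration:

    #include <bit>
    #include <cstdint>
    #include <iostream>

    // An insertion cascades through one pairwise merge per trailing 1-bit of
    // totalSize; all of those could be folded into a single (t+1)-way merge.
    int main() {
        for (std::uint64_t totalSize : {0b0111ULL, 0b1011ULL, 0b111111ULL}) {
            int t = std::countr_one(totalSize);
            std::cout << "totalSize=" << totalSize << ": " << t
                      << " pairwise merges, or one " << (t + 1) << "-way merge\n";
        }
    }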
The best optimization is to know how big the array will be before you add any elements at all. Use Reserve. Then you completely avoid any intermediate arrays and copying. And in the case of C++, you avoid mass copy construction for classes without move constructors.
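In C++ terms, a minimal sketch (Widget and makeWidget are just placeholders):

    #include <cstddef>
    #include <vector>

    struct Widget { int payload; };                  // placeholder element type
    Widget makeWidget(std::size_t i) { return {static_cast<int>(i)}; }

    // Reserving the final size up front means a single allocation: no intermediate
    // arrays, and no growth-driven copy construction for types that can't be moved.
    std::vector<Widget> load(std::size_t n) {
        std::vector<Widget> v;
        v.reserve(n);
        for (std::size_t i = 0; i < n; ++i)
            v.push_back(makeWidget(i));
        return v;
    }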
You accidentally the word arrays
The memory overhead is fairly significant: it uses between 1.5 and 3 times the space of the data stored.
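The 1.5x-3x range falls out if you assume W keeps room for every power-of-2 segment up to the largest one (my reading of the layout, not stated here) and add the half-size scratch array B on top -- a quick check:

    #include <bit>
    #include <cstdint>
    #include <iostream>

    // Rough check of the 1.5x-3x figure, assuming W reserves 2^k - 1 slots
    // (room for every power-of-2 segment up to the largest) and B is half of W.
    int main() {
        for (std::uint64_t n : {15ULL, 16ULL, 1000ULL, 1023ULL, 1024ULL}) {
            std::uint64_t wCap = (std::uint64_t{1} << std::bit_width(n)) - 1;
            double total = 1.5 * static_cast<double>(wCap);  // W plus half-size B
            std::cout << "n=" << n << "  overhead=" << total / static_cast<double>(n) << "x\n";
        }
    }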