Sadly, the paper is behind a paywall. But I question the choice of a 4-site water model, and why that specific one (TIP4P) rather than others that have been shown to be more accurate, such as OPC. Also, there seems to be previous experimental work (https://arxiv.org/abs/1304.2877) showing some evidence, which apparently is not even referenced in this new paper. I wonder how that compares, if at all.
The preprint you linked has also been published in Nature Communications: https://www.nature.com/articles/ncomms3401