That's a good example! So the sqrt(n) space in the result must refer to space beyond that needed to write down the output--basically a working memory scratchpad. In which case, a string-copying function could use just a constant amount of scratch space.
I mean, for all I know, the result may have been proved in the model of decision problems where the output is always one bit. But I'd guess that it generalizes just fine to the case where you have to write longer outputs.
However, your question does make me wonder if the result still applies in the RAM model, where you can move to any spot on the tape in constant time. My theory knowledge is getting rusty...
Update after youtube: It was already known that a time-t function can be computed in sqrt(t) space on a single-tape Turing machine. It's exactly like @cma said: a Turing machine wastes a lot of time moving the head back and forth. Williams' result shows sqrt(t) space for multi-tape Turing machines. Those are much more time-efficient in regard to head movement, so until Williams' proof it was not believed to be possible to to make any algorithm use only sqrt(t) space.