I'd rather see them use AI to convert all the scanned scientific articles into proper (searchable, text-based) PDFs or other formats.
Also sort and classify the articles by file size versus page count, plot count, raster image count, etc., in order to find and compress the outliers, and to detect when a raster image should have been a plot so it can be converted to a vector image.
How compact can we get the collective human scientific corpus?
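
To make the "size versus page count" idea concrete, here is a rough Python sketch of how the outlier step might look. Everything in it is an assumption for illustration: the `Article` record, the `flag_bloated` helper, the sample file names and sizes, and the choice of a median/MAD score with a 3.5 cutoff. It only covers bytes per page; plot and raster counts would need a real PDF parser.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Article:
    # Hypothetical record; a real pipeline would fill this from file metadata.
    name: str
    size_bytes: int
    pages: int


def bytes_per_page(article: Article) -> float:
    """File size divided by page count (guarding against zero pages)."""
    return article.size_bytes / max(article.pages, 1)


def flag_bloated(articles: list[Article], threshold: float = 3.5) -> list[Article]:
    """Return articles whose bytes-per-page is an unusually high outlier.

    Uses a robust z-score built from the median and the median absolute
    deviation (MAD), so a few huge scans do not distort the baseline.
    The 3.5 threshold is a common rule of thumb, not a tuned value.
    """
    if not articles:
        return []
    ratios = [bytes_per_page(a) for a in articles]
    med = median(ratios)
    mad = median([abs(r - med) for r in ratios])
    if mad == 0:
        # At least half the ratios equal the median; no usable scale.
        return []
    flagged = [
        a for a, r in zip(articles, ratios)
        if 0.6745 * (r - med) / mad > threshold
    ]
    # Worst offenders first, so they can be recompressed first.
    return sorted(flagged, key=bytes_per_page, reverse=True)


# Made-up sample data: four ordinary papers and one page-image scan.
sample = [
    Article("a.pdf", 500_000, 10),
    Article("b.pdf", 600_000, 12),
    Article("c.pdf", 480_000, 8),
    Article("d.pdf", 550_000, 10),
    Article("scan.pdf", 40_000_000, 10),
]

for article in flag_bloated(sample):
    print(article.name, round(bytes_per_page(article)))
```

The flagged files would then be the candidates for OCR, recompression, or raster-to-vector conversion.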