Scaling laws assume a fixed error metric and data distribution.
There is a lot of follow-on work that explains what happens as you change them, e.g. Scaling Laws for Transfer: https://arxiv.org/pdf/2102.01293
I think it’s fortunate that transfer works in a similar way.
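To make the first point concrete: a scaling law in this sense is a power-law fit of error against model (or data) size, measured with one metric on one distribution. Here is a minimal sketch with synthetic numbers. The constants, sizes, and the helper name `fit_power_law` are all made up for illustration and do not come from any paper.

```python
import math

# Synthetic example: every constant here is made up for illustration.
A_TRUE, B_TRUE = 5.0, 0.3
sizes = [1e6, 1e7, 1e8, 1e9]                       # hypothetical model sizes N
losses = [A_TRUE * n ** (-B_TRUE) for n in sizes]  # noiseless L(N) = a * N^(-b)

def fit_power_law(xs, ys):
    """Least-squares line in log-log space: log y = log a - b * log x."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    slope = (sum((u - mx) * (v - my) for u, v in zip(lx, ly))
             / sum((u - mx) ** 2 for u in lx))
    return math.exp(my - slope * mx), -slope       # (a, b)

a_fit, b_fit = fit_power_law(sizes, losses)
print(f"a = {a_fit:.3f}, b = {b_fit:.3f}")
```

Swap in a different metric or dataset and the fitted a and b will generally change, which is why a fit like this only describes the setup it was measured on.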
Getting access to Common Crawl (and Reddit, Stack Overflow, etc., but not 4chan) was much easier at the time than using Mechanical Turk.
There is certainly room for more work. There were many papers on scaling laws at NeurIPS this year.