Oh, that is clever!
I'd also suspect that some networks and links are stronger signs of low-value content than others. Off the top of my head, links to crypto, MLM, and known scam/fraud sites, and perhaps share links to certain social networks, might be negative indicators.
You can actually identify clusters of websites based on the cosine similarity of their outbound links. Pretty useful for identifying content farms spanning multiple websites.
I have a lil' data explorer for this: https://explore2.marginalia.nu/
Quite a lot of dead links in the dataset, but it's still useful.
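For anyone curious what the outbound-link clustering could look like, here's a rough sketch. It is not how the explorer above is implemented, just one way to do it under some assumptions of mine: each site is treated as a binary vector over the domains it links to, pairs above an arbitrary similarity threshold get merged with a small union-find, and the `.example` domains and function names are made up for illustration.

```python
from itertools import combinations
from math import sqrt


def cosine_similarity(a: set[str], b: set[str]) -> float:
    """Cosine similarity of two link sets viewed as binary vectors."""
    if not a or not b:
        return 0.0
    # For 0/1 vectors the dot product is the overlap size,
    # and each norm is the square root of the set size.
    return len(a & b) / sqrt(len(a) * len(b))


def cluster_sites(outlinks: dict[str, set[str]], threshold: float = 0.5) -> list[set[str]]:
    """Group sites whose outbound links look alike (threshold is an assumption)."""
    parent = {site: site for site in outlinks}

    def find(site: str) -> str:
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[site] != site:
            parent[site] = parent[parent[site]]
            site = parent[site]
        return site

    # Pairwise comparison is O(n^2); fine for a toy example only.
    for a, b in combinations(outlinks, 2):
        if cosine_similarity(outlinks[a], outlinks[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters: dict[str, set[str]] = {}
    for site in outlinks:
        clusters.setdefault(find(site), set()).add(site)
    # Singletons are not interesting as content-farm candidates.
    return [members for members in clusters.values() if len(members) > 1]


# Hypothetical data: two sites pushing the same links, one unrelated blog.
outlinks = {
    "farm-a.example": {"casino.example", "pills.example", "loans.example"},
    "farm-b.example": {"casino.example", "pills.example", "loans.example", "crypto.example"},
    "blog.example": {"wikipedia.org", "github.com"},
}

print(cluster_sites(outlinks))
```

On real crawl data you'd want sparse vectors and some kind of candidate pruning instead of comparing every pair, and probably some down-weighting of links that nearly every site has.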