squigz • yesterday at 10:54 PM

> To archive Metabrainz there is no way but to browse the pages slowly page-by-page. There's no machine-communicable way that suggests an alternative.

Why does there have to be a "machine-communicable way"? If these developers cared about such things, they would spend 20 seconds looking at this page. It's literally one of the first links when you Google "metabrainz":

https://metabrainz.org/datasets

Replies

what • today at 1:51 AM

You expect the developers of a crawler to look at every site they crawl and develop a specialized crawler for them? That’s fine if you’re only crawling a handful of sites, but absolutely insane if you’re crawling the entire web.
