Hacker News

tokioyoyo (yesterday at 1:35 AM, 4 replies)

Large-scale scraping tech is not as sophisticated as you'd think. A significant chunk of it is "get as much as possible, categorize and clean up later". Man, I really want the real web of the 2000s back, when things felt "real", more or less... how can we even get there?
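The "get as much as possible, categorize and clean up later" approach can be sketched as a two-phase pipeline. This is a minimal illustration, not any particular crawler's code; `fetch()` is a stand-in for a real HTTP client, and the raw store is just an in-memory dict:

```python
import re

# Phase 1 hoards raw responses keyed by URL; nothing is parsed or validated yet.
raw_store = {}

def fetch(url):
    # Placeholder for an HTTP GET; a real crawler would use an HTTP library
    # and handle retries, robots.txt, rate limits, etc.
    return f"<html><body>content of {url}</body></html>"

def crawl(urls):
    # Phase 1: grab everything as-is and store it untouched.
    for url in urls:
        raw_store[url] = fetch(url)

def clean(html):
    # Phase 2 (run later, offline): strip markup, then categorize/dedupe.
    return re.sub(r"<[^>]+>", "", html).strip()

crawl(["https://example.com/a", "https://example.com/b"])
cleaned = {url: clean(html) for url, html in raw_store.items()}
```

The point of the split is that the expensive, failure-prone part (fetching) is decoupled from the fiddly part (cleaning), so the cleaning logic can be rerun over the hoard without re-crawling.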


Replies

tmnvix (yesterday at 4:24 PM)

A curated web directory, kind of like Yahoo had. The internet according to the Dewey system, with pages somehow rated for quality by actual humans (maybe something to learn from Wikipedia's approach here?).

n1xis10t (yesterday at 1:43 AM)

If people started making search engines again and there were more competition for Google, I think things would be pretty sweet.

thethingundone (yesterday at 1:42 AM)

I would understand that, but it seems they don't store the stuff; instead they re-collect the same content every hour.

idiotsecant (yesterday at 5:52 AM)

Have you ever listened to the "high water mark" monologue from Fear and Loathing? It's pretty much just that. It was a unique time, and it was neat that we got to see it, but it can't possibly happen again.

https://www.youtube.com/watch?v=vUgs2O7Okqc
