archive.today's CAPTCHA page has recently (I noticed this about 3 days ago) started automatically making requests to someone's personal blog. Here's a screenshot of what I'm talking about: https://files.catbox.moe/20jsle.png
The relevant JS is:
setInterval(function() {
  fetch("https://gyrovague.com/?s=" + Math.round(new Date().getTime() % 10000000), {
    referrerPolicy: "no-referrer",
    mode: "no-cors"
  });
}, 300);
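For a sense of scale, here is a minimal sketch that reproduces only the arithmetic of the snippet above, without sending any requests. It is not archive.today's code: the helper name `searchUrlAt` and the sample timestamp are made up for illustration.

```javascript
// Hypothetical helper (not from archive.today): builds the URL the snippet
// would request if it fired at the given Unix time in milliseconds.
function searchUrlAt(timestampMs) {
  // Same expression as the original: the search term is the current
  // timestamp modulo 10,000,000 ms, so it changes from tick to tick.
  return "https://gyrovague.com/?s=" + Math.round(timestampMs % 10000000);
}

// The original passes 300 as the setInterval delay, in milliseconds.
const intervalMs = 300;

// Nominal request rate for one open tab, assuming the timer fires on schedule.
const requestsPerMinute = Math.floor(60000 / intervalMs);

console.log(searchUrlAt(1700000000123)); // https://gyrovague.com/?s=123
console.log(requestsPerMinute);          // 200
```

So each open CAPTCHA tab would nominally send about 200 search requests per minute, and because the search term differs on every tick, those requests are unlikely to be served from a page cache.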
Looking at this blog, there seems to be exactly one article mentioning archive.today: "archive.today: On the trail of the mysterious guerrilla archivist of the Internet" (https://gyrovague.com/2023/08/05/archive-today-on-the-trail-of-the-mysterious-guerrilla-archivist-of-the-internet/), where the person running the blog digs up some information about archive's owner.
So perhaps this is some kind of revenge, a DoS attack attempt, or a deliberate waste of their bandwidth in response to this article? Maybe an attempt to silence them and force them to delete their article? But if it is, then I have so many questions. Like, why would the owner of the archive do that 2.5 years after the article was published? Or why would they even do that in the first place? Do they not know about the Streisand effect?
I'm confused.
This feels like the start of a treasure-hunt-like game, between the username rabinovich (as others have pointed out) and rabinovich's prior submission, 3 months ago, of an archive.today-like tool: https://ghostarchive.org/. Many of the documents that come up when you click into ghostarchive's search query examples are very weird indeed, for example this one: https://ghostarchive.org/search?term=https://docs.google.com
Remember when Archive.is/today used to send Cloudflare DNS users into an endless captcha loop because the creator had some kind of philosophical disagreement with Cloudflare? Not the first time they’ve done something petty like this.
Hmm. If it is an attempt at a DDoS attack, it's probably not very fruitful:
>$ resolvectl query gyrovague.com
gyrovague.com: 192.0.78.25 -- link: eno1
               192.0.78.24 -- link: eno1
Viewing the first IP address on https://bgp.he.net/ip/192.0.78.25 shows
AS2635 (https://bgp.he.net/AS2635) is announcing 192.0.78.0/24. AS2635 is owned by https://automattic.com aka wordpress.com. I assume that for a managed environment at their scale, this is just another Wednesday for them.
Well that is a very silly way to punish the author of an article you don’t want people to know about.
DDoSing but still archiving:
https://archive.is/https://gyrovague.com/2023/08/05/archive-...
OP frames this like they just stumbled across the blog post but they created an account matching the name discussed within it three months ago?
I’m confused.
https://news.ycombinator.com/item?id=45922875
“Behind the complaints: Our investigation into the suspicious pressure on Archive.today”
Given it's set to generate random pages on the site, is there even any possible explanation for this that isn't sketchy?
There's really no interpretation of this which isn't malicious, although, not to defend this behaviour whatsoever, I'm not entirely surprised by it. The only real value of archive.is is its paywall bypassing abilities and, presumably, large swaths of residential proxies that allow it to archive sites that archive.org can't. Only somebody with some degree of lawlessness would operate such a project.
Pretty sure that blog is hosted on Wordpress.com infrastructure so it's not like the blog owner would even notice unless it generates so much traffic that WP itself notices.
That said, I don't think there are many non-malicious explanations for this. I would suggest writing to HN to see about blocking submissions from the domain: [email protected]
I just tried in my browser (Firefox on Ubuntu) and got the same result. Deeply curious.
Worth blocking the URL for users of that Archive site then, avoid extra burden?
Gyrovague here, author of the targeted blog post:
https://gyrovague.com/2023/08/05/archive-today-on-the-trail-...
In the past week or so, I have received a GDPR takedown attempt of the archive.today blog post (which my hosting provider rightly rejected), a politely worded request to take it down (which was sadly eaten by my spam filter), and now this (thanks to the HN reader who tipped me off).
Given that the proverbial cat has been out of the bag for 2.5 years at this point, I'm genuinely puzzled as to what they're hoping to achieve, but this does not seem like a very good way of going about it.
They might need to tweak a single word. Streisand readers won’t have a clue which.
Save the page now and compare a week later.
And that's how advertising works, folks. If someone wants a website dead, I want to know more about it.
https://news.ycombinator.com/item?id=46628734 makes some good points; it shouldn't have been downvoted to death.
What my pattern-matching eyes immediately spotted is that the HN username that posted this is rabinovich. The linked article speaks about Masha Rabinovich. Maybe a coincidence.
> in a 2012 F-Secure forum post, a “masharabinovich” complains about “my website http://archive.is/” being blacklisted. They pop up on Wikipedia as well getting told off for adding too many links to archive.is, including a mention that they’re using the Czech ISP fiber.cz