Hacker News

hungryhobbit yesterday at 5:08 PM (10 replies)

They're a wiki. Wiki spammers are relentless now.

Source: a small wiki I help manage, for an obscure game with <10k players, recently had to disable new signups because the spam was so bad (and it was stuck on an old version of MediaWiki, which didn't have CAPTCHA support).

On a popular wiki, and it sounds like this one was fairly popular, I imagine even CAPTCHAs won't be enough to stop wiki spammers. If those spammers were posting more than just "buy my penis pill" garbage (e.g. they were putting links to malware sites), Google probably, and somewhat legitimately, saw the wiki as a source of such malware.

I imagine the fix for the OP is a thorough audit/cleansing of all malicious content on the wiki, followed by some sort of appeal to Google (which will no doubt take months, if they even respond at all, because ... Google).

Really OP's only hope is that the Google team responsible for this has an Italian Pokemon fan; otherwise they are probably screwed.


Replies

zeitg3ist yesterday at 5:15 PM

We have a very good anti-bot system set up, with a good number of fine-tuned Cloudflare rules, limited permissions for newly created accounts, and a very dedicated team of volunteers who patrol the recent edits constantly. I cannot exclude that somewhere on a rarely visited page (out of 37k+) there is a spam link, but I doubt it’s the reason for the deindexing. I think this would also appear in the Google Search Console.

SXX yesterday at 7:11 PM

If your project is popular enough that tailored automation makes sense, there's really no way to fight spam.

If it's small enough, you can usually avoid all the spam bots by adding any non-standard step to the registration procedure. E.g. a static picture or audio clip of something only your audience knows, with something like a drop-down option, or a picture to click on saying "I'm not a bot". Or add one more email verification for the first post or edits. Or make users watch a long YouTube video at a certain timestamp to find the correct answer, etc. Anything non-standard works.

This breaks 99.9% of automation, and SERP spammers won't bother creating a unique one for your wiki / forum / etc.

If your site is very popular you're obviously fckd and it's just an arms race. This is where you can use Hashcash or something similar that burns lots of CPU / GPU / RAM / etc. a single time, so spammers will just blacklist you.
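For readers unfamiliar with the idea, here is a minimal sketch of a Hashcash-style proof of work in Python. It is a simplification, not the real Hashcash stamp format (which uses SHA-1 and a dated header string): the `challenge:nonce` layout, the use of SHA-256, and all function names are assumptions made for illustration. The point is the asymmetry: the client has to grind through many hashes once, while the server checks the answer with a single hash.

```python
import hashlib
import secrets


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # For a non-zero byte, the unused high bits are the leading zeros.
        bits += 8 - byte.bit_length()
        break
    return bits


def make_challenge() -> str:
    """Server side: issue a random, single-use challenge string."""
    return secrets.token_hex(16)


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one hash is enough to check the submitted nonce."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce (about 2**difficulty hashes on average)."""
    nonce = 0
    while not verify(challenge, nonce, difficulty):
        nonce += 1
    return nonce


if __name__ == "__main__":
    challenge = make_challenge()
    nonce = solve(challenge, difficulty=16)
    print(challenge, nonce, verify(challenge, nonce, 16))
```

Each extra bit of difficulty doubles the expected work for the client, so the cost per signup or edit can be tuned to be negligible for a human and noticeable for someone submitting thousands of requests.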

650REDHAIR yesterday at 8:20 PM

I saw a comment on here a few days ago where the user mentioned that they use a Captcha AI bot in their day-to-day life because a solve costs $0.003. So even if you had the newer, captcha-enabled version, it might not have helped!

dhosek yesterday at 7:28 PM

Captcha does nothing against the spammers. I have found that blocking email domains from signups works pretty well. My list is at https://www.rejectionwiki.com/index.php?title=MediaWiki:Emai... (this is a built-in feature of MediaWiki and should work OK with most versions)
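As a rough illustration of the domain-blocking idea, here is a small Python sketch. It is not MediaWiki's implementation and does not reproduce the list linked above; the blocked domains are placeholder `.example` names and the function name is hypothetical. It treats malformed addresses as blocked and also matches subdomains of a blocked domain, which is a design assumption rather than something the comment specifies.

```python
# Hypothetical blocklist entries, used only for illustration.
BLOCKED_DOMAINS = {"spam.example", "throwaway.example"}


def is_blocked(address: str, blocked=BLOCKED_DOMAINS) -> bool:
    """Return True if the signup email should be rejected."""
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or not domain:
        # No usable "local@domain" shape: reject rather than guess.
        return True
    labels = domain.lower().rstrip(".").split(".")
    # Check the domain itself and every parent domain, so that
    # "mail.spam.example" is caught by a "spam.example" entry.
    return any(".".join(labels[i:]) in blocked for i in range(len(labels)))


if __name__ == "__main__":
    for candidate in ("bob@spam.example", "alice@wiki.example"):
        print(candidate, is_blocked(candidate))
```

Matching on whole labels rather than substrings avoids false positives such as `notspam.example` being caught by a `spam.example` entry.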

anigbrowl yesterday at 6:00 PM

Do you have any basis for saying that this wiki is overrun with spam, or are you just hand-waving? They were explicit in their Twitter thread about not being full of AI slop, and that they checked their list of pages that were marked as 'crawled but not indexed' and found no abuse.

I understand that you were taken aback by spam attacks on the wiki you help manage, but it's not reasonable to generalize from yours to theirs.

andrepd yesterday at 5:53 PM

Weird Gloop (a wiki host that started with RuneScape but now has dozens) has blogged about this: https://weirdgloop.org/blog/clankers

danaris yesterday at 7:02 PM

How old a version? I've been running a much more obscure game (<150 players, down from ~1k in 2010) for some time, and it was using QuestyCaptcha back in...2008 or so, I think? Certainly at least 15 years ago. It's almost always been sufficient: just put in a couple of questions based on knowledge of the game itself.

teaearlgraycold yesterday at 5:29 PM

An organization I'm involved with has had to add Anubis (https://github.com/TecharoHQ/anubis) because of the recent wiki attacks from LLM scrapers. It's finally fixed our outages.

righthand yesterday at 5:10 PM

Social sites should all have a tree-based invite system. This would allow wiping out spammers and their enablers in a single hit. It would allow vetting of good actors too.
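To make the proposal concrete, here is a minimal Python sketch of an invite tree. The `InviteTree` class and its method names are hypothetical, and the policy shown (banning a user together with everyone they invited, directly or indirectly) is only one possible reading of "in a single hit". The inviter chain is kept so that moderators can also see who vouched for a bad account.

```python
from collections import defaultdict


class InviteTree:
    """Tracks who invited whom, so whole branches can be reviewed or banned."""

    def __init__(self, root: str):
        self.inviter = {root: None}        # user -> who invited them
        self.invitees = defaultdict(list)  # user -> users they invited
        self.banned = set()

    def invite(self, inviter: str, new_user: str) -> None:
        if inviter not in self.inviter or inviter in self.banned:
            raise ValueError(f"{inviter!r} may not invite")
        if new_user in self.inviter:
            raise ValueError(f"{new_user!r} already exists")
        self.inviter[new_user] = inviter
        self.invitees[inviter].append(new_user)

    def chain(self, user: str) -> list:
        """Inviters of `user`, from the direct inviter up to the root."""
        result = []
        current = self.inviter[user]
        while current is not None:
            result.append(current)
            current = self.inviter[current]
        return result

    def ban_subtree(self, user: str) -> list:
        """Ban `user` and every account descended from their invites."""
        removed = []
        stack = [user]
        while stack:
            current = stack.pop()
            if current in self.banned:
                continue
            self.banned.add(current)
            removed.append(current)
            stack.extend(self.invitees.get(current, []))
        return removed


if __name__ == "__main__":
    tree = InviteTree("admin")
    tree.invite("admin", "alice")
    tree.invite("alice", "mallory")
    tree.invite("mallory", "bot1")
    print(tree.ban_subtree("mallory"), tree.chain("bot1"))
```

Whether the inviter of a banned account should also be penalized is a policy question the sketch leaves open; `chain` only surfaces the information.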

bogota yesterday at 5:14 PM

[dead]