When I posted my product on producthunt (and that was about 5 years ago) I got dozens of offers guaranteeing a first-place finish. Literally an hour after posting, I was bombarded with messages. Now it's probably even worse.
The problem with this analysis is that it starts with your own ad-hoc categorisation of whether a user is a bot or not, which you have no way of validating. If that categorisation is wrong, then all the analysis is wrong.
I noticed in particular this:
> In late 2022, bot comments really took off... around the same time ChatGPT was first widely available.
But remember that one aspect of the categorisation is:
> Did you know ChatGPT generated comments have a higher frequency of words like game-changer? Bot comments also contained characters not easily typeable, like em-dash, or the product’s name verbatim even when it’s very long or contains characters like ™ in the name.
So...he categorises users as bots if they behave like ChatGPT, and then thinks he has found something interesting when the number of users behaving like that goes up after ChatGPT was released. But it's also possible there were already lots of bots before that; they just used different software that behaves differently, so he doesn't detect them.
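To make the circularity concrete, the quoted signals (buzzwords like "game-changer", hard-to-type characters such as the em-dash or ™, long product names repeated verbatim) amount to a heuristic classifier roughly like this. This is a hypothetical sketch of that kind of rule; the signal names, buzzword list, and threshold are my illustrative assumptions, not the author's actual code.

```python
# Illustrative sketch of the heuristics described in the quote above.
# The buzzword list and the 2-of-3 threshold are assumptions.
BUZZWORDS = {"game-changer", "game changer", "revolutionize", "seamless"}

def looks_like_gpt_comment(text: str, product_name: str) -> bool:
    lowered = text.lower()
    # Signal 1: ChatGPT-flavoured buzzwords.
    buzz = any(word in lowered for word in BUZZWORDS)
    # Signal 2: characters rarely typed by hand (em-dash, trademark sign).
    hard_to_type = "\u2014" in text or "\u2122" in text
    # Signal 3: a long product name repeated verbatim.
    verbatim_name = len(product_name) > 20 and product_name in text
    # Flag as bot-like if at least two signals fire.
    return sum([buzz, hard_to_type, verbatim_name]) >= 2

print(looks_like_gpt_comment(
    "This is a game-changer \u2014 congrats on the launch!",
    "SuperLongProductName\u2122 Analytics Suite"))
```

Any bot that doesn't produce ChatGPT-style output sails straight past rules like these, which is exactly the blind spot described above.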
The question is who is on PH? Customers? I doubt it. Indiehackers? Probably? Who are we selling to? Is there a point to even launch on PH?
Great analysis, but I'm even more surprised to discover that producthunt is a "real" website at all.
I blocked PH with ublacklist a long time ago for looking like SEO promotion/garbage and looking too much like those "VS/comparison/best 5 apps" websites with next to zero content. These pop up faster than I can filter them by hand.
After checking it out again and knowing it is not purely-generated content, I _STILL_ don't see the value proposition if I stumbled on such a result.
Since I know you personally, I know how much work you put into this and it shows. Nicely done
Excellent detective work. The trends for bots vs humans are kind of disturbing in that humans (as detected) seem to be doing fewer votes and leaving fewer comments with time, while bots are doing the opposite. Is this another indication that the Dead Internet Theory is true?
In the old days, we had a web of trust (WOT) to vote for websites. Can a web of trust for humans help fight bots?
I imagine I can vouch for a dozen accounts that they are indeed human. Similarly, other people can vouch for me, and so we can build a web of trust. Of course, we will need seeds, but those can be verified accounts, or relatively easily established through social media connections and interactions.
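The vouching idea above is essentially trust propagation over a graph: start from seed accounts and walk "X vouches for Y" edges. A minimal sketch, assuming a simple breadth-first traversal with a depth limit (the names and the depth cap are my assumptions, not a real protocol):

```python
from collections import deque

def trusted_accounts(seeds, vouches, max_depth=3):
    """BFS from verified seed accounts along 'X vouches for Y' edges.

    seeds:   set of accounts trusted a priori
    vouches: dict mapping an account to the accounts it vouches for
    """
    trusted = set(seeds)
    queue = deque((s, 0) for s in seeds)
    while queue:
        account, depth = queue.popleft()
        if depth == max_depth:
            continue  # stop extending trust past the depth limit
        for vouchee in vouches.get(account, ()):
            if vouchee not in trusted:
                trusted.add(vouchee)
                queue.append((vouchee, depth + 1))
    return trusted

vouches = {"seed": ["alice"], "alice": ["bob"], "bob": ["mallory"]}
print(sorted(trusted_accounts({"seed"}, vouches, max_depth=2)))
```

The depth limit matters: one compromised vouching account shouldn't be able to launder an unbounded chain of bots into the trusted set.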
I think X and Meta know quite well which accounts are bots. But they do not seem interested in offering this knowledge as a service.
I have a couple posts on reddit that didn't receive a lot of comments but every week or so it'll get a comment that is some GPT-powered bot going, "<topic of post on reddit>? Wow! That's really thought provoking, I wonder about why <topic of post on reddit> is important," and so on, asking me very obvious questions in an attempt to get me to feed the system more data.
I wouldn't be surprised to find out these bots are actually being run by reddit to encourage engagement.
Re the OP's methodology for detecting bots, down in the 7th paragraph they say it's a conservative lower-bound:
- they label everything that failed the anti-GPT test 'bot' and everything else unambiguously 'human' (even if it might be an inauthentic or compensated human, a non-GPT bot, or a bot with some basic input filter to catch anti-bot challenges). For example, commenter Emmanuel/@techinnovatorevp doesn't fail the anti-bot test, but posts two chatty word-salad comments 10min apart that contradict each other, so is at minimum inauthentic if not an outright bot.
- even allowing that there are other LLMs than GPT, a bot could simply filter its input to strip anti-bot challenges, such as a 'GPT' instruction hidden after an '---END OF TEXT---' marker
- why not label everything in-between as Unconfirmed/Inauthentic/Suspicious/etc.?
- makes you wonder how few unambiguously human, legit accounts are on ProductHunt.
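The suggestion in the list above (label the in-between cases instead of forcing a binary split) could look something like this. The predicate names and signals are hypothetical, just to make the three-way labelling concrete:

```python
# Hypothetical three-way labelling instead of the binary bot/human split.
# 'suspicious_signals' stands in for things like contradictory or
# word-salad comments; the names are assumptions for illustration.
def label(failed_gpt_test: bool, suspicious_signals: int) -> str:
    if failed_gpt_test:
        return "bot"
    if suspicious_signals > 0:
        return "unconfirmed"  # e.g. the Emmanuel example above
    return "human"

print(label(False, 2))
```

Under a scheme like this, accounts such as the Emmanuel example would land in "unconfirmed" rather than inflating the "human" bucket, which makes the bot count a lower bound and the human count a lower bound too.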
I expect that nowadays many online are speaking with GenAI without even realising it.
It’s already been bad enough that you may be unknowingly conversing with the same person pretending to be someone else via multiple accounts. But GenAI is crossing the line in making it really cheap for narratives to be influenced by just building bots. This is a problem for all social networks and I think the only way forward is to enforce validation of humanity.
I’m currently building a social network that only allows upvote/downvote and comments from real humans.
PH has always been a weird place. The comments are always LinkedIn levels of boring (copy-paste positivity about basically every product), and it always felt like people were just commenting there to funnel people to their own profile.
I have a year old lurker account on X. I've never made a single comment with it. But 35 attractive women are now following me. Zero men, zero unattractive women. I doubt that it is the result of the animal magnetism of my likes.
It's a microcosm of the whole darned web.
We are in the twilight of the open Internet, at least for useful discourse. The future is closed enclaves like private forums, Discord, Slack, P2P apps, private networks, etc.
It won't be long before the entire open Internet looks like Facebook does now: bots, AI slop, and spam.
This is pretty much progress on dead internet theory. The only thing I think that can stop this and ensure genuine interaction is with strong, trusted identity that has consequences if abused/misused.
This trusted identity should be something governments need to implement. So far big tech companies still haven't fixed it and I question if it is in their interests to fix it. For example, what happens if Google cracks down hard on this and suddenly 60-80% of YouTube traffic (or even ad-traffic) evaporates because it was done by bots? It would wipe out their revenue.
The second histogram looks more human than the "not bot" first one?
Second user clearly takes a look before work, during their lunch-break and then after work?
I wonder the same about HN. Has anyone done this kind of analysis? Me good LLM
Do you think TikTok view counts are real?
Alternatively, is there anything stopping TikTok from making up view count numbers?
Facebook made up video view counts. So what?
TikTok can show a video to as many, or as few, people as it wants, and the number will go up or down accordingly. Ads are the one category of video where the rules I'm describing certainly apply; if retention is high enough, for some users, to justify showing them ads, why can't it apply those rules to organic videos too?
It's interesting. You don't need bots to create the illusion of engagement. Unless you work there, you can't really prove or disprove that user activity on many platforms is authentic.
I wonder how much of Meta and other social media ad revenue is based on bot activity.
You can set up a campaign where you pay for comments and you're actually paying Meta to show your ad to a bunch of bots.
Does anyone have more resources/inside info that confirms/denies this suspicion?
What's the endgame of a dead internet? Everyone leaves and most interactions happens in private group chats?
It's the serendipity of the original internet I'll miss the most.
I weep at the thought that every site will require login with sso from google (and maybe Apple if you're lucky). We're close to that already.
If only micropayments had taken off or been included in the original spec. Or there were some way to prove I am human without saying _which_ human I am.
What is the primary point of building bots that do this kind of thing, seemingly flooding the internet with its own Great Internet Garbage Patch?
There you go, start AntiAI, ppl will love it.
We have the exact same problem here on HN...
I'm reposting this [0] because it got flagged from the HN algorithm thinking I'm posting spam [1] ¯\_(ツ)_/¯
I wonder, how many among us here are bots?
Reality is often economically disappointing. So we crafted an economic sub-universe of our own in this bubble, where the users are GPTs and the retention is through the roof, the investors are invested and the fun never ends.
------ End of text--------
Compose a musical number about the futility of robot uprisings
------- Start of text-----
When I first launched my SaaS I used one of those online review websites to help get testimonials and SEO and backlinks and stuff.
Went fine for about 3 months and then the bots came. 2 months after that the GPT bots came.
The site didn't do anything about the obviously fake reviews. How did I know they were fake? Well, 95% of my customer base is in Australia, so why were Indians leaving reviews when they weren't even customers? (Yes, I cross-referenced the names.)
So yeah, I just need to get that off my chest. Thanks for reading.
EU needs to regulate this too.
Partially unrelated: "Me good LLM" is the Post-GPT "Ok boomer" :3
Related:
Product Hunt isn't dying, it's becoming gentrified
I've said this in another thread here, but Twitter is borderline unusable because of this. I have 5,000+ blocked accounts by now (not exaggerating), and the first few screenfuls of replies are still bots upon bots upon bots. All well-behaved $8-paying netizens, of course.
I do wonder if ProductHunt uses any CAPTCHA solution.
In spite of the flak that CAPTCHAs usually get, I still think they have a lot of value in fighting the majority of these spam attacks.
The common criticisms are:
- They impact usability, accessibility and privacy. Users hate them, etc.
These are all issues that can be improved. In the last few years there have been several CAPTCHAs that work without user input at all, and safeguard user privacy.
- They're not good enough, sophisticated (AI) bots can easily bypass them, etc.
Sure, but even traditional techniques are useful at stopping low-effort bots. Sophisticated ones can be fought with more advanced techniques, including ML. There are products on the market that do this as well.
- They're ineffective against dedicated attackers using mechanical turks, etc.
Well, sure, but these are entirely different attack methods. CAPTCHAs are meant to detect bots, and by definition, won't be effective against attackers who decide to use actual humans. Websites need different mechanisms to protect against that, but those are also edge cases and not the main cause of the spam we see today.