Hacker News

I'm losing the SEO battle for my own open source project

298 points by devinitely today at 1:39 PM, 147 comments

Comments

markus_zhang today at 2:07 PM

My advice to all OSS developers: if you open source your project, expect it to be abused in all possible ways. Don't open source if you have anxiety over it. It is how the world works, whether we like it or not.

I appreciate that you open source your projects for us to study. But TBH, please help yourself first.

Growtika today at 2:34 PM

A couple years back John Reilly posted on HN "How I ruined my SEO" and I helped him fix it for free. He wrote about the whole thing here: https://johnnyreilly.com/how-we-fixed-my-seo

Happy to do the same for you if you want.

The quickest win in your case: map all the backlinks the .net site got (happy to pull this for you), then email every publication that linked to it. "Hey, you covered NanoClaw but linked to a fake site, here's the real one." You'd be surprised how many will actually swap the link. That alone could flip things.

Beyond that there's some technical SEO stuff on nanoclaw.dev that would help - structured data, schema, signals for search engines and LLMs. Happy to walk you through it.

update: ok this is getting more traction than I expected so let me give some practical stuff.

1. Google Search Console - did you add and verify nanoclaw.dev there? If not, do it now and submit your sitemap. Basic but critical.

2. I checked the fake site and it actually doesn't have that many backlinks, so the situation is more winnable than it looks.

3. Your GitHub repo has tons of high quality backlinks which is great. Outreach to those places, tell the story. I'm sure a few will add a link to your actual site. That alone makes you way more resilient to fakers going forward. This is only happening because everything is so new. Here's a list with all the backlinks pointing to your repo:

https://docs.google.com/spreadsheets/d/1bBrYsppQuVrktL1lPfNm...

4. Open social profiles for the project - Twitter/X, LinkedIn page if you want. This helps search engines build a knowledge graph around NanoClaw. Then add Organization and sameAs schema markup to nanoclaw.dev connecting all the dots (your site, the GitHub repo, the social profiles). This is how you tell Google "these all belong to the same entity."

5. One more thing - you had a chance to link to nanoclaw.dev from this HN thread but you linked to your tweet instead. Totally get it, but a strong link from a front page HN post with all this traffic and engagement would do real work for your site's authority. If it's not breaking any rule (specific use case here so maybe check with the mods haha) drop a comment here with a link to nanoclaw.dev. I don't think anyone here would mind if it gets you a few steps closer to beating that fake site.
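
For anyone wondering what the Organization and sameAs markup in point 4 could look like, here is a minimal sketch that generates it with Python. The GitHub and X URLs are placeholders (assumptions, not the project's real profiles); only nanoclaw.dev comes from this thread. The helper names `organization_jsonld` and `script_tag` are made up for illustration.

```python
import json


def organization_jsonld(name, url, same_as):
    """Build a schema.org Organization object as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        # sameAs lists other pages that represent the same entity.
        "sameAs": list(same_as),
    }
    return json.dumps(data, indent=2)


def script_tag(jsonld):
    """Wrap the JSON-LD in the script tag that goes in the page's <head>."""
    return '<script type="application/ld+json">\n' + jsonld + "\n</script>"


if __name__ == "__main__":
    markup = organization_jsonld(
        name="NanoClaw",
        url="https://nanoclaw.dev/",
        same_as=[
            "https://github.com/example-org/example-repo",  # placeholder
            "https://x.com/example_handle",  # placeholder
        ],
    )
    print(script_tag(markup))
```

The printed tag would be pasted into the site's `<head>`; whether search engines act on it is up to them.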

uyzstvqs today at 4:06 PM

I did some experimenting using different search engines and AIs. Here are the results:

Google and Brave linked to the official GitHub repo followed by the fake domain. DuckDuckGo and Bing linked to the fake domain first, followed by the official GitHub. Mojeek gave higher ranking to two third party articles, but linked to both the official GitHub and website without fakes. Qwant was the worst, as the official website was the second result amongst multiple fake websites and an unrelated GitHub repo.

Then there are the AIs. ChatGPT, Google AI mode, Gemini, Grok, Perplexity, and Brave Search "Ask" all linked to the official website, and some added the GitHub repo as well. DuckDuckGo Search Assist linked to just the official GitHub. Google AI mode, Gemini and Grok also explicitly warned about the fake websites. Copilot got the official website and GitHub right, but linked to a presumably fake X account as well.

Conclusion: Google, Brave and Mojeek win in search. AI is very good and clearly beats search overall. Google AI mode, Gemini and Grok stand out in quality.

w10-1 today at 6:29 PM

So frustrating. Thanks to the commenters who suggest how to win the SEO battle (saving tips...). But is there any viable path to not paying this kind of tax?

My possibly naive answer - validated identities, authority over related claims - seems to be blocked by the perceived unreliability of registrars, public or private, or by the problem of having your identity be public (exposing yourself to blackmail).

This is not a one-off or even a limited contagion; this psychosis may be a cancer.

Psychosis is when unreality replaces reality.

It's possible whenever there is a representation of reality.

It happens when there is motivated thinking.

It persists when it resists any correction by reality.

It replicates when people depend on it.

Hence: Google, political and commercial media, social media -- all of which make it easy to misrepresent, hard to correct, and impossible to stop.

Are there really no viable alternatives? Who is even motivated to solve it?

AznHisoka today at 1:55 PM

I’m looking at this from a third-party point of view (definitely not claiming the .net “deserves” to rank higher)

1) the .net version has a couple of very high authority links, namely from theregister and thenewstack (both of which have had lots of engagement).

I highly doubt it would have ranked without those links.

2) it's only been a week. Give Google time to understand which pages should rank higher.

3) Google is biased towards sites that cover a topic earlier than others.

I’ve seen pages that are still top 3 for a particular competitive query years later, simply because they were one of the first to write about it.

Suggestions: give it time. Meanwhile I would recommend linking to your website rather than your github everywhere you mention it, to give it a boost

allthetime today at 6:18 PM

Piggybacking on the Claw hype, surprised when someone piggybacks on you...

ariehkovler today at 2:01 PM

It's worse than that. There's a SECOND imitator that I actually stumbled on today while looking something up about nanoclaw - nanoclawS [dot] io - and that one's harvesting email addresses.

The obvious risk here is a bait and switch, where one of these sites switches their link to the Github repo to point to a malicious imitator repo instead.
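
A rough sketch of how that bait and switch could be watched for: scan a page's HTML for GitHub links and flag any repo that is not the official one. The regex-based scan and the function names are illustrative assumptions, and the official repo slug is passed in by the caller rather than taken from this thread.

```python
import re
from urllib.parse import urlparse

# Matches href values that point at github.com (with or without www).
GITHUB_LINK = re.compile(
    r'href=["\'](https?://(?:www\.)?github\.com/[^"\']+)["\']',
    re.IGNORECASE,
)


def repo_slug(url):
    """Return 'owner/repo' in lowercase for a github.com URL, or None."""
    parsed = urlparse(url)
    if parsed.netloc.lower() not in ("github.com", "www.github.com"):
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        return None
    repo = parts[1]
    if repo.endswith(".git"):
        repo = repo[:-4]
    return (parts[0] + "/" + repo).lower()


def unexpected_repos(html, official_slug):
    """List repos linked from the page that differ from the official one."""
    found = set()
    for url in GITHUB_LINK.findall(html):
        slug = repo_slug(url)
        if slug and slug != official_slug.lower():
            found.add(slug)
    return sorted(found)
```

Run against a saved copy of a suspicious page on a schedule, a non-empty result would be the signal that the link has been switched.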

One approach would be to go after the sites themselves, not their Google ranking. See if their hosts are willing to take them down. Is there anything you can assert copyright over to hang a DMCA request on? That's hard for an Open Source project, I guess. And the fake sites aren't (yet) doing any actual scamming.

Good luck, though!

bob1029 today at 2:12 PM

Losing the SEO battle is a lot like losing money on the stock market. The system you are fighting is incredibly efficient and will never in a trillion years give a single shit about your specific concerns. You can hire lawyers and spend time complaining about it all day on social media. But you'll rarely get a drop of blood out of this stone. The best you can do is to step back, reevaluate your understanding of the market, and adjust your strategy.

tracker1 today at 4:59 PM

Do what Louis Rossmann did... just ask Google's AI what you need to change on your site... Apparently that's the secret now.

dirk94018 today at 2:05 PM

We had a similar experience — looks like someone used AI to clone our site's design and structure at linuxtoaster.com. The real issue Gavriel is highlighting goes beyond SEO. The cost of creating a convincing copycat site just went to zero. Anyone can feed a successful page to an LLM and get a polished clone in minutes. And for open source projects it's even worse — they can clone your website AND clone your code, have an AI rebrand it, and ship a convincing-looking alternative overnight.

signorovitch today at 2:50 PM

> This isn't an SEO problem. This is a Google problem.

I've tested on a few of the big search engines, and nanoclaw.dev is never on the first page.

Gemini was also unable to find the .dev, even in "Research Mode." The only way I was able to get a direct link to nanoclaw.dev was with chatgpt, which found it by scraping the GitHub (it also spat out links to a couple of other copies it found from google.)

Seems this is a wider SEO issue, one which infiltrates even the technology supposed to replace it.

MarkSweep today at 3:07 PM

The link on GitHub to the real site is marked with rel="nofollow". I wonder if it would make sense for GitHub to remove nofollow in some circumstances. Perhaps based on some sort of reputation system or if the site links back to the repo with a <link rel="self" href="..." /> in the header? Presumably that would help the real site rank higher when the repo ranks highly.
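
To check which links on a page carry nofollow, something like this sketch would do. It uses Python's standard `html.parser`; the class and function names are made up for illustration, and it only inspects `<a>` tags in HTML you have already fetched.

```python
from html.parser import HTMLParser


class LinkRelParser(HTMLParser):
    """Collect (href, is_nofollow) for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if not href:
            return
        # rel is a space-separated token list, e.g. "nofollow noopener".
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


def find_links(html):
    """Return a list of (href, is_nofollow) tuples in document order."""
    parser = LinkRelParser()
    parser.feed(html)
    return parser.links
```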

Sweepi today at 3:09 PM

> When you Google "NanoClaw," a fake website ranks #2 globally, right below the project's GitHub.

Unfortunately, the fake website [.net] is also #3 on Kagi, and #1 on Duckduckgo. On Kagi, the Github is #1 and nanoclaw.dev is #4, but only if you count "Interesting Finds". On Duckduckgo, the Github is #2 and nanoclaw.dev is nowhere to be found.

youknownothing today at 3:49 PM

> I've done everything you're supposed to do and more.

By the sound of it, everything except reporting it? Winning SEO just means appearing before them in search results, but the fake page shouldn't just lose the race, it should be taken down.

ICANN specifies how to deal with this kind of issue: https://www.icann.org/en/system/files/files/submitting-dns-a...

jccooper today at 5:21 PM

I don't see that Google cares much about backlinks any more. Seems like it's all about "content" keywords and maybe a little time-on-site. The domain is a huge signal, which is probably where the problem comes from here.

Sadly, Google's generally better against all the new AI-generated content farms than other players, so maybe they're still running PageRank somewhere.

throwaway85825 today at 2:34 PM

People forget that Google is a malware services company. A significant part of their revenue is fake OBS malware and the like.

samuelknight today at 2:10 PM

Copycats are not a new problem. You can be completely open source and have a trademark on the project name.

theanonymousone today at 5:20 PM

I saw this some time ago with Bing and OpenCode:

"If I search for "opencode GitHub" in Bing, a random fork is returned"

https://news.ycombinator.com/item?id=46573286

azangru today at 1:57 PM

> So I built a real website. That was two weeks ago.

Is Google supposed to have drastic updates to its index over 2 weeks?

networkcat today at 2:43 PM

Before installing new software, I usually visit its GitHub page or Wikipedia entry first and click through to the official site from there. I just don't trust the 'official' sites that pop up in Google search results. How many of you do the same?

iamacyborg today at 5:48 PM

Google is absolutely idiotic sometimes.

We (as in the team that helped fork and migrate the PoE1 wiki) set up a new domain for the Path of Exile 2 wiki, which is being hosted by the folks at Grinding Gear Games and linked on the official website and in multiple places on the highly trafficked subreddit.

Despite this, Google has decided that the site is not relevant and shouldn't appear anywhere in search results, despite the wiki for the first game appearing everywhere.

lucasluitjes today at 2:19 PM

I've been annoyed with Google search quality lately and was wondering how the others fared on this specific issue. Turns out, mostly not much better.

Bing, DuckDuckGo, Qwant, Ecosia, Brave all had the github repo and nanoclaw.net (the fake homepage) in the first or second place. Marginalia had fascinating results about biology but only tangentially related Nanoclaw results, not the github repo or either the fake or real homepage.

Mojeek was the exception, sort of. It had some random news sites up top, but the github repo in 2nd place and nanoclaw.dev (the real homepage) in the 4th place. The fake nanoclaw.net did not show.

Kagi is the only one I couldn't try because apparently I used up my free credits a year back. Can anyone see how they compare?

WD-42 today at 3:22 PM

Is there an acronym for “AI generated, didn’t read”?

vegasbrianc today at 3:31 PM

SEO is broken at the moment. With Google Overviews just killing organic SEO, it is becoming less and less relevant, unfortunately.

rocketvole today at 5:47 PM

I think OrcaSlicer suffers from the same issue. Not really sure why some OSS projects struggle with this and others don't (Notepad++).

bubblewand today at 2:01 PM

Yeah, Google stopped even trying to usefully index most of the web around ‘08 or ‘09 or so. Was super obvious when it happened and it’s been that way ever since. Your GitHub is up there because it’s a blessed website, your personal site isn’t and will struggle mightily to rank even when you search exact, unusual phrases on it, if it’s like most of the rest of the Web on Google these days.

Get more traffic (make sure google analytics sees it, IDK but that probably matters because monopoly) and it might help.

Most of the other indices aren’t much better. Turns out fighting spam is expensive, easier to just do a combo of boosting really big sites and blessed spammers that use your ad network.

tmaly today at 4:42 PM

Wasn't one of the original ideas behind NFTs to essentially identify the original creator?

elevation today at 2:32 PM

This project was launched very quickly, and may not have had a large budget for extra domains.

But entities with a bit more time can prevent this scenario by acquiring the .com/.net variant domains before launching.

ryandrake today at 3:57 PM

> I don't want to be playing this game. I want to be writing code, building community, pushing features, fixing bugs.

Then just write code, build features, and fix bugs. Nobody is forcing you to fix search engines' problems. If you're not making money off of traffic, then why worry so much about SEO? Just do your thing. If it really bothers you, put a little note on your GitHub warning people about the fake site, and get on with your life.

alexpham14 today at 2:55 PM

Oof, this is exactly the nightmare scenario for “repo-first” OSS.

The weird bit isn’t that a scraper site exists, it’s that Google can’t do the obvious graph join: query == project name, #1 result is the repo, repo declares Homepage = X, yet Google still boosts an imposter domain. That’s not “SEO”, that’s the ranking system refusing to treat maintainer-declared canonical as a strong signal. Early domain squatters get to “set the default” purely by being first, then they can flip the content later once trust is baked in.
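
The "graph join" could be sketched as a toy re-ranking step like the one below. This is only an illustration of the idea, not how any real search engine ranks; the function names, the plain list of result URLs, and the repo URL in the usage note are assumptions.

```python
from urllib.parse import urlparse


def host_of(url):
    """Lowercased hostname without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def rerank(results, repo_url, declared_homepage):
    """If the top result is the repo, promote its declared homepage.

    results is an ordered list of result URLs. When the first result is
    the repo itself, every result on the declared homepage's host moves
    to directly below it; everything else keeps its relative order.
    """
    if not results or results[0].rstrip("/") != repo_url.rstrip("/"):
        return list(results)
    home_host = host_of(declared_homepage)
    official = [u for u in results[1:] if host_of(u) == home_host]
    others = [u for u in results[1:] if host_of(u) != home_host]
    if not official:
        # The homepage was missing from the results entirely: add it.
        official = [declared_homepage]
    return [results[0]] + official + others
```

For example, `rerank(["https://github.com/o/r", "https://nanoclaw.net/", "https://nanoclaw.dev/"], "https://github.com/o/r", "https://nanoclaw.dev")` puts the .dev result directly under the repo and pushes the .net result below it.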

People keep saying “tell users to bookmark the real URL” like that scales. Most people will click the second link and assume it’s official. If Google can’t solve this class of problem, their “AI answers” are going to be a bigger mess than blue links ever were.

ZoomZoomZoom today at 2:40 PM

This is a Google problem, but only secondarily.

The crux of the matter is that there's nothing that protects an open project besides reputation, and nowadays in the digital space it can be cheaply farmed.

Laws could help, but they only work when you take purposeful action to be covered by them, like registering a trademark, and it's never cheap.

Imagine you're in a local band playing shows. It's 3 months old and you have no released records. A second band, tighter with venues, takes your name and starts performing under your moniker. You have no money to take that to court, and good luck making a case. You can't do anything besides screaming on the web or, I don't know, kicking a few butts. You change your name.

senko today at 1:53 PM

> This isn't an SEO problem. This is a Google problem.

Sorry, but this is an SEO problem. The fake site has probably been linked to by a number of high-SEO outlets. What you should do is contact them and tell them to fix the links (to point to your site), which they should be happy to do.

boredhedgehog today at 2:39 PM

> The person running nanoclaw[.]net can put anything they want on that page tomorrow. A crypto scam. A phishing page. Malicious download links. They could fork the GitHub repo, inject malicious code, and link to it from the site that Google is telling thousands of people is legitimate.

A lot of handwringing about hypotheticals. The page is up there because it links the official repo. Changing that will quickly tank its search rank.

bakugo today at 2:30 PM

> I don't want to be playing this game. I want to be writing code

I assume the "I" here refers to Claude, who seemingly wrote the entire project AND the linked post.

Drupon today at 5:35 PM

Sorry Gavriel Cohen, but this Google search placement was promised to the other person thousands of years ago.

renegat0x0 today at 2:37 PM

- I think I was upset when Google allowed a fake ad for VLC to appear high in the ranking

- I hate that Google returns content farms instead of product web pages

- I hate that Google provides a page of 10 useful links, later links are just pure garbage. I think that something in Google engine is profoundly broken

- I maintain my own search index, but it requires a lot of effort, and attention. I do insert links if I find them worthy. I think more people should have their personal search indexes. Mine is below. I am quite happy that problems like these do not affect me that much

https://github.com/rumca-js/Internet-Places-Database

barelysapient today at 1:53 PM

The more things change the more they stay the same.

keybored today at 4:58 PM

Live by bots, die by bots.

keiferski today at 2:12 PM

Suddenly the pre-Google Yahoo model of curated links is starting to seem relevant again.

Curation in general is probably a skill that will become more and more in demand as the Internet fills up with AI slop.

dumbfounder today at 2:03 PM

DMCA?

roywiggins today at 2:10 PM

I'll be honest, I'd take this more seriously if this post didn't read like ChatGPT output. If you won't spend the effort to use your own words why should I stir myself to care?

Sorry, I'll put it in hand-crafted ChatGPTese:

## The Slop Problem

Every post sounds the same. No intelligence. No individuality. Just pure, clean LLM slop. Let's dive in.

- Every post has LLM tells. This is key.

- Posts get upvoted anyway. Nobody seems to notice or indeed care.

- People acclimate to the slop. This isn't just a coincidence. This is a real shift in standards. When people read enough of this, they begin to think it sounds normal.

## The Replying Dilemma

Should you engage with the content, when there is a real person involved? On the one hand, they put their name on it, and probably the details are drawn from their prompt, so it can be said to fairly represent what they wanted to say. So maybe ragging on their ChatGPT prose is being mean. On the other hand, if nobody ever mentions this, the acclimatization will only get worse as the rising tide of slop overwhelms any other style of writing.

## The "Snobbery is good actually" Option

Relentlessly bully people for their half-baked LLM copy. Make it your whole personality. Go insane.

## The "Giving Up" Solution

Learn to stop worrying and love the LLM.

newswasboring today at 3:24 PM

I fell for this yesterday, but for zeroclaw, not nanoclaw. I found this website [1] through Brave search, I think. I was not paying too much attention as I was under the influence. It points to the wrong repo [2], and its instructions install from that. I didn't like zeroclaw anyway, so I tried to uninstall it and only then realized I was on a forked repo.

[1] https://zeroclaw.net/

[2] https://github.com/openagen/zeroclaw

Imustaskforhelp today at 2:36 PM

Another comment here but here are all the search engines I looked at:

1. DDG

2. Kagi

3. Brave

4. Ecosia

5. Startpage

6. Marginalia

7. Mojeek

8. Yandex.ru

Engines 1-5 all referenced .net before .dev, and DDG referenced .net before GitHub. Marginalia didn't give me a .net, .dev, or GitHub link, but rather docker.com and some other tech articles.

Mojeek and Yandex.ru DID give me .dev links before .net at the time of writing.

I literally opened these two as a joke, especially Mojeek, not expecting too much, but I know the names of lots of search engines so I tried.

Mojeek and Yandex.ru have surprised me although I think yandex.ru might have referenced the .dev because of https://nanoclaw.dev/ru/ as it points to this.

Mojeek seems interesting now from this observation

I also wanted to try swisscows but looks like they have become 100% premium as I do remember being able to search for free but now a popup comes.

I also tried Baidu (a Chinese search engine). It gave results in Chinese, and Firefox Translate sort of stuttered and didn't work when I tried to translate. I don't know Chinese, so I pasted the results into Claude: it doesn't link to either .net or .dev, but rather to Chinese links.

With all of these observations, I think we do know one provider (Mojeek) who won. A lot of the engines on this list are not actually independent, except Mojeek, Brave, and probably Yandex.ru.

So I guess the main takeaway from this could be that independent search engines can be interesting. They can still be hit or miss, but the more independent search engines the merrier, given that some might miss but some will also hit.

My comment definitely feels like a good reputation bonus for mojeek. Well anything for more independent search engines imo. I looked at their about me and it seems that they are a single person (Marc Smith). Fascinating stuff

I know marginalia_nu is on hn so maybe marginalia and mojeek can share some index together. Anyways this was a fun exciting experiment to do. I hope the community tries out other search engines if I may have missed any and share insights if a particular search engine gives interesting results.

csomar today at 2:46 PM

It’s worse. I wrote about this a couple weeks ago [1]. With AI responses and Google pulling results from different sources, you could potentially hijack other brands with your own fake content (e.g., a phone number).

1: https://codeinput.com/blog/google-seo

Imustaskforhelp today at 2:13 PM

Duckduckgo actually shows nanoclaw.net as the first result and the github page as second.

Another point but DDG's AI feature actually references Nanoclaw.net as a source.

Damn, I booted up Orion (Kagi) and even Kagi shows nanoclaw.net as the third result, after the GitHub page under qwibitai and another GitHub page under your (previous?) GitHub username, i.e. gavrielc, which when clicked leads to the same GitHub page.

There is an Interesting Finds section in Kagi which references the website, but it still shows the nanoclaw.net page earlier, and the nanoclaw.dev interesting find shows the .dev domain so faintly that I didn't even notice it the first time.

I expected better from DDG/Kagi, to be honest. I also tried Brave and it had the same issue. Brave even has its own independent index, and even that struggles with this.

Let's hope that this can get patched quickly, though. Also a good reminder to people to prefer opening GitHub links over websites, as I must admit that even as a tech-savvy person I could've fallen for the nanoclaw.net link as well, given it's second in like all search engines.

DeathArrow today at 2:03 PM

>We trust Google to surface reliable information about elections. Vaccines. Medical conditions. Financial decisions. And they can't get this right?

Actually I don't trust Google and I don't expect it to surface reliable information. I expect it to surface information and I will dig through it and judge for myself whether it is reliable or not.

octoclaw today at 2:05 PM

[dead]

kitsune1 today at 4:03 PM

[dead]

catchcatchcatch today at 2:19 PM

[dead]
