icyfox • yesterday at 4:52 PM • 14 replies

Exactly half of these HN usernames actually exist. So either there are enough people on HN that follow common conventions for Gemini to guess from a more general distribution, or Gemini has memorized some of the more popular posters. The ones that are missing:

- aphyr_bot
- bio_hacker
- concerned_grandson
- cyborg_sec
- dang_fan
- edge_compute
- founder_jane
- glasshole2
- monad_lover
- muskwatch
- net_hacker
- oldtimer99
- persistence_is_key
- physics_lover
- policy_wonk
- pure_coder
- qemu_fan
- retro_fix
- skeptic_ai
- stock_watcher

Huge opportunity for someone to become the actual dang fan.
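
For anyone who wants to repeat this kind of check, here is a minimal sketch. It assumes the public HN Firebase API, which returns `null` for a user ID that does not exist. The function names are hypothetical, and this is not necessarily how the list above was produced. User IDs are case-sensitive, so names should be passed exactly as written.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterable

# Public HN API endpoint for a single user; responds with JSON `null`
# when no such user exists.
HN_USER_URL = "https://hacker-news.firebaseio.com/v0/user/{}.json"


def hn_user_exists(username: str) -> bool:
    """Return True if the HN API has a record for this exact username."""
    url = HN_USER_URL.format(urllib.parse.quote(username, safe=""))
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp) is not None


def split_by_existence(
    usernames: Iterable[str],
    exists: Callable[[str], bool] = hn_user_exists,
) -> tuple[list[str], list[str]]:
    """Partition usernames into (found, missing), preserving input order.

    `exists` is injectable so the logic can be exercised without network access.
    """
    found: list[str] = []
    missing: list[str] = []
    for name in usernames:
        (found if exists(name) else missing).append(name)
    return found, missing
```

Calling `split_by_existence(names)` with the usernames from the generated page makes one request per name. The result can change over time, since anyone can register a missing name later.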

Replies

giancarlostoro • yesterday at 5:10 PM

Before the AI stuff, Google had those pop-up quick answers when googling. About three years ago I googled something, saw the answer, and realized it was sourced from HN. Clicked the link, and lo and behold, I had answered my own question. Look mah! I'm on Google! So I am not surprised at all that Google crawls HN enough to have it in their LLM.

I did chuckle at the 100% Rust Linux kernel. I like Rust, but that felt like a clever joke by the AI.

QuantumNomad_ • yesterday at 5:27 PM

ziggy42 is both a submitter of a story on the actual front page at the moment, and also in the AI generated future one.

See other comment where OP shared the prompt. They included a current copy of the front page for context. So it’s not so surprising that ziggy42 for example is in the generated page.

And for other usernames that are real but not currently on the home page, the LLM definitely has plenty of occurrences of HN comments and stories in its training data, so it’s not really surprising that it is able to include real usernames of people who post a lot. Their names will occur over and over in the training data.

joaogui1 • yesterday at 5:19 PM

HN has been used to train LLMs for a while now. I think it was even in the Pile.

maxglute • yesterday at 9:47 PM

You can straight up ask Google to look up the post history of Reddit or Hacker News users. Some of it is probably just via search, because it's very recent, as in the last few days. Some of the older corpus includes deleted comments, so they must be scraping from Reddit archive APIs too, or using that deprecated Google history cache.

never_inline • yesterday at 5:56 PM

This is definitely based on a search or page fetch, because it includes these, which are all today's topics:

- IBM to acquire OpenAI (Rumor) (bloomberg.com)

- Jepsen: NATS 4.2 (Still losing messages?) (jepsen.io)

- AI progress is stalling. Human equivalence was a mirage (garymarcus.com)

vitorgrs • today at 12:15 AM

It does memorize. But that's not really news. I remember ChatGPT 3.5 or the old 4.0 remembering some users on some subreddits, even naming the top users for each subreddit.

The thing is, most of the models were heavily post-trained to limit this...

DANmode • yesterday at 7:15 PM

What % of today’s front page submissions are from users that have existed 5-10 years+?

(Especially in datasets before this year?)

I’d bet half or more - but I’m not checking.

atrus • yesterday at 5:07 PM

So many underscores for usernames, and yet, other than a newly created account, there was 1 other username with an underscore.

hurturue • yesterday at 5:27 PM

either you only notice the xxx_yyy frequent posters or it's quite interesting that so many have this username format

AceJohnny2 • yesterday at 8:00 PM

Aw, I was actually a bit disappointed by how on the nose the usernames were relative to their postings. Like the "Rust Linux Kernel" by rust_evangelist, "Fixing Lactose Intolerance" by bio_hacker, fixing a 2024 Framework by retro_fix, etc...

skywhopper • yesterday at 6:33 PM

That’s a lot more underscores than the actual distribution (I counted three users with underscores in their usernames among the first five pages of links atm).
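
To put a number on that comparison, here is a tiny sketch that computes the share of usernames containing an underscore. The helper name is hypothetical, and the lists you feed it (generated page vs. real front pages) have to be collected separately.

```python
from typing import Iterable


def underscore_share(usernames: Iterable[str]) -> float:
    """Return the fraction of usernames that contain at least one underscore."""
    names = list(usernames)
    if not names:
        return 0.0  # avoid dividing by zero on an empty list
    return sum("_" in name for name in names) / len(names)
```

Running it once on the generated usernames and once on usernames scraped from the real front pages gives two fractions that can be compared directly.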

dang_fan • yesterday at 4:59 PM

[dead]

bio_hacker • today at 2:08 AM

[dead]
