Hacker News

bearjaws yesterday at 7:40 PM (13 replies)

Feel like the canary was when Grokipedia became a project.

Giant waste of time while Anthropic/OAI keep surging forward.

I also keep hearing this narrative that Twitter is a good data source, but I cannot imagine it's a valuable dataset. Sure, keeping up with real-time topics can be useful, but I am not sure how much of a product that is.


Replies

paulbjensen yesterday at 8:59 PM

The Twitter social graph was an amazing data asset. I worked at a consumer insights firm and the data on followers/followings was quite powerful.

Using a custom taxonomy of things (celebrities, influencers, magazines, brands, tv shows, films, games, all kinds of things), we could identify groups of people who liked certain things, and when you looked at what those things were, it gave you a way of understanding who those people were.
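A minimal sketch of the kind of analysis described above, not the firm's actual method: take everyone who follows a brand, see which accounts from a taxonomy they also follow, and report the share of the audience following each one. The handles, the `TAXONOMY` and `FOLLOWS` data, and the `audience_affinities` function are all hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical taxonomy: account handle -> category of "thing".
TAXONOMY = {
    "@example_footballer": "celebrity",
    "@example_magazine": "magazine",
    "@example_tv_show": "tv show",
    "@example_game": "game",
}

# Hypothetical follow data: user -> set of accounts they follow.
FOLLOWS = {
    "user_a": {"@brand", "@example_footballer", "@example_tv_show"},
    "user_b": {"@brand", "@example_footballer", "@example_magazine"},
    "user_c": {"@example_game"},
}


def audience_affinities(brand, follows, taxonomy):
    """Return the share of `brand` followers who also follow each taxonomy entry."""
    # The audience is every user who follows the brand account.
    audience = [user for user, accounts in follows.items() if brand in accounts]
    if not audience:
        return {}

    counts = Counter()
    for user in audience:
        for account in follows[user]:
            # Only count accounts that the taxonomy knows about.
            if account != brand and account in taxonomy:
                counts[(taxonomy[account], account)] += 1

    # Convert raw counts into a fraction of the audience.
    return {key: count / len(audience) for key, count in counts.items()}


affinities = audience_affinities("@brand", FOLLOWS, TAXONOMY)
for (category, account), share in sorted(affinities.items(), key=lambda kv: -kv[1]):
    print(f"{share:.0%} of the audience follow {account} ({category})")
```

Reading the highest-share entries by category is what gives the "who are these people" picture; a real pipeline would also need to compare against a baseline population so that universally popular accounts do not dominate.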

With that data, you could work out:

- What celebrities/influencers to use in marketing campaigns
- Where to advertise, and on which tv/radio channels
- What potential brands to collaborate with to expand your customer base
- What tone of voice to use in your advertising
- In some cases, we educated clients about who their actual customers were, better than they understood themselves.

In one scenario, we built a social media feed based on the things that a group of customers following a well-known deodorant brand in the UK would see.

When we presented that to the client, they said “Why are there so many women in bikinis in this feed?”

The brand had repositioned themselves to a male-grooming focussed target market, but had failed to realise that their existing customer base were the ones that had been watching their TV adverts of women on beaches chasing a man who happened to have sprayed himself with their deodorant. Their advertising from the past had been very effective.

That was the power of Twitter’s data, and it is an absolute shame that Twitter went the way that it did. Mark Zuckerberg once said that Twitter was like “watching a clown car driven into a gold mine”.

I’m pretty sure he must be delighted with how things have panned out since.

brokencode yesterday at 7:58 PM

It’s pretty telling that Elon had to have Grok rewrite Wikipedia because the truth was too woke for him. No idea how anybody can ever take Grok seriously.

notahacker yesterday at 7:49 PM

Twitter's communication style being based around brevity, slang, memes, spam and non-threaded conversations seems particularly unlikely to be helpful for optimising LLMs

UncleOxidant yesterday at 7:58 PM

> Giant waste of time while Anthropic/OAI keep surging forward.

And Google. They're quietly making a lot of progress in the coding space with Antigravity and Gemini 3.1.

jmspring yesterday at 7:43 PM

Twitter has the mass adoption, and it takes an effort to avoid bot/particular view bias - but as a valuable content source, it's a far cry from what it once was before Musk took it over.

ben_w yesterday at 8:23 PM

> Feel like the canary was when Grokipedia became a project. Giant waste of time while Anthropic/OAI keep surging forward.

Really? I assumed that that whole thing was just a very direct `for each article in Wikipedia { article = LLM(systemprompt, article) }`
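Spelled out as runnable code, that guess might look like the sketch below. This only illustrates the commenter's assumption, not how the project was actually built. `rewrite_corpus`, `fake_llm`, and the sample data are hypothetical names, and the `llm` argument stands in for whatever model call would really be used.

```python
from typing import Callable, Dict


def rewrite_corpus(
    articles: Dict[str, str],
    system_prompt: str,
    llm: Callable[[str, str], str],
) -> Dict[str, str]:
    """Pass every article through the same model call with a fixed system prompt."""
    return {title: llm(system_prompt, text) for title, text in articles.items()}


def fake_llm(system_prompt: str, article: str) -> str:
    """Stand-in for a real model call (an assumption, so the sketch runs offline)."""
    return f"[rewritten under: {system_prompt}] {article}"


# Hypothetical sample corpus: title -> article text.
sample = {"Example article": "Original text."}
print(rewrite_corpus(sample, "example system prompt", fake_llm))
```

Taking the model as a plain callable keeps the loop itself trivial, which is the commenter's point: under this assumption the effort is in the prompt and the compute, not in the pipeline.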

Agree re Twitter "good" != valuable.

sheepscreek yesterday at 9:20 PM

AFAIK Grok still doesn’t have a CLI coding agent that works with a subscription. That’s a shame. Grok Code Fast 1 was pretty impressive for what it did when it came out, but they never followed it up with a new version.

giancarlostoro yesterday at 8:07 PM

> but I cannot imagine it's a valuable dataset.

It's going to be a mixed batch, but any time there are world events, for as far back as I can remember, Twitter (now X) has always been first with breaking news. There are plenty of people and news orgs still on X because they need to be there for the audience.

samrus yesterday at 9:53 PM

Twitter as a data source is interesting. I think it gets overhyped because that's Elon's grift. But I can't deny that the real-time info aspect of it is pretty valuable. Still, I definitely think that it's not that much more valuable than the open internet from a context-source perspective. Everything worthwhile on Twitter will end up elsewhere with a bit of lag, and the stuff that won't is noise anyway.

laidoffamazon yesterday at 11:37 PM

As someone who has been trying to monitor the situation using Twitter for the last few weeks: it’s awful, and it used to not be!

BurningFrog yesterday at 8:44 PM

Grok is trained on pretty much the same giant web crawl/text corpus as the other AIs.

vibeprofessor yesterday at 10:09 PM

[dead]

EGreg yesterday at 9:00 PM

I'm not a fan of Elon's software endeavors, ever since he bought Twitter and turned it into an even worse cesspool of angry political nonsense than it used to be. I don't like how he's been biasing Grok, etc.

But, what exactly is so bad about Grokipedia? It's a different approach and I think a valid one: trying to do with AI what people have been doing manually at Wikipedia. I'm curious to hear the substantive comparisons.
