johnfn • yesterday at 12:39 AM • 37 replies • view on HN

Some comments:

- This is a really remarkable graph. I just didn't realize how thoroughly it was over for SO. It stuns me as much as when Encyclopædia Britannica stopped selling print versions a mere 9 years after the publication of Wikipedia, but at an even faster timescale.

- I disagree with most comments that the brusque moderation is the cause of SO's problems, though it certainly didn't help. SO has had poor moderation from the beginning. The fundamental value proposition of SO is getting an answer to a question; if you can get the same answer faster, you don't need SO. I suspect that the gradual decline, beginning around 2016, is due to growth in a number of other sources of answers. Reddit is kind of a dark horse here, as I began seeing answers on Google to more modern technical questions link to a Reddit thread frequently along with SO from 2016 onwards. I also suspect Discord played a part, though this is harder to gauge; I certainly got a number of answers to questions for, e.g., Bun, by asking around in the Bun Discord, etc. The final nail in the coffin is of course LLMs, which can offer a SO-level answer to a decent percentage of questions instantly. (The fact that the LLM doesn't insult you is just the cherry on top.)

- I know I'm beating a dead horse here, but what happens now? Despite the stratification I mentioned above, SO was by far the leading source of high quality answers to technical questions. What do LLMs train off of now? I wonder if, 10 years from now, LLMs will still be answering questions that were answered in the halcyon 2014-2020 days of SO better than anything that came after? Or will we find new, better ways to find answers to technical questions?

Replies

Aurornis • yesterday at 2:34 AM

> I disagree with most comments that the brusque moderation is the cause of SO's problems, though it certainly didn't help. SO has had poor moderation from the beginning.

I was an early SO user and I don’t agree with this.

The moderation was always there, but from my perspective it wasn’t until the site really pushed into branching out and expanding Stack Exchange across many topics to become a Quora-style competitor that the moderation started taking on a life of its own. Stack Overflow moderator drama felt constant in the later 2010s, with endless weird drama spilling across Twitter, Reddit, and the moderators’ personal blogs. That’s about the same time period where it felt like the moderation team was more interested in finding reasons to exercise their moderation power than in maintaining an interesting website.

Since about 2020, every time I click a Stack Overflow link I estimate there’s a 50/50 chance that the question will have been marked as off topic or closed or something before anyone could answer it. Between the moderator drama and the constant bait-and-switch feeling of clicking on SO links that didn’t go anywhere, the site just felt more exhausting than helpful.

josephg • yesterday at 1:49 AM

> The fundamental value proposition of SO is getting an answer to a question

I read an interview once with one of the founders of SO. They said the main value stackoverflow provided wasn't to the person who asked the question. It was for the person who googled it later and found the answer. This is why all the moderation pushes toward deleting duplicates of questions, and having a single accepted answer. They were primarily trying to make google searches more effective for the broader internet. Not provide a service for the question-asker or answerer.

Sad now though, since LLMs have eaten this pie.

omneity • yesterday at 12:50 AM

Thinking from first principles, a large part of the content on stack overflow comes from the practical experience and battle scars worn by developers sharing them with others and cross-curating approaches.

Privacy concerns notwithstanding, one could argue that having LLMs with us every step of the way (coding agents, debugging, devops tools, etc.) will turn them into a shared interlocutor with vast swaths of experiential knowledge, collected and redistributed at an even larger scale than SO and forum-style platforms allow for.

It does remove the human touch, so it's quite a different dynamic, and the amount of data to collect is staggering and challenging from a legal point of view, but I suspect a lot of the knowledge used to train LLMs in the next ten years will come from large-scale telemetry and millions of hours in RL self-play where LLMs learn to scale and debug code from fizzbuzz to facebook and twitter-like distributed systems.

brunoborges • yesterday at 1:13 AM

As long as software is properly documented, and documentation is published in LLM-friendly formats, LLMs may be able to answer most of the beyond basic questions even when docs don't explicitly cover a particular scenario.

Take an API for searching products, one for getting product details, and then an API for deleting a product.

The documentation does not need to cover the detailed scenario of "How to delete a product" where the first step is to search, the second step is to get the details (get the ID), and the third step is to delete.

The LLM is capable of answering the question "how to delete the product 'product name'".

To some degree, many of the questions on SO were beyond basic, but still possible for a human to answer if only they read documentation. LLMs just happen to be capable of reading A LOT of documentation a LOT faster, and then coming up with an answer A LOT faster.
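The three-step flow described above can be sketched as follows. This is only an illustration: `ProductCatalog`, its `search`, `get_details`, and `delete` methods, and the helper `delete_product_by_name` are hypothetical names standing in for the three documented endpoints, backed by an in-memory dictionary rather than any real API.

```python
from dataclasses import dataclass


@dataclass
class Product:
    id: int
    name: str


class ProductCatalog:
    """Hypothetical in-memory stand-in for the three documented endpoints."""

    def __init__(self, products):
        self._products = {p.id: p for p in products}

    def search(self, query):
        # "Search products" endpoint: returns the IDs of matching products.
        q = query.lower()
        return [p.id for p in self._products.values() if q in p.name.lower()]

    def get_details(self, product_id):
        # "Get product details" endpoint.
        return self._products[product_id]

    def delete(self, product_id):
        # "Delete product" endpoint.
        del self._products[product_id]


def delete_product_by_name(catalog, name):
    """Compose the three calls: search, fetch details to confirm the ID, delete."""
    for product_id in catalog.search(name):
        details = catalog.get_details(product_id)
        if details.name == name:  # only delete an exact name match
            catalog.delete(details.id)
            return True
    return False


catalog = ProductCatalog([Product(1, "Blue Widget"), Product(2, "Red Widget")])
deleted = delete_product_by_name(catalog, "Red Widget")
remaining = catalog.search("widget")
```

None of the three methods documents "delete by name" on its own; the composition is what has to be inferred, which is the kind of step the comment argues an LLM can work out from the docs.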

m-schuetz • yesterday at 5:20 AM

> I disagree with most comments that the brusque moderation is the cause of SO's problems

The moderation was precisely the reason I stopped using stackoverflow and started looking for answers and asking questions elsewhere. It was nearly impossible to ask anything without someone replying "Why would you even want to do that, do <something completely different that does not solve my problem> instead!". Or someone claiming it's a duplicate and you should use that ancient answer from another question that 1) barely fits and doesn't solve my problem and 2) is so outdated, it's no longer useful.

Whenever I had to ask something, I had to add a justification as to why I have to do it that way and why previous posts do not solve the issue, and that took more space than the question itself.

I certainly won't miss SO.

emodendroket • yesterday at 3:32 AM

If we're going to diagnose pre-AI Stack Overflow problems I see two obvious ones:

1. The attempt to cut back on the harshness of moderation meant letting through more low-quality questions.

2. More importantly, a lot of the content is just stale. Like you go to some question and the accepted answer with the most votes is for a ten-year-old version of the technology.

shevy-java • yesterday at 2:44 AM

> The fundamental value proposition of SO is getting an answer to a question

But the horrible moderation was in part a reason why many SO questions had no answers.

I am not saying poor moderation caused all of this, but it contributed negatively, and many people were pissed at that and stopped using SO. It is not the only reason SO declined; there are many reasons for SO's failure after its peak days.

zahlman • yesterday at 5:55 AM

> I disagree with most comments that the brusque moderation is the cause of SO's problems, though it certainly didn't help. SO has had poor moderation from the beginning.

Overwhelmingly, people consider the moderation poor because they expect to be able to come to the site and ask things that are well outside of the site's mission. (It's also common to attribute community actions to "moderators" who in reality have historically done hardly any of it; the site simply didn't scale like that. There have been tens of millions of questions, versus a couple dozen moderators.)

The kinds of questions that people are getting quick, accurate answers for from an LLM are, overwhelmingly, the sort of thing that SO never wanted. Generally because they are specific to the person asking: either that person's issue won't be relevant to other people, or the work hasn't been done to make it recognizable by others.

And then of course you have the duplicates. You would not believe the logic some people put forward to insist that their questions are not duplicate; that they wouldn't be able, in other words, to get a suitable answer (note: the purpose is to answer a question, not solve a problem) from the existing Q&A. It is as though people think they are being insulted when they are immediately given a link to where they can get the necessary answer, by volunteers.

I agree that Reddit played a big role in this. But not just by answering questions; by forming a place where people who objected to the SO content model could congregate.

Insulting other users is and always has been against the Stack Overflow Code of Conduct. The large majority of insults, in my experience, come from new users who are upset at being politely asked to follow procedures or told that they aren't actually allowed to use the site the way they're trying to. There have been many duplicate threads on the meta site about why community members (with enough reputation) are permitted to cast close votes on questions without commenting on what is wrong. The consensus: close reasons are usually fairly obvious; there is an established process for people to come to the meta site to ask for more detailed reasoning; and comments aren't anonymous, so commenting makes oneself a target.

sotix • yesterday at 2:46 PM

> I disagree with most comments that the brusque moderation is the cause of SO's problems, though it certainly didn't help.

By the time my generation was ready to start using SO, the gatekeeping was so severe that we never began asking questions. Look at the graph. The number of questions was in decline before 2020. It was already doomed because it lost the plot and killed any valuable culture. LLMs were a welcome replacement for something that was not fun to use. LLMs are an unwelcome replacement for many other things that are a joy to engage with.

andirk • yesterday at 12:48 AM

That "Dead Internet" phrase keeps becoming more likely, and this graph shows that. Human-to-human interactions, LLMs using those interactions, fewer human-to-human interactions because of that, LLMs using... ?

chrischen • yesterday at 7:12 AM

This doesn't mean that it's over for SO. It just means we'll probably trend towards more quality over quantity. Measuring SO's success by the number of questions asked is like measuring code quality by lines of code. Eventually SO would trend down simply because advances in search technology help users find existing answers rather than ask new questions. It just so happened that AI advances made it even better (in terms of not needing to ask redundant questions).

jasonfarnon • yesterday at 8:25 AM

"I suspect that the gradual decline, beginning around 2016, is due to growth in a number of other sources of answers."

I think at least one other reason is that a lot of the questions were already posted. There are only so many questions of interest, until a popular new technology comes along. And if you look at mathoverflow (which wouldn't have the constant shocks from new technologies) the trend is pretty stable...until right around 2022. And even since then, the dropoff isn't nearly so dramatic. https://data.stackexchange.com/mathoverflow/query/edit/19272...

timcobb • yesterday at 12:42 AM

> I wonder if, 10 years from now, LLMs will still be answering questions that were answered in the halcyon 2014-2020 days of SO better than anything that came after?

I've wondered this too and I wonder if the existing corpus plus new GitHub/doc site scrapes will be enough to keep things current.

noduerme • yesterday at 1:53 PM

>>what happens now?

I'll tell you what happens now: LLMs continue to regurgitate and iterate and hallucinate on the questions and answers they ingested from S.O. - 90% of which are incorrect. LLM output continues to poison itself as more and more websites spring up recycling outdated or incorrect answers, and no new answers are given since no one wants to waste the time to ask a human a question and wait for the response.

The overall intellectual capacity sinks to the point where everything collaboratively built falls apart.

The machines don't need AGI to take over, they just need to wait for us to disintegrate out of sheer laziness, sloth and self-righteous.... /okay.

There was always a needy component to Stack Overflow. "I have to pass an exam, what is the best way to write this algorithm?" and shit like that. A lazy component. But to be honest, it was the giving of information which forced you to think, and research, and answer correctly, which made systems like S.O. worthwhile, even if the questioners were lazy idiots sometimes. And now, the apocalypse. Babel. The total confusion of all language. No answer which can be trusted, no human in the loop, not even a smart AI, just a babbling set of LLMs repeating Stack Overflow answers from 10 years ago. That's the fucking future.

Things are gonna slide / in all directions / won't be nothin you can measure anymore. The blizzard of the world has crossed the threshold and it's overturned the order of the soul.[0]

[0] https://www.youtube.com/watch?v=8WlbQRoz3o4

BigParm • today at 3:08 AM

The LLMs will learn from our interactions with them. That's why they're often free

brudgers • yesterday at 4:57 AM

[delayed]

cyberrock • yesterday at 8:43 AM

There's another significant forum: GitHub, the rise of which coincided with the start of SO's decline. I bet most niche questions went over to GH repos' issue/discussion forums, and SO was left with more general questions that bored contributors.

jlarocco • yesterday at 3:22 PM

> - I know I'm beating a dead horse here, but what happens now? Despite the stratification I mentioned above, SO was by far the leading source of high quality answers to technical questions. What do LLMs train off of now? I wonder if, 10 years from now, LLMs will still be answering questions that were answered in the halcyon 2014-2020 days of SO better than anything that came after? Or will we find new, better ways to find answers to technical questions?

To me this shows just how limited LLMs are. Hopefully more people realize that LLMs aren't as useful as they seem, and in 10 years they're relegated to sending spam and generating marketing websites.

furyofantares • yesterday at 6:06 PM

> The fundamental value proposition of SO is getting an answer to a question; if you can get the same answer faster, you don't need SO.

Plus they might find the answer on SO without asking a new question - You probably would expect the # of new questions to peak or plateau even if the site wasn't dying, due to the accumulation of already-answered questions.

m463 • yesterday at 1:28 AM

Too bad stack overflow didn't high-quality-LLM itself early. I assume it had the computer-related brainpower.

with respect to the "moderation is the cause" thing... Although I also don't buy moderation as the cause, I wonder if any sort of friction from the "primary source of data" can cause acceleration.

for example, when I'm doing an internet search for the definition of a word like buggywhip, some search results from the "primary source" show:

> buggy whip, n. meanings, etymology and more | Oxford English Dictionary

> Factsheet What does the noun buggy whip mean? There is one meaning in OED's entry for the noun buggy whip. See 'Meaning & use' for definition, usage, and quotation evidence.

which are non-answers meant to keep their traffic.

but the AI answer is... the answer.

If SO early on had had some clear AI answer + references, I think that would have kept people on their site.

weatherlite • yesterday at 7:07 AM

> What do LLMs train off of now? I wonder if, 10 years from now, LLMs will still be answering questions that were answered in the halcyon 2014-2020 days of SO better than anything that came after? Or will we find new, better ways to find answers to technical questions?

That's a great question. I have no idea how things will play out now: do models become generalized enough to handle "out of distribution" problems or not? If they don't, then I suppose a few years from now we'll get an uptick in Stackoverflow questions; the website will still exist, it's not going anywhere.

sgc • yesterday at 1:47 AM

The newer questions that LLMs can't answer will be answered in forums - either SO, reddit, or elsewhere. There will be a much higher percentage of relevant content with far fewer new pages regurgitating questions about solved problems. So the LLMs will be able to keep up.

nikhizzle • yesterday at 12:54 AM

I think the interesting thing here for those of us who use open source frameworks is that we can ask the LLM to look at the source to find the answer (eg. Pytorch or Phoenix in my case). For closed source libraries I do not know.

dleeftink • yesterday at 4:40 AM

Instead of having chat-interfaces target single developers, moving towards multiplayer interfaces may bring back some of what has been lost--looping in experts or third-party knowledge when a problem is too tough to tackle via agentic means.

Now all our interactions are neatly kept in personalised ledgers, bounded and isolated from one another. Whether by design or by technical infeasibility, the issue remains that knowledge becomes increasingly bounded too instead of collaborative.

maplethorpe • yesterday at 9:43 AM

> will we find new, better ways to find answers to technical questions?

I honestly don't think they need to. As we've seen so far, for most jobs in this world, answers that sound correct are good enough.

Is chasing more accuracy a good use of resources if your audience can't tell the difference anyway?

DrSiemer • yesterday at 9:28 AM

We'll get to the point where we can mass moderate core knowledge eventually. We may need to hand out extra weight for verified experts and some kind of most-votes-win type logic (perhaps even comments?), but live training data updates will be a massive evolution for language models.
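One way to picture the "extra weight for verified experts" plus "most-votes-win" idea is the minimal sketch below. Everything in it is an assumption made for illustration: the function name `pick_winning_answer`, the `(answer_id, is_verified_expert)` vote format, and the default `expert_weight` of 3.0 are hypothetical and not taken from any existing system.

```python
from collections import defaultdict


def pick_winning_answer(votes, expert_weight=3.0):
    """Return the answer with the highest weighted vote total.

    votes: iterable of (answer_id, is_verified_expert) pairs (hypothetical format).
    """
    totals = defaultdict(float)
    for answer_id, is_expert in votes:
        # A verified expert's vote counts expert_weight times; others count once.
        totals[answer_id] += expert_weight if is_expert else 1.0
    if not totals:
        return None
    # Highest total wins; on a tie, the smallest answer_id wins for determinism.
    return max(sorted(totals), key=lambda a: totals[a])


votes = [("a", False), ("a", False), ("b", True)]
winner_weighted = pick_winning_answer(votes)
winner_unweighted = pick_winning_answer(votes, expert_weight=1.0)
```

With the default weight, the single expert vote for "b" outweighs two ordinary votes for "a"; with `expert_weight=1.0` it reduces to a plain majority vote.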

rapidfl • yesterday at 1:28 AM

> SO was by far the leading source of high quality answers to technical questions

We will arrive at most answers by talking to an LLM. Many of us have an idea about what we want. We relied on SO for some details/quirks/gotchas.

Example of a common SO question: how to do x in a library or language or platform? Maybe post on the Github for that lib. Or forums: there are quirky systems like Salesforce or Workday which have robust forums, and those forums are still much more effective than LLMs.

joe_the_user • yesterday at 1:38 AM

I don't think "good moderation or not" really touches what was happening with SO.

I joined SO early and it had a "gamified" interface that I actually found fun. Putting in effort and such, I was able to slowly gain karma.

The problem was that as the site scaled, the competition to answer a given question became more and more intense, and that made it miserable. I left at that point, but I think a lot of people stayed with a dynamic that was extremely unhealthy (and the quality of accepted questions declined also).

With all this, the moderation criteria didn't have to directly change, it just had to fail to deal with the effects that were happening.

znpy • yesterday at 5:38 AM

> I disagree with most comments that the brusque moderation is the cause of SO's problems

Just to add another personal data point: I started posting on StackOverflow well before LLMs were a thing, and moderation instantly turned me off and I immediately stopped posting.

Moderators used to edit my posts and reword what I wrote, which is unacceptable. My posts were absolutely peaceful and not inflammatory.

Moderation was an incredible problem for stack overflow.

camhart • yesterday at 5:20 AM

I stopped because of moderators. They literally killed the site for me.

xz0r • yesterday at 7:46 AM

> I disagree with most comments that the brusque moderation is the cause of SO's problems

Questions asked on SO that got downvoted by the heavy handed moderation would have been answered by LLMs without any of the flak whatsoever.

Those who had downvoted others' questions on SO for not being good enough must be asking a lot of such not-good-enough questions to an LLM today.

Sure, the SO system worked, but it was user hostile and I'm glad we all don't have to deal with it anymore.

cletus • yesterday at 5:30 AM

As an early user of SO [1], I feel reasonably qualified to discuss this issue. Note that I barely posted after 2011 or so, so I can't really speak to the current state.

But what I can say is that even back in 2010 it was obvious to me that moderation was a problem, specifically a cultural problem. I'm really talking about the rise of the administrative/bureaucratic class that, if left unchecked, can become absolute poison.

I'm constantly reminded of the Leonard Nimoy voiced line from Civ4: "the bureaucracy is expanding to meet the needs of the expanding bureaucracy". That sums it up exactly. There is a certain type of person who doesn't become a creator of content but rather a moderator of content. These are people who end up as Reddit mods, for example.

Rules and standards are good up to a point but some people forget that those rules and standards serve a purpose and should never become a goal unto themselves. So if the moderators run wild, they'll start creating work for themselves and having debates about what's a repeated question, how questions and answers should be structured, etc.

This manifested as the war of "closed, non-constructive" on SO. Some really good questions were killed this way because the moderators decided on their own that a question had to have a provable answer to avoid flame wars. And this goes back to the rules and standards being a tool not a goal. My stance was (and is) that shouldn't we solve flame wars when they happen rather than going around and "solving" imaginary problems?

I lost that battle. You can argue that questions like "should I use Javascript or Typescript?" don't belong on SO (as the moderators did). My position was that even though there's no definite answer, somebody can give you a list of strengths and weaknesses and things to consider.

Even something that does have a definite answer like "how do I efficiently code a factorial function?" has multiple but different defensible answers. Even in one language you can have multiple implementations that might, say, be compile-time or runtime.
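As a rough illustration of "multiple defensible answers" within one language, here are two Python variants: one computes the result with a loop on every call, the other looks it up in a table built once at import time (the closest Python analogue to a compile-time table). The names `factorial_runtime` and `factorial_table` and the table limit of 20 are arbitrary choices made for this sketch.

```python
def factorial_runtime(n):
    """Compute n! with a loop each time it is called."""
    if n < 0:
        raise ValueError("n must be non-negative")
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


# Built once when the module is imported; later calls are a plain list lookup.
_MAX_N = 20
_TABLE = [1] * (_MAX_N + 1)
for _k in range(1, _MAX_N + 1):
    _TABLE[_k] = _TABLE[_k - 1] * _k


def factorial_table(n):
    """Look up n! in the precomputed table (valid only for 0 <= n <= 20)."""
    if not 0 <= n <= _MAX_N:
        raise ValueError("n out of table range")
    return _TABLE[n]
```

Which one counts as "efficient" depends on the caller: the loop handles any non-negative `n`, while the table trades a fixed range for constant-time lookups.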

Another commenter here talked about finding the nearest point on an ellipse and came up with a method they're proud of where there are other methods that would also do the job.

Anyway, I'd occasionally log in and see a constant churn on my answers from moderators doing pointless busywork, as this month they'd decided something needed to be capitalized or not capitalized.

A perfect example of this kind of thing is Bryan Henderson's war on "comprised of" on Wikipedia [2].

Anyway, I think the core issue of SO was that there was a lot of low-hanging fruit and I got a lot of accepted answers on questions that could never be asked today. You'll also read many anecdotes about people having a negative experience asking questions on SO in later years where their question was immediately closed as, say, a duplicate when the question wasn't a duplicate. The moderator just didn't understand the difference. That sort of thing.

But any mature site ultimately ends with an impossible barrier to entry as newcomers don't know all the cultural rules that have been put in place and they tend to have a negative experience as they get yelled at for not knowing that Rule 11.6.2.7 forbids the kind of question they asked.

[1]: https://stackoverflow.com/users/18393/cletus

[2]: https://www.npr.org/2015/03/12/392568604/dont-you-dare-use-c...

lofaszvanitt • yesterday at 7:02 AM

Google also played a part. After a while, I noticed that for my programming related questions, almost no SO discussions showed up. When they did appear on the first page, they were usually abysmal and unusable for me.

When it started all kinds of very clever people were present and helped even with very deep and complex questions and problems. A few years later these people disappeared. The moderation was ok in the beginning, then they started wooing away a lot of talented people. And then the mods started acting like nazis, killing discussions, proper questions on a whim.

And then bots (?) or karma obsessed/farming people started to upvote batshit crazy, ridiculous answers, while the proper solution had like 5 upvotes and no green marker next to it.

It was already a cesspool before AI took over and they sold all their data. Initial purpose achieved.

kurtis_reed • yesterday at 1:11 AM

Moderation got worse over time

thih9 • yesterday at 2:30 PM

> What do LLMs train off of now?

Perhaps they’ll rely on what was used by people who answered SO questions. So: official docs and maybe source code. Maybe even from experience too, i.e. from human feedback and human written code during agentic coding sessions.

> The fact that the LLM doesn't insult you is just the cherry on top.

Arguably it does insult even more, just by existing alone.
