I spent the last 14 days chasing an issue with a Spark transform. Gemini and Claude were exceptionally good at giving me answers that looked perfectly reasonable, but none of them worked; they were almost always completely off track.
Eventually I tried something else and found a question on Stack Overflow, luckily with an answer. That was the game changer, and eventually I was able to find the right doc on the Spark (actually Iceberg) website that gave me the final fix.
This is to say that LLMs might be more friendly. But losing SO means that we're getting a friendly idiot with a lot of credible but wrong answers in place of a grumpy and possibly toxic guy who, however, actually answered our questions.
Not sure why someone is thinking this is a good thing.
Not a big surprise once LLMs came along: stack overflow developed some pretty unpleasant traits over time. Everything from legitimate questions being closed for no good reason (or being labeled a duplicate even though they often weren’t), out of date answers that never get updated as tech changes, to a generally toxic and condescending culture amongst the top answerers. For all their flaws, LLMs are so much better.
They will no doubt blame this on AI, somehow (ChatGPT release: late 2022, decline start: mid 2020), instead of the toxicity of the community and the site's goals of being a knowledgebase instead of a QA site despite the design.
PS - This comment is closed as a [duplicate] of this comment: https://news.ycombinator.com/item?id=46482620
Actual analyst here who has looked at this graph like... a lot, so let me contextualize certain themes that tend to crop up from these:
- The reduction of questions over time is in the nature of SO. When you have a library of every question asked, at some point you have asked most of the easy questions. Having a novel question becomes hard.
- This graph is using the Posts table, not PostsWithDeleted. So it only tells you about the questions that have survived up to this point in time. This [0] is the actual graph; while it describes a curve that shows the same behavior, it's more "accurate" about actual post creation.
- This is actually a Good Thing™. For years most of the questions went unanswered, non-voted, non-commented, just because there were too many questions happening all the time. So the general trend is not something that the SO community needs to do anything about. Almost 20% of all questions asked are marked as duplicates. If people searched... better™ they wouldn't ask as many questions, and everyone else would have more bandwidth to deal with the rest.
- There has been a shift in help-desk-style requests, where people have started to prefer Discord and such to get answers. This is actually a bad thing because it means that the knowledge is neither public nor indexed by the world. So information becomes harder to find, and you need to break it free from silos.
- The site, or more accurately, the library will never die. All the information is published in complete archives that anyone can replicate and restart if the company goes under or goes evil. So, yeah, such concerns, while appreciated, are easily addressed. At worst, you would be losing a month or two of data.
[0]: https://data.stackexchange.com/stackoverflow/query/edit/1926...
I guess I'm the only one that was a fan of SO's moderation. I never got too deep into it (answered some TypeScript questions). But the intention to reduce duped questions made a lot of sense to me. I like the idea of a "living document" where energy is focused on updating and improving answers to old versions of the same question. As a user looking for answers it means I can worry less about finding some other variation of the same question that has a more useful answer
I understand some eggs got cracked along the way to making this omelette but overall I'd say about 90% of the time I clicked on a SO link I was rewarded with the answer I was looking for.
Just my two cents
Those saying that StackOverflow became toxic are absolutely correct. But we should not let that be its legacy. It is IMO still today one of the greatest achievements in terms of open data on the internet. And its impact on making programming accessible to a large audience cannot be overstated.
I once published a method for finding the closest distance between an ellipse and a point on SO: https://stackoverflow.com/questions/22959698/distance-from-g...
I consider it the most beautiful piece of code I've ever written and perhaps my one minor contribution to human knowledge. It uses a method I invented, is just a few lines, and converges in very few iterations.
People used to reach out to me all the time with uses they had found for it; it was cited in a PhD and apparently lives in some collision plugin for Unity. Haven't heard from anyone in a long time.
It's also my test question for LLMs, and I've yet to see my solution regurgitated. Instead they generate some variant of Newton's method; ChatGPT 5.2 gave me an LM implementation and acknowledged that Newton's method is unstable (it is, which is why I went down the rabbit hole in the first place).
Today I don't know where I would publish such a gem. It's not something I'd bother writing up in a paper, and SO was the obvious place where people who wanted an answer to this question would look. Now there is no central repository; instead everyone individually summons the ghosts of those passed in loneliness.
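For anyone who hasn't met the ellipse problem mentioned above, here is a minimal sketch of one way to get a stable answer. To be clear, this is not the commenter's method (that lives in the linked SO answer) and it is not Newton's method either. It is plain bisection on the derivative of the squared distance, assuming an axis-aligned ellipse x^2/a^2 + y^2/b^2 = 1 centered at the origin. The function names are made up for illustration. It converges much more slowly than a purpose-built iteration, but within the first quadrant the derivative changes sign at most once, so bisection cannot diverge.

```python
import math


def closest_point_on_ellipse(a, b, px, py, iterations=60):
    """Closest point on x^2/a^2 + y^2/b^2 = 1 to (px, py), found by bisection.

    Hypothetical helper for illustration; assumes a > 0 and b > 0.
    """
    # By symmetry, solve for the mirrored point in the first quadrant
    # and restore the signs at the end.
    qx, qy = abs(px), abs(py)
    lo, hi = 0.0, math.pi / 2
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        s, c = math.sin(mid), math.cos(mid)
        # Half the derivative of the squared distance from (qx, qy) to
        # (a*cos(t), b*sin(t)) with respect to t. On (0, pi/2) it goes
        # from negative to positive at most once.
        g = (b * b - a * a) * s * c + a * qx * s - b * qy * c
        if g < 0:
            lo = mid  # distance still decreasing: minimum is further along
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    x = math.copysign(a * math.cos(t), px)
    y = math.copysign(b * math.sin(t), py)
    return x, y


def distance_to_ellipse(a, b, px, py):
    """Euclidean distance from (px, py) to the ellipse with semi-axes a, b."""
    x, y = closest_point_on_ellipse(a, b, px, py)
    return math.hypot(px - x, py - y)
```

Something like `distance_to_ellipse(2, 1, 3, 0)` should come out very close to 1, since the nearest point is the vertex at (2, 0). Sixty halvings of the interval is more than double precision can resolve, which is the price paid for not having to worry about a starting guess.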
Many users left because of the overly strict moderation of posted questions. I have 6k reputation and multiple gold badges, and honestly I will remember StackOverflow as a hostile place to ask questions. There were multiple occasions when they actually prevented me from asking, and it was hard to understand what exactly went wrong. To my understanding, I asked totally legit questions, but their asking policy is so strict that it's super hard to follow.
So "I'm not happy he's dead, but I'm happy he's gone" [x]
As one of my good friends pointed out back in 2012, most people don't know how to ask questions[0].
I'm feeling a bit sorry for zahlman in the comment section here; they're doing a good job of defending SO to a comment section that seems to want SO to bend to their own whims, no matter what the stated aims and goals of SO really were. There do seem to be a lot of people in the comments here who wanted SO to be a discussion site, rather than the Q&A site that it set out to be.
I do think it's very unfair of many of you who are claiming SO was hostile or that they unfairly closed questions without bringing the citations required. I'm not saying at all that SO was without its flaws in leadership, moderators, community or anything else that made the site what it was. But if you're going to complain, at least bring examples, especially when you have someone here you could hold somewhat accountable.
The problem is, you still see a lot of it today, whether it's in IRC channels, Discord chats, StackOverflow or GitHub issues. People still don't know how to ask questions:
* [1]
* [2]
* [3]
[0]: https://blog.adamcameron.me/2012/12/need-help-know-how-to-as...
[1]: https://github.com/swagger-api/swagger-ui/issues/10670
[2]: https://github.com/swagger-api/swagger-ui/issues/10649
[3]: https://github.com/usebruno/bruno/issues/6515
I feel like I'm taking crazy pills, reading some of these comments.
It looks like a pretty clear divide between the people that wanted to ask questions to get solutions for their own specific problems; and those who were aware of what the site wanted to be and how it actually operated, and were willing to put in the time and answer questions, etc.
The sheer amount of garbage that used to get posted every day required some pretty heavy moderation. Most of it was not by actual moderators, it was by high-reputation users.
(I have 25K reputation on StackOverflow, and was most active between 2011 and 2018.)
The graph is scary, but I think it's conflating two things:
1. Newbies asking badly written basic questions, barely allowed to stay, and answered by hungry users trying to farm points, never to be re-read again. This used to be the vast majority of SO questions by number.
2. Experienced users facing a novel problem, asking questions that will be the primary search result for years to come.
It's #1 that's being cannibalized by LLMs, and I think that's good for users. But #2 really has nowhere else to go; ChatGPT won't help you when all you have is a confusing error message caused by the confluence of three different bugs between your code, the platform, and an outdated dependency. And LLMs will need training data for the new tools and bugs that are coming out.
While AI might have amplified the end, the drop-off preceded significant AI usage for coding.
So some possible reasons:
- Success: all the basic questions were answered, and the complex questions are hard to ask.
- Ownership: In its heyday, projects used SoF for their support channel because it meant they didn't have to answer twice. Now projects prefer to isolate dependencies to GitHub and not lose control over messaging to over-eager users.
- Incentives: Good SoF karma was a distinguishing feature in employment searches. Now it wouldn't make a difference, and is viewed as being too easy to scam.
- Demand: Fewer new projects. We're past the days of JavaScript and devops churn.
- Community: tight job markets make people less community-oriented
Some non-reasons:
- Competition (aside from AI at the end): SoF pretty much killed the competition in that niche (kind of like craigslist).
Seems like the exit was very well timed.
https://meta.stackoverflow.com/questions/408138/what-will-ha...
The obvious culprits here are the LLMs, but I do wonder whether GitHub's social features, despite their flaws, have given developers fewer reasons to ask questions on SO?
Speaking from experience, every time I hit a wall with my projects, I would instinctively visit the project's repo first, and check the issues / discussions page. More often than not, I was able to find someone with an adjacent problem and get close enough to a solution just by looking at the resolution. If all that failed, I would fall back to asking questions on the discussion forum first before even considering visiting SO.
Somewhere out there, there's an alternate universe in which the Stackoverflow community was so friendly, welcoming, helpful, and knowledgeable that this seems like a tragedy and motivates people to try to save it.
But in this universe, most people's reaction is just "lol".
Lots of the comments here are attributing the decline to a toxic community or overly-strict moderation, but I don't think that that is the main reason. The TeX site [0] is very friendly and has somewhat looser moderation, yet it shows the exact same decline [1].
This is horrifying.
Given the fact that when I need a question answered I usually refer to S.O., but more recently have taken suggestions from LLMs that were obviously trained on S.O. data...
And given the fact that all other web results for "how do you change the scroll behavior on..." or "SCSS for media query on..." all lead to a hundred fake websites with pages generated by LLMs based on old answers.
Destroying S.O. as a question/answer source leaves only the LLMs to answer questions. That's why it's horrific.
I used to contribute a ton to Stack Overflow at the beginning in 2009 and 2010 and then stopped cold turkey. One of the senior product execs emailed me to see what turned me off.
What killed it for me was community moderation. People who cannot contribute with quality content will attempt to contribute by improperly and excessively applying their opinion of what is allowed.
Unfortunately it happens to every online technical community once they become popular enough. I even see it happening on HN.
As someone that spent a fair bit of time answering questions on StackOverflow, what stood out years ago was how much the same thing would be asked every day. Countless duplicates. That has all but ceased with LLMs taking all that volume. Honestly, I don't think that's a huge loss for the knowledge base.
The other thing I've noticed lately is a strong push to get non-programming questions off StackOverflow, and on to other sites like SuperUser, ServerFault, DevOps, etc.
Unfortunately, what's left is so small I don't think there's enough to sustain a community. Without questions to answer, contributors providing the answers disappear, leaving the few questions there often unanswered.
I joined Stackoverflow early on since it had a prevalence towards .NET and I’ve been working with Microsoft web technologies since the mid 90’s.
My SO account is coming up to 17 years old and I have nearly 15,000 points, 15 gold badges, including 11 famous questions and similar famous answer badges, also 100 silver and 150 bronze. I spent far too much time on that site in the early days, but through it, I also thoroughly enjoyed helping others. I also started to publish articles on CodeProject and it kicked off my long tech blogging “career”, and I still enjoy writing and sharing knowledge with others.
I have visited the site maybe once a year since 2017. It got to the point that trying to post questions was intolerable, since they always got closed. At this point I have given up on it as a resource, even though it helped me tremendously to both learn (to answer questions) and solve challenging problems, and get help for edge cases, especially on niche topics. For me it is a part of my legacy as a developer for over 30 years.
I find it deeply saddening to see what it has become. However I think Joel and his team can be proud of what they built and what they gave to the developer community for so many years.
As a side note, it used to state that I was in the top 2% of users on SO, but this metric seems to have been removed. Maybe it’s just because I’m on mobile that I can’t see it any more.
LLMs can easily solve those easy problems that have high commonality across many codebases, but I am dubious that they will be able to solve the niche challenging problems that have not been solved before nor written about. I do wonder how those problems get solved in the future.
I do use Claude a lot, but I still regularly ask questions on https://bioinformatics.stackexchange.com/. My questions are often just too niche: LLMs hallucinate stuff like an entire non-existent benchmarking feature in Snakemake, or can't explain how I should get transcriptome aligners to give me correct quantifications for a transcript. And as a lonely bioinformatician it can be nice to get confirmation from other bioinformaticians.
Looking back at my Stack Exchange/Stack Overflow (never really got the difference) history, my earlier, more general programming questions from when I just started are all no-brainers for any LLM.
StackOverflow was a pub where programmers had fun while learning programming. The product of that fun was valuable.
Instead of cultivating the pub, the owners demanded that the visitors be safe, boring and obedient writers of value. This killed the pub and with it the business.
The most visible aspect was the duplicate close. Duplicate closes scare away fresh patrons, blocking precisely the path that old timers took when they joined. And duplicates allow anyone with a grudge to take revenge. After all, there are no new questions, and you will always find a duplicate if you want to.
To create a new Stack Overflow, create a pub where programmers enjoy drinking a virtual beer, and the value will appear by itself.
I used to joke that when SO goes under, I will move professions. The joke came from my experience of how many common issues in technology could not be solved with knowledge found via a search engine. I don’t see that niche as gone, so I wonder what is satisfying that requirement such that new questions do not show up at SO?
I was tasked to add OpenOffice's hyphenation lib to our software at work back in 2010 when I was a junior dev. I had to read the paper and the C code/documentation to understand how it works but got stuck in one particular function.
It was such an obscure thing (compared to web dev stuff) that I couldn't find anything on Google.
Had no choice but to ask on Stackoverflow and expected no answers. To my surprise, I got a legit answer from someone knowledgeable, and it absolutely solved my problem at the time. (The function has to do with the German language, which was why I didn't understand the documentation.)
It was a fond memory of the site for me.
Interestingly, stagnation started around 2014 (the number of questions asked stopped rising), and a visible decline started in 2020 [1]: two years before ChatGPT launched!
It’s an interesting question if the decline would have happened regardless of LLMs, just slower?
[1] An annotated visualization of the same data I did: https://blog.pragmaticengineer.com/are-llms-making-stackover...
A lesson can be learned here. If you don't introduce some form of accountability for everyone that influences the product, it eventually falls apart. The problem, as we all know now, is that the moderators screwed things up, and there were no guardrails in place to stop them from killing the site. A small number of very unqualified moderators vandalized the place and nobody with common sense stepped in to put an end to it.
The decline is not surprising. I am sure AI is replacing Stackoverflow for a lot of people. And my experience with asking questions was pretty bad. I asked a few very specific questions about some deep detail in Windows and every time I got only some smug comments about my stupid question or the question got rejected outright. All that while a ton of beginner questions were approved. Definitely not a very inviting club. I found I got better responses on Reddit.
This is a huge loss.
In the past people asked questions of real people who gave answers rooted in real use. And all this was documented and available for future learning. There was also a beautiful human element to think that some other human cared about the problem.
Now people ask questions of LLMs. They churn out answers from the void, sometimes correct but not rooted in real life use and thought. The answers are then lost to the world. The learning is not shared.
LLMs have been feeding on all this human interaction and simultaneously destroying it.
LLMs caused this decline. Stop denying that. You don't have to defend LLMs from any perceived blame. This is not a bad thing.
The steep decline in the early months of 2023 actually started with the release of ChatGPT, which is 2022-11-30, and its gradually widening availability to (and awareness of) the public from that date. The plot clearly shows that cliff.
The gentle decline since 2016 does not invalidate this. Were it not for LLMs, the site's post rate would now probably be at around 5000 posts/day, not 300.
LLMs are to "blame" for eating all the trivial questions that would have gotten some nearly copy-pasted answer by some eager reputation points collector, or closed as a duplicate, which nets nobody any rep.
Stack Overflow is not a site for socializing. Do not mistake it for reddit. The "karma" does not mean "I hate you", it means "you haven't put the absolute minimum conceivable amount of effort into your question". This includes at least googling the question before you ask. If you haven't done that, you can't expect to impose on the free time of others.
SO has a learning curve. The site expects more from you than just to show up and start yapping. That is its nature. It is "different" because it must be. All other places don't have this expectation of quality. That is its value proposition.
Some commenters suggest it's not the moderation. I think it is the key problem, and the alternative communities were the accumulated effect. Bad questions and tough answer competition are part of it, but moderation was more important, I think. Because in the end what kept SO relevant was that people asked their own questions on up-to-date topics.
Up until the mid-2010s you could ask a seriously vague question, and it would be answered, satisfactorily or not. (2018 was when I asked the last such question. YMMV.) After that, almost everything that didn't have a snap-on code answer was labelled as off-topic or duplicate, and closed, no matter what. (A couple of times I got very rude moderators' comments on the tickets.)
I think this led some communities to avoid this moderator hell and start their own forums, where you could afford civilized discussion. Discourse is actually very handy for this (ironically, it was made by the same devs that created SO). Forums of the earlier generation have too many bells and whistles, and outdated UI. Discourse has much less friction.
Then, as more quality material accumulated elsewhere, newbies stopped seeing SO on top of search, and gradually language/library communities churned off one by one. (AI and other summaries probably did contribute, but I don't think they were the primary cause.)
SO has lost against LLMs because it has insistently positioned itself as a knowledge base rather than a community. The harsh moderation, strict content policing, forbidden socialization, lack of follow mechanics etc have all collectively contributed to it.
They basically made a bet because they wanted to be the full antithesis of ad-ridden garbage-looking forums. Pure information, zero tolerance for humanity, sterile looking design.
They achieved that goal, but in the end, they dug their own grave too.
LLMs didn’t admonish us to write our questions better, or simply because we asked for an opinion. They didn’t flag or remove our posts with no advance notice. They didn’t forbid us to say hello or thanks; they welcomed it. They didn’t complain when we asked something that was asked many times. They didn’t prevent us from deleting our own content.
Oh yeah, no wonder nobody bothers with SO anymore.
It’s a good lesson for the future.
One thing you won’t get with an LLM is genuine research. I once answered a 550 point question by researching the source code of vim to see how the poster’s question could be resolved. [0]
[0] https://stackoverflow.com/questions/619423/backup-restore-th...
I recall when they disabled the data export a few years ago [0], March 2023. They almost certainly did this in response to the metrics they were seeing, but it accelerated the decline [1].
[0] https://meta.stackexchange.com/questions/389922/june-2023-da...
[1] https://data.stackexchange.com/stackoverflow/query/edit/1926...
Do I read that correctly — it is close to zero today?!
I used to think SO culture was killing it but it really may have been AI after all.
SO was built to disrupt the marriage of Google and Experts Exchange. EE was using dark patterns to sucker unsuspecting users into paying for access to a crappy Q&A service. SO wildly succeeded, but almost 20 years later the world is very different.
So the question for me is how important was SO to training LLMs? Because now that SO is basically no longer being updated, we've lost the new material to train on? Instead, we need to train on documentation and other LLM output. I'm no expert on this subject but it seems like the quality of LLMs will degrade over time.
One factor I haven't seen mentioned is the catastrophic decline in quality of Google search. That started pre-LLM and now the site is almost unusable for searching the web. You can access something you know exists and you know where it exists, but to actually search..?
Most SO users are passive readers who land there using search, but these readers are also the feed of new active users. Cut off the influx, and the existing ones will be in decline (the moderation just accelerates it).
SO peaked long, long before LLMs came along. My personal experience is that GitHub issues took over.
You can clearly see the introduction of ChatGPT in late 2022. That was the final nail in the coffin.
I am still really glad that Stack Overflow saved us from experts-exchange.com - or “the hyphen site” as it is sometimes referred to.
As everyone is saying, it was already down-trending before AI, and probably the traffic of Experts Exchange and whatever came before looks similar.
Also not sure exactly when they added the huge popup[0] that covers the answer (maybe only in Europe as it's about cookies?) but that's definitely one of the things that made me reach for other links by default instead of SO.
Here’s how SO could still be useful in the LLM era:
User asks a question, LLM provides an immediate answer/reply on the forum. But real people can still jump into the conversation to add additional insights and correct mistakes.
If you’re a user that asks a duplicate question, it’ll just direct you to the good conversation that already happened.
A symbiosis of immediate usually-good-enough LLM answers PLUS human-generated content that dives deeper and provides reassurance about correctness.
For this occasion, I just logged in to my SO profile; I've been a member for 9 years now.
To me, back when I started out learning web dev, as a junior with no experience and barely knowing anything, SO seemed like a paradise for programmers. I could go on there and get unblocked for the complex (but trivial for experts) issues I was facing. Most of the questions I initially posted were closed either as duplicates or as "not good enough," which really discouraged me. I wasn't learning anything by being told, "You did it wrong, but we're also not telling you how you could do it better." I agree with the first part; I probably sucked at writing good questions and searching properly. I think making mistakes is just part of the process, but SO did not make it better for juniors, at least when it came to giving proper guidance to those who "sucked".
StackExchange forgot who made them successful long ago. This is what they sowed. I don't have any remorse, only pity.
When Hans Passant (OGs will know) left, followed by SE doing literally nothing, that was the first clue for me personally that SE stopped caring.
That said, it is a bit shocking how close to zero it is.
The corresponding answers graph: https://data.stackexchange.com/stackoverflow/query/1927992/a...
Good riddance. There were some ok answers there, but also many bad or obsolete answers (leading to scrolling down to find the low-ranked answer that sort of worked), and the moderator toxicity was just another showcase of human failure on top of that. It selected for assholes because they thought they had a captive, eternally renewing audience that did not have any alternative.
And that resulted in the chilling effect of people not asking questions because they didn't want to run the moderation gauntlet, so the site's usefulness went even further down. It's still much less useful for recent tech than it is for ancient questions about parsing HTML with regex and that sort of thing.
LLMs are simply better in every way, provided they are trained on decent documents. And if I want them to insult me too, just for that SO nostalgia, I can just ask them to do that and they will oblige.
Looking forward to forgetting that site ever existed, my brain's health will improve.
Stackoverflow is like online gaming--lots of toxic people, but I still get value out of it. Ignore the toxic people, get your questions answered and go home to your family with your paycheck.
Stackoverflow bureaucracy and rule mongering are insane. I recommend participation just to behold the natives in their biome. It's like a small European Union laser-focused on making asking and answering a question the largest pain point of a site that is mainly about asking and answering questions.
AI is a vampire. Coming to your corner of the world, to suck your economic blood, eventually. It’s hard to ignore the accelerated decline that started in late 2022/early 2023.
I have a SO profile and I both contributed and used the site for some time.
I use the site from time to time to research something. I know a lot more about software than 15 years ago.
I used to ask questions and answer questions a lot, but after I matured I have no time and whatever I earn is not worth my time.
So perhaps the content would grow in size and quality if they rewarded users with something besides XP.
I don't use AI for research so far. I use AI to implement components that fit my architecture and often tests of components.
LLMs absolutely body-slammed SO, but anyone who was an active contributor knows the company was screwing over existing moderators for years before this. The writing was on the wall.
Some comments:
- This is a really remarkable graph. I just didn't realize how thoroughly it was over for SO. It stuns me as much as when Encyclopædia Britannica stopped selling print versions a mere 9 years after the publication of Wikipedia, but at an even faster timescale.
- I disagree with most comments that the brusque moderation is the cause of SO's problems, though it certainly didn't help. SO has had poor moderation from the beginning. The fundamental value proposition of SO is getting an answer to a question; if you can get the same answer faster, you don't need SO. I suspect that the gradual decline, beginning around 2016, is due to growth in a number of other sources of answers. Reddit is kind of a dark horse here, as I began seeing answers on Google to more modern technical questions link to a Reddit thread frequently along with SO from 2016 onwards. I also suspect Discord played a part, though this is harder to gauge; I certainly got a number of answers to questions for, e.g., Bun, by asking around in the Bun Discord, etc. The final nail in the coffin is of course LLMs, which can offer a SO-level answer to a decent percentage of questions instantly. (The fact that the LLM doesn't insult you is just the cherry on top.)
- I know I'm beating a dead horse here, but what happens now? Despite the stratification I mentioned above, SO was by far the leading source of high quality answers to technical questions. What do LLMs train off of now? I wonder if, 10 years from now, LLMs will still be answering questions that were answered in the halcyon 2014-2020 days of SO better than anything that came after? Or will we find new, better ways to find answers to technical questions?