I just loaded the nytimes.com page as an experiment. The volume of tracking pixels and other ad non-sense is truly horrifying.
But at least in terms of the headline metric of bandwidth, it's somewhat less horrifying. With my ad-blocker off, Firefox showed 44.47 MB transferred. Of that, 36.30 MB was MP4 video. These videos were journalistic in nature (they were not ads).
So, yes in general, this is like the Hindenburg of web pages. But I still think it's worth noting that 80% of that headline bandwidth is videos, which is just part of the site's content. One could argue that it is too video heavy, but that's an editorial issue, not an engineering issue.
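For anyone who wants to reproduce this breakdown themselves: here's a minimal sketch that sums a HAR export from the browser's Network tab by top-level MIME type. The `nytimes.har` filename is hypothetical, and `_transferSize` is the non-standard field Firefox and Chrome use for on-the-wire bytes, so I fall back to the body size when it's missing.

```python
import json
from collections import defaultdict

def weigh_har(har: dict) -> dict:
    """Sum transferred bytes per top-level MIME type from a HAR capture."""
    totals = defaultdict(int)
    for entry in har["log"]["entries"]:
        resp = entry["response"]
        # _transferSize is what actually crossed the wire (browser exports);
        # fall back to the decoded body size when it's absent.
        size = resp.get("_transferSize") or resp["content"].get("size", 0)
        mime = resp["content"].get("mimeType", "unknown").split("/")[0]
        totals[mime] += max(size, 0)
    return dict(totals)

# Usage: export the Network tab as HAR, then:
# har = json.load(open("nytimes.har"))
# for mime, nbytes in sorted(weigh_har(har).items(), key=lambda kv: -kv[1]):
#     print(f"{mime:10s} {nbytes / 1e6:8.2f} MB")
```

On a capture like the one described above, the `video` bucket would dominate the report.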
These days the NYT is in a race to the bottom. I no longer even bother to bypass ads let alone read the news stories because of its page bloat and other annoyances. It's just not worth the effort.
Surely news outlets like the NYT must realize that savvy web surfers like yours truly, when encountering "difficult" news sites (those behind paywalls and/or bloated with megabytes of JavaScript), will just go elsewhere or load pages without JavaScript.
We'll simply cut the headline from the offending website, paste it into a search engine, and find another site with the same or similar info but easier access.
I no longer even think about it; by now my actions are automatic. Rarely do I find an important story that's limited to only one website. Generally dozens have the story, and because of syndication the alternative site one selects often even has identical text and images.
My default browsing is with JavaScript set to "off", and it's rare that I have to enable it (which I can do with just one click).
I never see ads on my Android phone or PC, and that includes YouTube. Disabling JavaScript on webpages nukes just about all ads; they just vanish, and any that slip through are trapped by other means. In short, ads are optional. (YouTube doesn't work sans JS, so just use NewPipe or PipePipe to bypass its ads.)
Disabling JavaScript also makes pages blindingly fast as all that unnecessary crap isn't loaded. Also, sans JS it's much harder for websites to violate one's privacy and sell one's data.
Do I feel guilty about skimming off info in this manner? No, not the slightest bit. If these sites played fair then it'd be a different matter but they don't. As they act like sleazebags they deserve to be treated as such.
Modern web dev is ridiculous. Most websites are an ad-ridden tracking hellscape. Seeing sites like HN, where every line of JS is taken seriously, is a godsend. Make the web less bloated.
My family's first broadband internet connection, circa 2005, came with a monthly data quota of 400 MB.
The fundamental problem of journalism is that the economics no longer works out. Historically, the price of a copy of a newspaper barely covered the cost of printing; the rest of the cost was covered by advertising. And there was an awful lot of advertising: everything was advertised in newspapers. Facebook Marketplace and Craigslist were a section of the newspaper, as was whichever website you check for used cars or real estate listings. Journalism had to be subsidised by advertising, because most people aren't actually interested enough in the news to pay the full cost of quality reporting; nowadays, the only newspapers that are thriving are those that aggressively target those who have an immediate financial interest in knowing what's going on: the Financial Times, Bloomberg, and so on.
The fact is that for most people, the news was interesting because it was new every day. Now that there is a more compelling flood of entertainment in television and the internet, news reporting is becoming a niche product.
The lengths that news websites are going to to extract data from their readers to sell to data brokers is just a last-ditch attempt to remain profitable.
I also use and like the comparison in units of Windows 95 installs (~40MB), which is also rather ironic in that Win95 was widely considered bloated when it was released.
While this article focuses on ads, it's worth noting that sites have had ads for a long time, but it's their obnoxiousness and resource usage that's increased wildly over time. I wouldn't mind small sponsored links and (non-animated!) banners, but the moment I enable JS to read an article and it results in a flurry of shit flying all over the page and trying to get my attention, I leave promptly.
This is just the tip of the iceberg. Don't get me started on airline websites (looking at you, Air Canada), where the product owners, designers, and developers can't get a simple workflow straight without loading megabytes of useless JavaScript and interrupting the user journey multiple times. Give me back a command-line terminal like Amadeus; that would be perfect.
How can we go back to a Web where websites are designed to be used by the user and not for the shareholders?
Allowing scripting on websites (in the mid-90s) was a completely wrong decision. And an outrage. Programs are downloaded to my computer and executed without me being able to review them first—or rely on audits by people I trust. That’s completely unacceptable; it’s fundamentally flawed. Of course, you disable scripts on websites. But there are sites that are so broken that they no longer work properly, since the developers are apparently so confused that they assume people only view their pages with JavaScript enabled.
It would have been so much better if we had simply decided back in the ’90s that executable programs and HTML don’t belong together. The world would be so much better today.
It's really hard to consider any kind of web dev as "engineering." Outcomes like this show that they don't have any particular care for constraints. It's throw-spaghetti-at-the-wall YOLO programming.
It's almost criminal that the article does not mention network-wide DNS blocklists as an obvious solution to this problem. I stop nearly 100% of ads in their tracks using the Hagezi ultimate list, and run uBlock on desktop for cosmetic filtering and YouTube.
I should really run some tests to figure out how much lighter the load on my link is thanks to the filter.
I also manually added some additional domains to the blocklist (mostly Google and Adobe font hosts) to further reduce load and improve privacy.
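For anyone curious how these network-wide blocklists actually work: with Pi-hole or AdGuard Home you just paste in the list URL, but the matching logic underneath is simple enough to sketch. This is a minimal illustration (not any project's actual implementation); a query is blocked if its name or any parent domain appears on the list, which is how a single `doubleclick.net` entry catches every tracking subdomain.

```python
def load_blocklist(lines):
    """Parse a domain-per-line blocklist (comments start with '#')."""
    blocked = set()
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            blocked.add(line.lower())
    return blocked

def is_blocked(qname, blocked):
    """A query is blocked if it, or any parent domain, is on the list."""
    labels = qname.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in blocked for i in range(len(labels)))

# Usage with a downloaded list:
# blocked = load_blocklist(open("hagezi-ultimate.txt"))
# is_blocked("stats.g.doubleclick.net", blocked)  # parent-domain match
```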
This is why people continue to lament Google Reader (and RSS in general): it was a way to read content on your own terms, without getting hijacked by ads.
This rubbish also exists disproportionately for recipe pages/cooking websites as well.
You have 20 ads scattered around, an autoplaying video of some random recipe/ad, 2-3 popups asking you to subscribe or buy some affiliated product, then the author's life story, and then a story ABOUT the recipe, all before I am able to see the detailed recipe in the proper format.
It's second nature for me to open all these websites in reader mode at this point.
The article says "I don't know where this fascination with getting everyone to download your app comes from."
The answer is really simple and follows from this article: the purpose of the app is even more privacy violation and tracking.
Even enterprise COTS products can have some of these issues. We have an on-premises Atlassian suite, and Jira pages sometimes have upwards of 30 MB of total payload for loading a simple user story page; and keep in mind there is no ad-tech or other nonsense going on here, it's just pure page content.
I remember in 2008, when Wizards of the Coast re-launched the official Dungeons & Dragons website to coincide with the announcement of the fourth edition rules. The site was something in the region of 4 MB, plus a 20 MB embedded video file. A huge number of people were refreshing the site to see what the announcement was, and it was completely slammed. Nobody could watch the trailer until they uploaded it to YouTube later.
4 MB was an absurd size for a website in 2008. It's still an absurd size for a website.
This site more or less practices what it preaches. `newsbanner.webp` is 87.1 KB (downloaded and saved; the Network tab in Firefox may report a few times that, and I don't know why); the total image size is less than a meg, and then there's just 65.6 KB of HTML and 15.5 KB of CSS.
And it works without JavaScript... but there does appear to be some tracking stuff: a deferred call out to Cloudflare (a hit counter, I think?) and some inline stuff at the bottom that defers some local CDN thing the old-fashioned way. NoScript catches all of this, and I didn't feel like allowing it in order to weigh it.
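If you want to tally a page's weight without trusting the Network tab's numbers, the static part is easy to script. Here's a rough sketch that collects the asset URLs a page pulls in (images, scripts, stylesheets); the fetching step in the comment uses HEAD requests and `Content-Length`, which some servers omit, so treat the total as a lower bound.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class AssetCollector(HTMLParser):
    """Collect the URLs a page would pull in: images, scripts, stylesheets."""
    def __init__(self, base):
        super().__init__()
        self.base, self.assets = base, []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.assets.append(urljoin(self.base, attrs["src"]))
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.assets.append(urljoin(self.base, attrs["href"]))

# Usage: fetch the page HTML, feed it in, then HEAD each asset and sum
# the Content-Length headers, e.g. with
#   urllib.request.urlopen(urllib.request.Request(url, method="HEAD"))
```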
When working at the BBC in the late 90s, the ops team would start growling at you if a site's home page was over 70kb...
Only major media can get away with this kind of bloat. For the normal website, Google would never include you in the SERPs even if your page is a fraction of that size.
Let's play a fun prediction game: I ask HN readers, what will the page size of NYTimes.com be in 10 years? Or 20 years?
Want to bet 100 MB? 1 GB? Is it unthinkable?
20 years ago, a 49 MB home page was unthinkable.
I think it's a GOOD thing, actually, because all these publications are dying anyway. And even if you filter out all the ad and surveillance trash, you are left with trash propaganda and brain-rot content. Why even make the effort of extracting the actual text from some "journalist" at these propaganda outlets? It's not even worth it.
If people tune out simply because of how horrible the sites are, good.
Rule #1 is to always give your JS devs only Core 2 Quad CPUs + 16 GB of RAM.
They won't be able to complain about low memory, but their experience will be terrible every time they try to shove something horrible into the codebase.
I hate this trend of active distraction. Most blogs have a popup asking you to subscribe as soon as you start scrolling.
It’s as if everyone designed their website around the KPI of irritating your visitors and getting them to leave ASAP.
And the NYT web team was praised as one of the best in the world some (many?) years ago.
A 49 MB web page? Try a 45 MB GraphQL response.
uBlock Origin helps mitigate at least a little bit here.
I worked at big newspapers as a software engineer. Please do not blame the engineers for this mess. As the article says news is in a predicament because of the ads business model. Subscriptions alone usually cannot cover all costs and ads will invariably make their way in.
For every 1 engineer it seems like there are 5 PMs who need to improve KPIs somehow and thus decide auto playing video will improve metrics. It does. It also makes people hate using your website.
I would constantly try to push back against the bullshit they'd put on the page but no one really cares what a random engineer thinks.
I don't think there's any real way to solve this unless we either get less intrusive ad tech or news gets a better business model. Many sites don't even try with new business models, like local classifieds or local job boards. And good luck getting PMs to listen to an engineer talking about these things.
For now, the bloat remains.
Maybe I'm just getting old, but I've gotten tired of these "Journalists shouldn't try to make their living by finding profitable ads, they should just put in ads that look pretty but pay almost nothing and supplement their income by working at McDonalds" takes.
Our developers once managed to rack up around 750 MB per open website.
They put in a ticket with ops saying the server was slow and asking us to look at it. So we looked. Every single video on a page with a long video list pre-loaded part of itself. The only reason the site didn't run like shit for them is that the office had direct fiber to our datacenter a few blocks away.
We really shouldn't allow web developers more than 128 kbit/s of connection speed; anything more and they just make nonsense out of it.
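The video pre-loading above has a one-attribute fix: browsers default to fetching at least metadata (often more) for every `<video>` tag, so a long list needs an explicit `preload="none"` on each one. A quick lint sketch, assuming plain HTML pages, that flags the offending tags:

```python
import re

def videos_that_preload(html: str) -> list:
    """Flag <video> tags that let the browser fetch media before playback.

    Browsers default to preload="metadata" or more aggressive behavior,
    so on a page with a long video list only an explicit preload="none"
    avoids fetching part of every file up front."""
    offenders = []
    for tag in re.findall(r"<video\b[^>]*>", html, flags=re.I):
        m = re.search(r'preload\s*=\s*["\']?(\w+)', tag, flags=re.I)
        if not m or m.group(1).lower() != "none":
            offenders.append(tag)
    return offenders
```

Running this over the page templates would have caught the 750 MB surprise before ops did.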