In all fairness, running modest to large MediaWiki instances isn't easy. There are a lot of things that aren't immediately obvious:
- For anything complex or large enough you have to set `$wgMiserMode`, otherwise operations take way too long and start timing out.
- You have to set `$wgJobRunRate` to 0, or a bunch of requests will start stalling when they get assigned an expensive task that eats a lot of memory. Then you need to set up a separate job runner in the background, which can consume a decent amount of memory itself. There is nowadays a Redis-based job queue, but there doesn't seem to be a whole lot of documentation for it.
- Speaking of Redis, setting up Redis or Memcached for caching is a pretty good idea too; it especially helps with really complicated pages. (A combined `LocalSettings.php` sketch for all three points follows this list.)
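For concreteness, here's roughly what all of that looks like in `LocalSettings.php`. This is a rough sketch, not drop-in config: it assumes Memcached and Redis both run on localhost, and the exact shape of `$wgJobTypeConf` has shifted across releases, so check the docs for your MediaWiki version.

```php
<?php
// LocalSettings.php excerpt -- a sketch, assuming Memcached on :11211
// and Redis on :6379, both local. Verify against your MW version.

// Disable expensive features (costly special pages and queries) so
// big-wiki operations stop timing out.
$wgMiserMode = true;

// Never run jobs inside web requests; a random page view shouldn't
// get stuck re-rendering thousands of pages after a template edit.
$wgJobRunRate = 0;

// Cache parser output, sessions, etc. in Memcached.
$wgMainCacheType    = CACHE_MEMCACHED;
$wgParserCacheType  = CACHE_MEMCACHED;
$wgMemCachedServers = [ '127.0.0.1:11211' ];

// The sparsely documented Redis-based job queue mentioned above.
$wgJobTypeConf['default'] = [
    'class'       => 'JobQueueRedis',
    'redisServer' => '127.0.0.1:6379',
    'redisConfig' => [],
    'claimTTL'    => 3600,
    // Recent versions expect the external redisJobRunnerService daemon.
    'daemonized'  => true,
];
```

Either way, with `$wgJobRunRate` at 0 you need something external actually draining the queue: the Redis runner daemon, or with the default DB-backed queue a cron/systemd loop around `php maintenance/runJobs.php`.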
Even to this day, running a wiki with a steady baseline of requests per second is kind of hard. I actually like MediaWiki because it's very practical and extensible, but I also know in my heart that it's a messy piece of software that could make much better use of the machine it's running on.
The cost of running a wiki has gone down over time in my experience though, especially if you are running things as slim as possible. A modest DigitalOcean machine can handle a fair bit of traffic, and if you wanted to scale up you'd get quite a boost by going to a lower-end dedicated box like one of the OVHcloud Rise SKUs.
If anyone is trying to do this, I have a DigitalOcean pro-tip: don't use the Premium Intel boxes. The Premium AMD boxes are significantly faster for the money.
One trap I fell into was thinking it might be a good idea to throw this on a hyperscaler, you know, Google Cloud or something. While it does simplify operations, that'll get you right into "thousands of dollars per month" territory without even having that much traffic...
At one point in history I actually felt like Wikia/Fandom was a good offering, because they could handle all of this for you. It didn't start out as a bad deal...
A lot of this can be solved by putting (micro)caching in front of your wiki. Almost all non-logged-in requests shouldn't be hitting PHP at all.
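On the MediaWiki side, that mostly means telling it a cache sits in front, so it emits cache-friendly headers for anonymous traffic and purges the cache when pages change. A minimal sketch, assuming a reverse-proxy cache (Varnish, nginx `proxy_cache`, etc.) on the same host; these are the MediaWiki 1.34+ setting names, and older releases spell them `$wgUseSquid` / `$wgSquidServers` / `$wgSquidMaxage`:

```php
<?php
// LocalSettings.php excerpt -- sketch for fronting the wiki with a
// local reverse-proxy cache.

// Tell MediaWiki a CDN/cache sits in front of it.
$wgUseCdn = true;

// Proxies trusted for X-Forwarded-For, and the targets of the HTTP
// PURGE requests MediaWiki sends when a page is edited.
$wgCdnServers = [ '127.0.0.1' ];

// Maximum age (seconds) the cache may serve an anonymous page view.
$wgCdnMaxAge = 3600;
```

Logged-in users carry session cookies, so the proxy passes them through to PHP; everyone else gets HTML straight out of the cache.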
Have any of Intel's server offerings been "premium" since EPYC hit the scene?
I just assumed they were still there based on momentum.
This is so true.
I adopted MediaWiki to run a knowledge base for my organization at Microsoft ( https://microsoft.github.io/code-with-engineering-playbook/I... ).
As I was exploring self-hosting options that would scale to our org's size, it turned out there was already an internal team running a company-wide multi-tenant MediaWiki PLATFORM.
So I hit them up and a week later we had a custom instance and were off to the races.
Almost all the work that team did was making MediaWiki hyper-efficient with caching and cache generation, along with a lot of plumbing for shared infra (AD auth, semi-trusted code repos, etc.) that still allowed all of us “customers” to implement whatever wacky extensions and templates we needed.
I still hope that one day Microsoft will acknowledge that they use MediaWiki internally (and to great effect) and open-source the whole stack, or at least offer it as a hosted platform.
I tried setting up a production instance at my next employer, and we ended up using Confluence; it was like going back to the dark ages. But I couldn't make any reasonable financial argument against it: it would have taken a huge lift to get a vanilla MW instance integrated into the enterprise IT environment.