Hacker News

jchw 10/10/2024

In my experience this hasn't been necessary yet on anything I've run. I know WMF wikis run Varnish or something, but personally I'm trying to keep costs and complexity minimal. To that end, more caching isn't always desirable, because RAM is at a particular premium on low-end boxen. When tuned well, read-only requests on MediaWiki are not a huge problem. The real issue is keeping the FPM worker pool from getting starved, and when it is starved, it's usually not because of read-only requests but because database contention is preventing requests from finishing. (And to that end, enabling application-level caching usually helps a lot here, since it can save having to hit the DB at all.) PHP itself is plenty fast to serve a decent number of requests per second on a low-end box. I won't put a number on it since it is obviously very workload-dependent, but suffice it to say that my concerns with optimizing PHP software usually tilt towards memory usage and database performance rather than the actual speed of PHP. (Which, in my experience, has also improved quite a lot just by virtue of PHP itself improving. I think the JIT work has great potential to push it further, too.)

The calculus on this probably changes dramatically as the RPS scales up, though. Not doing work will always be better than doing work in the long run. It's just that it's a memory/time trade-off and I wouldn't take it for granted that it always gives you the most cost-effective end result.
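To make the worker-starvation point concrete, here is a rough back-of-the-envelope sketch using Little's law (requests in flight = arrival rate x time per request). The function name and all of the numbers are hypothetical and chosen only for illustration; they are not measurements from any real wiki.

```python
import math


def workers_needed(rps: int, avg_request_ms: int) -> int:
    """Estimate how many workers are busy at once (Little's law).

    in-flight requests = arrival rate (req/s) * time per request (s)
    """
    return math.ceil(rps * avg_request_ms / 1000)


# Hypothetical workload: 20 requests per second in both cases.
healthy = workers_needed(rps=20, avg_request_ms=100)     # DB responds quickly
contended = workers_needed(rps=20, avg_request_ms=2000)  # requests wait on the DB

print(f"healthy: {healthy} busy workers, contended: {contended} busy workers")
```

Under these assumed numbers the traffic is identical in both cases; only the time each request spends waiting changes, and that alone multiplies the number of workers tied up. Since every extra worker also costs RAM, simply raising the pool size is not free on a small box.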


Replies

bawolff 10/10/2024

Varnish caching really only helps if the majority of your traffic is logged-out requests. It's the sort of thing that is really useful at a high scale but matters much less at a low scale.

Application-level caching (memcached/Redis/APCu) is super important even at a small scale.

Most of the time (unless complex extensions are involved or your wiki pages are very simple) MediaWiki should be io-bound on converting wikitext -> HTML (which is why caching that process is important). Normally, if the DB is healthy, DB requests shouldn't be the bottleneck (unless you have extensions like SMW or Cargo installed).
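As a minimal sketch of why caching the wikitext -> HTML step pays off, here is a toy cache-aside wrapper. This is not MediaWiki's actual parser cache; the class, the `toy_render` function, and the in-memory dict (standing in for memcached/Redis/APCu) are all assumptions made up for illustration.

```python
from typing import Callable, Dict, Tuple


class ToyParserCache:
    """Cache-aside: look in the cache first, render only on a miss."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self._render = render
        # Plain dict as a stand-in for an external cache backend.
        self._store: Dict[Tuple[str, int], str] = {}
        self.misses = 0

    def get_html(self, title: str, rev_id: int, wikitext: str) -> str:
        # Keying on the revision id means an edit naturally produces a new key.
        key = (title, rev_id)
        html = self._store.get(key)
        if html is None:
            self.misses += 1
            html = self._render(wikitext)
            self._store[key] = html
        return html


def toy_render(wikitext: str) -> str:
    # Hypothetical stand-in for the expensive wikitext -> HTML conversion.
    return "<p>" + wikitext.replace("'''", "") + "</p>"


cache = ToyParserCache(toy_render)
first = cache.get_html("Main_Page", 1, "'''Hello'''")
second = cache.get_html("Main_Page", 1, "'''Hello'''")   # served from cache
edited = cache.get_html("Main_Page", 2, "'''Hello again'''")  # new revision, re-rendered
print(cache.misses)
```

Repeated views of the same revision skip the render entirely, so the expensive step runs once per edit rather than once per page view.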
