Hacker News

swiftcoder today at 9:34 AM

The "essentially static hosting" isn't the cost centre (although with 5 million MAU, it's nothing to sneeze at). The real costs are on the input side - they have an ingestion pipeline that ensures standardised paper formatting and so on, plus at least some degree of human review.


Replies

bonoboTP today at 9:47 AM

Do you mean that the CPU cost of turning LaTeX into PDF/HTML is the main cost?

lou1306 today at 9:52 AM

The PDF formatting is anything but standardised. They ingest LaTeX sources, which are formatted according to the authors' whims (most likely, according to whatever journal or conference they just submitted the manuscript to). I'll concede that the (relatively novel) HTML formatter gives papers a more uniform appearance. They also integrate a bunch of external services for, e.g., citation metrics and cross-references. Still hard to justify such a high cost to operate, but eh.

Also, the "human review" is a simple moderation process [1]. It usually does not dig into the submission's scientific merits.

[1] https://info.arxiv.org/help/moderation/index.html