Hacker News

NoMoreNicksLeft yesterday at 3:03 PM

Could have sworn they did this years ago. I even have the first 80 years or whatever on DVD in the closet.


Replies

throwup238 yesterday at 7:49 PM

Normally when laymen say "digitized" they mean one of two things: scanned images in a PDF, or fully transcribed (and possibly formatted) text extracted from the scan. The Complete New Yorker you're thinking of was mostly the former, with a bit of indexing (a table of contents pointing to the PDFs, if I remember correctly).

This latest digitization project does the latter, transcribing the text into their existing content management system and, as far as I can tell, preserving much of the formatting. This brings full-text search, cross-linking between articles, and all that good stuff.

I suspect that since they include an LLM summary and started this digitization project in early 2024, this was enabled by LLMs.

smelendez yesterday at 4:31 PM

If I’m reading this correctly, they now have all their historic articles loaded into their CMS. I think they previously just had a system where you could page (and maybe search?) through scans of old issues, which is also cool but not as versatile.

ghaff yesterday at 3:31 PM

When a lot of content was being put out on CD/DVD, a number of publications did this, but the discs are not straightforwardly accessible these days because they usually require an old version of Windows. (Yes, if you want to make a project of it, you can probably get into them, but it has never been worth it for me.)
