
righthand, yesterday at 6:48 PM

Facebook provides a data export service that gives you a zip file with a web version of all your content. I'm not sure, then, what the difference is between that and a GitHub-hosted repository of all your content as a webpage.


Replies

sagaro, yesterday at 7:16 PM

The main difference is the data structure and the intent of the export. Facebook's tool is built for data compliance and local offline viewing, not web portability.

If you open that Facebook zip file, the HTML version is just a massive dump of Facebook-specific markup. To actually migrate those posts to a new blog, you'd have to write a custom scraper just to extract your own text from their messy div tags. If you use their JSON export, you still have to write a custom script to parse their specific schema and remap all the hardcoded local image paths so they work on a live server.

With a GitHub Pages repo, your content is already sitting there as raw, standardized Markdown. You can take that folder of .md files, drop it into Hugo, 11ty, or any other static site generator, and it works with little or no adjustment. No scraping or data-wrangling required.
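To give a rough idea of the kind of script the JSON route needs, here is a minimal sketch in Python. The field names (`timestamp`, `data`, `post`, `attachments`, `media`, `uri`) are assumptions about what an exported post might look like, not a documented schema, and `SAMPLE_EXPORT`, `remap_image_path`, `post_to_markdown`, and the `https://example.com/images` base URL are all made up for illustration.

```python
import json
from datetime import datetime, timezone

# Hypothetical sample shaped like one exported post.
# The real export schema may differ; adjust the keys to match your file.
SAMPLE_EXPORT = json.dumps([
    {
        "timestamp": 1700000000,
        "data": [{"post": "Hello from the export"}],
        "attachments": [
            {"data": [{"media": {"uri": "photos_and_videos/album/1.jpg"}}]}
        ],
    }
])


def remap_image_path(local_uri, base_url):
    """Turn an export-relative file path into a URL on the live site."""
    return base_url.rstrip("/") + "/" + local_uri.lstrip("/")


def post_to_markdown(post, base_url):
    """Render one exported post as Markdown with a small front matter block."""
    date = datetime.fromtimestamp(post["timestamp"], tz=timezone.utc)
    # Collect the text fragments, skipping entries that carry no "post" key.
    text = " ".join(
        item["post"] for item in post.get("data", []) if "post" in item
    )
    lines = ["---", f"date: {date.strftime('%Y-%m-%d')}", "---", "", text]
    # Append each attached image as a Markdown image with a remapped URL.
    for attachment in post.get("attachments", []):
        for entry in attachment.get("data", []):
            uri = entry.get("media", {}).get("uri")
            if uri:
                lines.append("")
                lines.append(f"![]({remap_image_path(uri, base_url)})")
    return "\n".join(lines) + "\n"


posts = json.loads(SAMPLE_EXPORT)
markdown = post_to_markdown(posts[0], "https://example.com/images")
print(markdown)
```

Even this toy version has to guess at the schema and rewrite every image path by hand, which is exactly the work a folder of plain .md files lets you skip.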