Has anyone explored a different angle — like mapping out the 1,000 most frequently mentioned or cited books (across HN, Substack, Twitter, etc.), then turning their raw content into clean, structured data optimized for LLMs? Imagine curating these into thematic shelves — say, “Bill Gates’ Bookshelf” or “HN Canon” — and building an indie portal where anyone can semantically search across these high-signal texts. Kind of like an AI-searchable personal library of the internet’s favorite books.
Well, there's this: https://hacker-recommended-books.vercel.app/category/0/all-t...