
superjan, last Friday at 8:03 PM

Well, it is not pretty to see how the sausage gets made, but extracting formatted text from docx is absolutely doable, no PhD involved. Source: I have done it as a little side quest because it was useful for auditing a set of Word documents.
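For a rough idea of what that can look like, here is a minimal sketch using only the Python standard library. It is not the code from the side quest mentioned above. It assumes the usual docx layout (a zip archive with the body in `word/document.xml`) and only looks at direct bold/italic run properties, ignoring styles, hyperlinks, and everything else. The names `extract_runs` and `_is_on` are made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used inside word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def _is_on(rpr, tag):
    """True if a toggle property such as <w:b/> or <w:i/> is set on this run."""
    if rpr is None:
        return False
    el = rpr.find(f"w:{tag}", NS)
    if el is None:
        return False
    # <w:b/> means on; <w:b w:val="0"/> or "false" means explicitly off.
    return el.get(f"{{{W}}}val", "true") not in ("0", "false", "off")


def extract_runs(docx):
    """Return one list per paragraph of (text, bold, italic) tuples.

    `docx` is a path or a binary file-like object.
    Assumption: formatting inherited from styles is ignored, and only
    runs that are direct children of a paragraph are visited.
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    paragraphs = []
    for p in root.iter(f"{{{W}}}p"):
        runs = []
        for r in p.findall("w:r", NS):
            text = "".join(t.text or "" for t in r.findall("w:t", NS))
            if not text:
                continue
            rpr = r.find("w:rPr", NS)
            runs.append((text, _is_on(rpr, "b"), _is_on(rpr, "i")))
        paragraphs.append(runs)
    return paragraphs
```

Calling `extract_runs("report.docx")` (hypothetical file name) gives a nested list that is easy to scan in an audit script, for example to flag every bold run. A real audit would likely also need to resolve styles from `word/styles.xml`, which is where most of the sausage-making happens.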