Hacker News

mvac 06/16/2025

How does it compare to Datalab/Marker (https://github.com/datalab-to/marker)? We evaluated many PDF->MD converters and Marker performed the best, though it is not perfect.


Replies

nxobject 06/16/2025

As anecdotal evidence, it serves my complex-enough purposes very well: mathematics and code interspersed together. One of my "litmus test" papers is an old paper on a Fortran inverse-Laplace transform algorithm [1] that intersperses inline and display equations with monospace code blocks and requires OCR from scratch. Very few models currently do a satisfactory job on it. For example, in the following page transcribed by Marker,

https://imgur.com/a/Q7UYIfW

the inline $\sigma_0$ is mangled as "<sup>s</sup> 0", and $f(t)$ is mangled as "f~~t*!". The current model gets them both correct.

wittjeff 06/16/2025

I am just getting started with my own cross-comparison and would appreciate your list of considered candidates if you have it handy.