
embedding-shape • today at 3:49 PM • 2 replies

Re the OCR, I'm currently running allenai/olmocr-2-7b against all the PDFs with text in them and comparing the output with the OCR the DOJ provided. A lot of it doesn't match, and surprisingly olmocr-2-7b is quite good at this. However, after extracting the pages from the PDFs, I'm sitting on ~500K images to OCR, so this is taking quite a while to run through.
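
For anyone wanting to try a similar comparison, here is a minimal sketch of scoring how closely two OCR outputs for the same page agree, using only Python's standard `difflib`. The function names, the whitespace normalization, and the 0.9 threshold are assumptions for illustration, not the setup described above.

```python
import difflib


def normalize(text: str) -> str:
    # Lowercase and collapse whitespace so layout differences
    # (line breaks, double spaces) do not count as mismatches.
    return " ".join(text.lower().split())


def ocr_similarity(provided: str, candidate: str) -> float:
    # Returns a ratio in [0, 1]; 1.0 means identical after normalization.
    # autojunk=False avoids difflib's popularity heuristic on long pages.
    matcher = difflib.SequenceMatcher(
        None, normalize(provided), normalize(candidate), autojunk=False
    )
    return matcher.ratio()


def pages_that_disagree(pairs, threshold=0.9):
    # pairs: iterable of (page_id, provided_text, candidate_text).
    # The threshold is an arbitrary starting point, not a tuned value.
    return [
        page_id
        for page_id, provided, candidate in pairs
        if ocr_similarity(provided, candidate) < threshold
    ]
```

Pages flagged this way would still need a manual look, since a low score says the two outputs differ but not which one is right.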

Replies

originalvichy • today at 3:57 PM

Did you take any steps to reduce the image dimensions, in case that improves performance? I have not tried this, as I have not performed an OCR task like this with an LLM. I would be interested to know at what size the VLM can no longer reliably make out the details in the text.
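
One way to probe this would be to sweep a sample of pages over a few maximum side lengths and re-run the OCR at each. Below is a sketch of the size computation only, in plain Python; the helper names and candidate sizes are hypothetical, and the actual resizing would be left to whatever image library is in use.

```python
def fit_within(width: int, height: int, max_side: int) -> tuple[int, int]:
    # Scale so the longer side is at most max_side, keeping the
    # aspect ratio. Images that already fit are left unchanged.
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


# Candidate maximum side lengths in pixels (arbitrary values to sweep).
CANDIDATE_MAX_SIDES = [2048, 1536, 1024, 768]


def sweep_sizes(width: int, height: int) -> dict[int, tuple[int, int]]:
    # Map each candidate limit to the target (width, height) for one page.
    return {m: fit_within(width, height, m) for m in CANDIDATE_MAX_SIDES}
```

Comparing the OCR output at each size against the full-resolution output would show roughly where accuracy starts to drop for a given document type.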


helterskelter • today at 3:59 PM

[flagged]

