abstract257 • today at 1:20 PM • 1 reply

Curious how it does on multi-page scanned PDFs vs. single screenshots? The ORT vision/decoder split is the part that usually makes or breaks CPU VLM OCR...

Replies

krunck • today at 2:08 PM

I had to extract the images from the PDF for it to work, then run it on each extracted page image.
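
For anyone trying the same workaround, here is a rough sketch of the per-page loop. It assumes poppler's `pdftoppm` is installed and on your PATH, and it renders each page to a PNG rather than pulling out the embedded scan (poppler's `pdfimages` would be the tool for that). The thread does not say how the OCR step is invoked, so `ocr_image` is a hypothetical callable you supply; the helper names and the 200 DPI default are also illustrative choices of mine, not anything from the tool.

```python
import re
import subprocess
from pathlib import Path
from typing import Callable, List


def pdftoppm_command(pdf_path, out_prefix, dpi: int = 200) -> List[str]:
    """Build the poppler command that renders every page to <prefix>-<n>.png."""
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf_path), str(out_prefix)]


def page_number(path: Path) -> int:
    """Read the page index from a name like 'page-07.png'."""
    match = re.search(r"-(\d+)\.png$", path.name)
    if match is None:
        raise ValueError(f"not a rendered page file: {path}")
    return int(match.group(1))


def render_pages(pdf_path, out_dir, dpi: int = 200) -> List[Path]:
    """Render each PDF page to a PNG and return the files in page order."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    prefix = out_dir / "page"
    subprocess.run(pdftoppm_command(pdf_path, prefix, dpi), check=True)
    # Sort numerically so page 10 never lands before page 2.
    return sorted(out_dir.glob("page-*.png"), key=page_number)


def ocr_pdf(pdf_path, out_dir, ocr_image: Callable[[Path], str],
            dpi: int = 200) -> List[str]:
    """Run the caller-supplied OCR function (hypothetical) on each page image."""
    return [ocr_image(page) for page in render_pages(pdf_path, out_dir, dpi)]
```

Usage would be something like `texts = ocr_pdf("scan.pdf", "pages", my_ocr)`, where `my_ocr` wraps whatever single-image entry point the tool gives you and returns the recognized text for one page.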
