Hacker News

pzo last Friday at 8:32 PM, 5 replies

There have been so many open-source OCR models in the last 3 months that it would be good to compare against them, especially since some are not even 1B params and can run on edge devices.

- paddleOCR-VL

- olmOCR-2

- chandra

- dots.ocr

I kind of miss having leaderboards or an arena for OCR and CV models, and providers hosting them. The category is neglected on both Artificial Analysis and OpenRouter.


Replies

culi last Friday at 9:40 PM

Someone posted a project here about a month ago that compares models in head-to-head matchups, similar to LMArena:

https://www.ocrarena.ai/leaderboard

It hasn't been updated for Mistral, but so far Gemini seems to top the leaderboard.

andai last Friday at 10:05 PM

I spent like three hours trying to get one of these running and then gave up. I think it was the paddleOCR one.

It took an hour and a half to install 12 gigabytes of PyTorch dependencies that can't even run on my device, and then it told me it had some sort of versioning conflict. (I think I was supposed to use uv, but I had run out of steam by that point.)

Maybe I should have asked Claude to install it for me. I gave Claude root on a $3 VPS, and it seems to enjoy the sysadmin stuff a lot more than I do...

Incidentally, I had a similar experience installing Open WebUI. It installed 12 GB of PyTorch crap. I rage quit, deleted the whole thing, and replicated the functionality I actually needed in 100 lines of HTML. Too bad I can't do that with OCR ;)

pzo last Friday at 8:56 PM

What I like about Mistral OCR is the simple pricing ($1 per 1k pages) and that the API is hosted on their servers. With other OCR models it is hard to compare pricing because they are token-based, and you don't know how many tokens an image costs unless you run your own test.

E.g. with Gemini 3.0 Flash it might seem that model pricing increased only slightly compared to Gemini 2.5 Flash, until you test it and see that what used to be 258 input tokens per 384x384 image is now around 3x more.
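To make the comparison concrete, here is a rough sketch of how token-based pricing could be converted into a per-1k-pages figure that sits next to a flat $1/1k pages rate. Only the 258 tokens per 384x384 tile and the "around 3x" multiplier come from the comment above. The tiles-per-page count and the price per million tokens are hypothetical placeholders, so measure and look up real values before trusting the output.

```python
def cost_per_1k_pages(tokens_per_page: float, usd_per_million_tokens: float) -> float:
    """Convert token-based input pricing into dollars per 1,000 pages."""
    usd_per_page = tokens_per_page * usd_per_million_tokens / 1_000_000
    return usd_per_page * 1_000


# Figure from the comment above: 258 input tokens per 384x384 tile.
TOKENS_PER_TILE = 258

# Hypothetical placeholders, not measured or quoted values:
TILES_PER_PAGE = 6             # depends on page resolution and the provider's tiling
USD_PER_MILLION_TOKENS = 0.30  # substitute the provider's current input price

old_tokens = TOKENS_PER_TILE * TILES_PER_PAGE
new_tokens = 3 * old_tokens    # "around 3x more", per the comment

if __name__ == "__main__":
    for label, tokens in [("old", old_tokens), ("new", new_tokens)]:
        cost = cost_per_1k_pages(tokens, USD_PER_MILLION_TOKENS)
        print(f"{label}: {tokens} tokens/page -> ${cost:.2f} per 1k pages")
```

This only counts input image tokens. Output tokens for the transcribed text are billed separately on token-based APIs and would need to be added for a fair comparison.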

jammo last Friday at 10:49 PM

[dead]