Hacker News

Insanity today at 3:14 PM (3 replies)

Recently I tried OCR with Opus 4.8. (I know, not technically the right tool for the job.) All I needed to do was extract dates from receipts. It got about 20% of the dates wrong yet rated all of them as “high confidence”.

I probably should have tried a more OCR-specific model.


Replies

rsynnott today at 4:19 PM

> All I needed to do was extract dates from receipts

Was this... not basically a solved problem like 30 years ago? I'm pretty sure the shareware OCR tool that came with a black and white scanner I had at one point would do better than 20% wrong.

nik736 today at 3:15 PM

Opus is very good at OCR. Way better than the small 1-4B VLMs. If Opus failed, most likely those smaller models will fail as well.

bpodgursky today at 3:17 PM

I do not believe this story.

Opus 4.8 scanned hundreds of PDFs for me recently with the worst handwriting imaginable. 100% successful, other than one record where even I could not figure out what was written.
