Recently I tied OCR with Opus 4.8. (I know, not technically right tool for the job). All I needed to do was extract dates from receipts. It got about 20% of the dates wrong yet rated all as “high confidence”.
Should have probably tried a more OCR specific model
Opus is very good at OCR. Way better than the small 1-4B VLMs. If Opus failed, most likely those smaller models will fail as well.
I do not believe this story.
Opus 4.8 scanned hundreds of PDFs for me recently with the worst handwriting imaginable. 100% successful, other than one record where even I could not figure out what was written.
> All I needed to do was extract dates from receipts
Was this... not basically a solved problem like 30 years ago? I'm pretty sure the shareware OCR tool that came with a black and white scanner I had at one point would do better than 20% wrong.