Hacker News

Legend2440 11/05/2025 3 replies

Right, but the non-deep learning OCR methods also do that. And they have a much much lower overall accuracy.

There’s a reason deep learning took over computer vision.


Replies

vincenthwt 11/05/2025

You're absolutely right, deep learning OCR often delivers better results for complex tasks like handwriting or noisy text. It uses advanced models like CNNs or CRNNs to learn patterns from large datasets, making it highly versatile in challenging scenarios.

However, if I can’t understand the system, how can I debug it if there are any issues? Part of an engineer's job is to understand the system they’re working with, and deep learning models often act as a "black box," which makes this difficult.

Debugging these systems can be a major challenge. It often requires specialized tools like saliency maps or attention visualizations, analyzing the training data for problems, and sometimes retraining the entire model. This process is time-consuming, and it may still fail to yield clear answers.

shash 11/05/2025

OCR is one of those places where you can just skip algorithm discovery and go straight to deep learning. But there are precious few of those kinds of places actually.

do_not_redeem 11/05/2025

GP is talking about thresholding, and thresholding is used in more than just OCR. Thresholding algorithms do not hallucinate numbers.
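
To make the thresholding point concrete, here is a minimal sketch of one classical method, Otsu's global threshold, in plain Python. It is only an illustration: the thread does not say which thresholding algorithm GP had in mind, and the function names `otsu_threshold` and `binarize` are hypothetical. The input is assumed to be a flat list of 8-bit grayscale values (0 to 255).

```python
def otsu_threshold(pixels):
    """Return the threshold t (0..255) that maximizes between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    Assumes `pixels` is a non-empty flat list of ints in 0..255.
    """
    # Build the 256-bin intensity histogram.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class so far
    sum0 = 0    # intensity sum of the lower class so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break             # upper class empty, nothing left to split
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, t):
    """Map each pixel to 1 if it is brighter than t, else 0."""
    return [1 if p > t else 0 for p in pixels]


if __name__ == "__main__":
    # Toy image: half dark pixels, half bright pixels.
    image = [10] * 50 + [200] * 50
    t = otsu_threshold(image)
    print(t, sum(binarize(image, t)))
```

Every output value here is a pure function of the input histogram, so the same image always yields the same mask, and each intermediate quantity (histogram, class means, variance) can be printed and inspected when something goes wrong.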