Hacker News

mikert89 · last Thursday at 4:56 PM · 3 replies

AI models will do all this natively


Replies

ritvikpandey21 · last Thursday at 5:28 PM

we disagree! we've found llms by themselves aren't enough: they suffer from pretty big failure modes like hallucination and inferring text instead of purely transcribing it. we wrote a blog post about this [1]. the approach that's worked best so far is a hybrid workflow that uses very specific parts of the language model architecture.

[1] https://www.runpulse.com/blog/why-llms-suck-at-ocr
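to make the "hybrid workflow" idea concrete, here's a minimal sketch (not runpulse's actual pipeline, just an illustration of the general pattern): a deterministic OCR engine produces tokens with per-token confidence scores, and only low-confidence spans get sent to a language model for constrained correction, never free-form generation, which is where hallucination creeps in. `ocr_engine` and `llm_correct` are hypothetical stand-ins, not real APIs.

```python
def ocr_engine(image_bytes):
    # Stand-in for a traditional OCR engine that returns
    # (token, confidence) pairs. Real engines expose similar data.
    return [("lnvoice", 0.41), ("#1024", 0.97), ("total", 0.95),
            ("$", 0.99), ("120.00", 0.93)]

def llm_correct(token, context):
    # Stand-in for a constrained LLM call that may only repair the
    # given token using nearby context, not rewrite the document.
    fixes = {"lnvoice": "Invoice"}
    return fixes.get(token, token)

def hybrid_transcribe(image_bytes, threshold=0.8):
    """OCR everything; route only low-confidence tokens to the LLM."""
    tokens = ocr_engine(image_bytes)
    out = []
    for i, (tok, conf) in enumerate(tokens):
        if conf < threshold:
            # Give the model a small context window around the token.
            context = " ".join(t for t, _ in tokens[max(0, i - 2):i + 3])
            tok = llm_correct(tok, context)
        out.append(tok)
    return " ".join(out)

print(hybrid_transcribe(b""))  # -> "Invoice #1024 total $ 120.00"
```

the point of the confidence gate is that the LLM never sees (or invents) text the OCR engine was already sure about, so transcription stays grounded in the pixels.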

throw03172019 · last Thursday at 6:17 PM

This is like saying AI models can generate images, but a model or platform hyper-focused on image generation will do better (for now).