Hacker News

clusterhacks, today at 1:45 AM (1 reply)

I was playing around with Qwen3-VL to parse PDFs - meaning, do some OCR data extraction from a reasonably well-formatted PDF report. It failed miserably, although I was using the 30B-A3B model instead of the larger one.

I like the Qwen models and use them for other tasks successfully. It is so interesting how LLMs will do quite well in one situation and quite badly in another.


Replies

totetsu, today at 2:36 AM

The Opus models seem pretty adept at extracting structured data from OCR: https://www.ocrarena.ai/battle