What do you think of the DeepSeek OCR approach where they say that vision tokens might better compre...

httpteapot • yesterday at 2:22 PM • 1 reply

What do you think of the DeepSeek OCR approach where they say that vision tokens might better compress a document than its pure text representation?

https://news.ycombinator.com/item?id=45640594

I've spent some time feeding LLMs with scraped web pages, and I've found that retaining some style information (text size, visibility, decoration, image content) is nontrivial.
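
To make the point about style information concrete, here is a minimal Python sketch (standard library only) that flattens HTML to text while keeping a few coarse hints: heading and emphasis tags, hidden elements, and image alt text. The names `StyledTextExtractor` and `html_to_annotated_text` and the bracketed hint format are assumptions made for illustration, not something from the thread or from DeepSeek OCR. It only sees inline `style` attributes and the `hidden` attribute, so anything driven by external CSS or JavaScript is missed, which is part of why this is hard in practice.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}
# Tags whose presence is kept as a style hint in the output.
HINT_TAGS = {"h1", "h2", "h3", "b", "strong", "em", "i"}


class StyledTextExtractor(HTMLParser):
    """Collects visible text, prefixing each chunk with coarse style hints."""

    def __init__(self):
        super().__init__()
        self.stack = []   # (tag, hidden) for each currently open element
        self.chunks = []  # output lines such as "[h1] Title"

    @staticmethod
    def _is_hidden(tag, attrs):
        # Only inline signals are checked; external stylesheets are ignored.
        attrs = dict(attrs)
        style = (attrs.get("style") or "").replace(" ", "").lower()
        return (
            tag in {"script", "style"}
            or "hidden" in attrs
            or "display:none" in style
        )

    def _inside_hidden(self):
        return any(hidden for _, hidden in self.stack)

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt and not self._inside_hidden():
                self.chunks.append(f"[image] {alt}")
            return
        if tag in VOID_TAGS:
            return
        self.stack.append((tag, self._is_hidden(tag, attrs)))

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, if there is one.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = " ".join(data.split())
        if not text or self._inside_hidden():
            return
        hints = [tag for tag, _ in self.stack if tag in HINT_TAGS]
        prefix = f"[{','.join(hints)}] " if hints else ""
        self.chunks.append(prefix + text)


def html_to_annotated_text(html):
    parser = StyledTextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return "\n".join(parser.chunks)
```

Calling `html_to_annotated_text` on a scraped page gives one line per text chunk, with hidden content dropped and headings, emphasis, and images marked inline.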

Replies

fbouvier • yesterday at 5:43 PM

Keeping some kind of style information is definitely important to understand the semantics of the webpage.
