zmmmmm • yesterday at 10:42 PM • 2 replies

It is fascinating. Vision language models are unreasonably good compared to dedicated OCR, and to some extent even at the language tasks.

My take is that it fits the general idea that generalist models have significant advantages because so much more latent structure maps across domains than we expect. People still talk about fine tuning dedicated models being effective, but in my personal experience it's still always better to use a larger generalist model than a smaller fine tuned one.

Replies

kgeist • today at 12:14 AM

>People still talk about fine tuning dedicated models being effective

>it's still always better to use a larger generalist model than a smaller fine tuned one

Smaller fine-tuned models are still a good fit if they need to run on-premises cheaply and are already good enough. Isn't that their main use case?

jepj57 • today at 12:10 AM

Now apply that thinking to human-based neural nets...
