Hacker News

VHRanger · yesterday at 12:59 AM

Encoder/decoder models are much, much more efficient for finetuning and inference than decoder-only models.

Historically, T5 models have been good when you finetune them into task-specific models (translation, summarization, etc.).


Replies

sigmoid10 · yesterday at 8:15 AM

I have actually worked on encoder-decoder models. The issue is, finetuning itself is becoming a thing of the past, at least for text processing. If you spend a ton of effort today finetuning for a particular task, chances are you would have reached the same performance using a frontier LLM with the right context in the prompt. And if a big model can do it today, in 12 months there will be a super cheap and efficient model that can do it as well. For vision you can still beat them, but only with huge effort, and the gap is shrinking constantly. And T5 is not even multimodal. I don't think these models will change the landscape in any meaningful way.
