utopcell • today at 3:29 AM

Very strong statement in the title, given the following limitation:

> Generation tasks. Method applies to classification only. Preliminary decoder experiments show perplexity increases.

Replies

Yeah, burying this on page 8 is a bit suspect imo (the eval datasets are listed on page 3, so if you were familiar with them you would have a hint then).

The distillation of a student that predicts "anchor layers" and then acts as a backbone for classification is perfectly cool on its own; no need to stretch the title/abstract so much.
