logoalt Hacker News

datsci_est_2015yesterday at 9:24 AM1 replyview on HN

How a model is trained is different than how a model is constructed. A model’s construction defines its fundamental limitations, e.g. a linear regressor will never be able to provide meaningful inference on exponential data. Depending on how you train it, though, you can get such a model to provide acceptable results in some scenarios.

Mixing the two (training and construction) is rhetorically convenient (anthropomorphization), but holds us back in critically assessing a model’s capabilities.


Replies

hackinthebochsyesterday at 10:30 AM

Linear regression has well characterized mathematical properties. But we don't know the computational limits of stacked transformers. And so declaring what LLMs can't do is wildly premature.

show 1 reply