Hacker News

ekelsen, today at 6:02 AM

Jeff Dean has a 2007 paper with proto scaling-law plots for n-gram language models.

https://aclanthology.org/anthology-files/anthology-files/pdf...


Replies

beyonddream, today at 6:28 AM

Nice find! The final paragraph of the Conclusion is amazingly prescient!

"Significantly, we found that translation quality as indicated by BLEU score continues to improve with increasing language model size, at even the largest sizes considered. This finding underscores the value of being able to train and apply very large language models, and suggests that further performance gains may be had by pursuing this direction further."