
ben_w · today at 10:17 AM

While (a) may be true, (b) is definitely true: if there's even one coherent model with 340 million (or fewer) parameters, I've not found it.

The larger of the two early BERT models from Google (BERT-large) was that size, and it was only good enough to be worth investigating further, not to actually use: https://en.wikipedia.org/wiki/BERT_(language_model)
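For reference, the ~340M figure for BERT-large follows from its published hyperparameters (24 layers, hidden size 1024, 16 heads, 30,522-token vocab). A rough back-of-envelope sketch, ignoring a few small bias and task-head terms:

```python
# Back-of-envelope parameter count for BERT-large.
# Hyperparameters from the original BERT paper; this is an
# approximation, not an exact accounting of every tensor.
V, P, H, L, FF = 30522, 512, 1024, 24, 4096

embeddings = (V + P + 2) * H + 2 * H       # token/position/segment + layernorm
attn = 4 * (H * H + H)                     # Q, K, V, and output projections
ffn = H * FF + FF + FF * H + H             # two feed-forward dense layers
norms = 2 * 2 * H                          # two layernorms (scale + bias)
per_layer = attn + ffn + norms
pooler = H * H + H                         # [CLS] pooler head

total = embeddings + L * per_layer + pooler
print(f"~{total / 1e6:.0f}M parameters")   # lands near the quoted ~340M
```

The estimate comes out around 335M; the commonly quoted 340M includes a few extra head parameters, but the order of magnitude is the point.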