nickpsecurity, today at 1:27 AM

Labs were also competing to train BERTs for $20 or less. People still use them a lot, too.

https://www.databricks.com/blog/mosaicbert

I'll add that they should do a number of small training runs with different architectures and data mixes. Consistent results across those runs would be evidence that it generalizes.