Hacker News

flyinglizard, last Thursday at 10:12 PM

Gmail has 1.8b active users, each with thousands of emails in their inbox. The number of emails they can train on is probably in the trillions.


Replies

brokencode, last Thursday at 10:29 PM

Email seems like not only a pretty terrible training data set, since most of it is marketing spam of dubious value, but also an invasion of privacy, since information about individuals could leak via the model.
