yorwba • yesterday at 9:46 AM

Yes, the huge repository of raw materials is likely the hardest part. You can try crowdsourced collections (https://tatoeba.org, https://datacollective.mozillafoundation.org/datasets?q=comm..., https://opus.nlpl.eu/OpenSubtitles/corpus/version/OpenSubtit...) but you'll quickly run into data quality issues. My personal solution is to do manual data curation on the fly, but I think an app that occasionally throws up garbage and asks its users to pick out the good parts is unlikely to get popular.

amelius • yesterday at 10:35 AM

Maybe the free version of the app could do the collaborative filtering part, and in the paid version you'd get the high-quality content.

