Hacker News

deepfriedbits · today at 3:30 AM · 2 replies

Could they have even trained the models 25 years ago? Wikipedia was nothing close to what it is today, and I know folks here like to mourn the fall of the open web, but it's still orders of magnitude larger today than it was in 2001. YouTube and so many other information stores simply didn't exist then.


Replies

hattmall · today at 4:18 AM

Maybe not 25, but IBM Watson beat humans at Jeopardy! over 10 years ago. The technology has been there; the difference is the willingness to burn money on it in hopes of capturing exponential revenue from disrupting industries.

Obviously the costs have come down, but if IBM had felt like burning $100 billion in 2012, I'm pretty sure they could have built a similarly impressive chatbot. I'm just not sure how they would ever have recouped the cost.

com2kid · today at 4:11 AM

The book archives are a big one as well, along with all the journals published digitally throughout the 2000s, and all the newspapers.

Though with some types of models (specifically voice), it has been discovered that a smaller, high-quality dataset is better than a giant dataset filled with errors.
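As a toy illustration of that curation idea (the scoring function and threshold here are hypothetical, not from any real training pipeline):

```python
# Toy data-curation sketch: keep only samples above a quality threshold,
# trading corpus size for cleanliness.

def curate(samples, score, threshold=0.9):
    """Return the subset of samples whose quality score clears the threshold."""
    return [s for s in samples if score(s) >= threshold]

# Hypothetical corpus: (transcript, fraction of transcription errors)
corpus = [
    ("clean studio recording transcript", 0.01),
    ("noisy scraped audio transcript", 0.40),
    ("edited audiobook transcript", 0.02),
]

# Score = 1 - error rate; the smaller, cleaner subset survives.
clean = curate(corpus, score=lambda s: 1.0 - s[1])
print(len(clean))  # 2 of the 3 samples are kept
```

The point is just that the filter deliberately shrinks the dataset: the noisy sample is dropped even though it would make the corpus bigger.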