Hacker News

aspenmartin, yesterday at 4:48 PM

You can do what frontier labs do today which is to properly license things that are copyrighted and use open source web crawls for things that don’t have copyright issues. You can then also commission new datasets (volume needed goes down when quality is high).

The European regulations are the thing that will kneecap anything meaningful coming out of Europe. Mind-blowing to me that this is considered worth the tradeoff, since Europe will be beholden to other frontier labs, be they in China or the US. The regulations accomplish very little, if anything, in terms of affecting actual AI development, and Europe loses vast amounts of leverage in the process.


Replies

michaelt, yesterday at 7:59 PM

> You can do what frontier labs do today which is to properly license things that are copyrighted and use open source web crawls for things that don’t have copyright issues. You can then also commission new datasets (volume needed goes down when quality is high).

It cost Anthropic $1.5 billion for training on libgen's 480k pirated ebooks.

Investors will cough up that money if you're already clearly a frontier lab with a model people are paying a lot of money for.

Tough to get that much cash without anything to show.

rootlocus, yesterday at 4:55 PM

Regulations aside, Europe is extremely divided. There's constant resistance from individual states, ongoing disputes, and far-right extremism gaining traction. At this point, it seems like the EU can barely agree on any decision.
