Hacker News

simianwords yesterday at 9:02 PM | 4 replies

> These companies downloaded and seeded copyrighted material and then sold a product made from that data

But no company did this.


Replies

yjftsjthsd-h yesterday at 9:25 PM

I'm not a lawyer and I don't follow this area super closely, but it sure sounds like they did?

https://www.tomshardware.com/tech-industry/artificial-intell...

> Facebook parent-company Meta is currently fighting a class action lawsuit alleging copyright infringement and unfair competition, among others, with regards to how it trained LLaMA. According to an X (formerly Twitter) post by vx-underground, court records reveal that the social media company used pirated torrents to download 81.7TB of data from shadow libraries including Anna’s Archive, Z-Library, and LibGen. It then used this information to train its AI models.

> Aside from those messages, documents also revealed that the company took steps so that its infrastructure wasn’t used in these downloading and seeding operations so that the activity wouldn’t be traced back to Meta. The court documents say that this constitutes evidence of Meta’s unlawful activity, which seems like it’s taking deliberate steps to circumvent copyright laws.

FrustratedMonky yesterday at 9:05 PM

OpenAI did, and this is so uncontroversial that I'm surprised you are saying it didn't happen.

TSiege yesterday at 9:10 PM

OpenAI, Meta, and Anthropic are all known to have done this. It's even been exposed in internal company communications. Anthropic already settled their court case. You're an 11-month-old account and I suspect you are some sort of bot or user meant to spread misinformation on the forum.

totallygeeky yesterday at 9:04 PM

[flagged]
