Sunday, January 12, 2025

Mark Zuckerberg permitted Meta to use pirated books to train its AI model, say authors accusing Meta of copyright infringement

Meta employees reportedly used Library Genesis (LibGen), a well-known database of pirated books, and some of them expressed concern over torrenting on company laptops.

Meta founder and CEO Mark Zuckerberg reportedly allowed his company to use pirated copies of copyrighted books to train its artificial intelligence systems. Meta is currently facing copyright infringement lawsuits from various authors and comedian Sarah Silverman, who accuse the company of misusing their works to train its large language model, Llama.

According to reports, documents submitted in California federal court reveal that Meta's internal files show the company was aware that the books were being pirated using torrents. Meta employees reportedly used Library Genesis (LibGen), a well-known database of pirated books. The complainants had sought permission from the court to submit an updated complaint on Wednesday.

The company’s internal communications showed employees expressing concern over downloading books from LibGen. One of the engineers reportedly expressed reservations about ‘torrenting from a corporate laptop’. The authors allege that Zuckerberg approved the use of LibGen despite the concerns of Meta’s executive team.

Denying the allegations of copyright infringement, Meta invoked the ‘fair use’ doctrine in its defence. It argued that the plaintiffs had been aware of Meta’s use of LibGen since July 2024 and had enough time to use this information in their complaints.

Judge Vince Chhabria of the US District Court for the Northern District of California dismissed Meta’s redaction attempts as ‘preposterous’ and as intended to avoid negative publicity rather than to protect business interests. The judge warned Meta against making any such requests for redaction in future, stating that any unreasonably broad sealing requests would result in all material being unsealed.

While Meta had earlier admitted to using Books3, a dataset of around 196,000 books, to train its Llama language model, it did not publicly disclose the direct use of LibGen data.


OpIndia Staff
