British authors ‘absolutely sick’ to discover books on ‘shadow library’ allegedly used by Meta to train AI

British authors have told Sky News they felt “absolutely sick” to see their book titles appear in a “shadow library” allegedly used by tech giant Meta to help develop artificial intelligence software.

“It’s my whole life,” said one best-selling novelist. “The thought somebody in Silicon Valley or wherever is taking that work to produce identikit fake AI versions… it’s so upsetting.”

The tool to search the LibGen database was published by The Atlantic last week. It followed the release earlier this year of court documents filed as part of a lawsuit against Meta by US comedian Sarah Silverman and other authors. Meta owns Facebook, Instagram and WhatsApp and has a current market value of more than £1trn.

Meta is accused of breaching copyright laws by using LibGen – a prominent so-called “shadow library”, operated anonymously, that allegedly contains millions of pirated copies of books, journal articles and other materials – to develop its AI software. Meta has denied the claim and argues the case should be thrown out.

In a legal document filed earlier this week, the tech company said it did not violate copyright law by downloading books from some parts of LibGen to train its flagship AI system Llama 3, saying it made “fair use” of the material, and that Llama 3 does not “replicate” authors’ works.

In earlier court documents, lawyers for Silverman and the other authors alleged internal communications showed Meta chief executive Mark Zuckerberg “approved” use of the LibGen dataset despite concerns from some workers.

Image: Author Rowan Coleman has written dozens of novels. Pic: Carolyn Mendelsohn

The Society of Authors (SoA) trade union has described Meta’s alleged behaviour as “appalling” and says the company “needs to compensate the rightsholders of all the works it has been exploiting”.

“It’s every single book I have ever written,” says novelist Rowan Coleman, who has had about 40 books published since her first in 2002, including the Sunday Times bestseller The Memory Book in 2014, and The Bronte Mysteries series under a pen name.

“I felt absolutely sick… I have no way of knowing how much revenue that has cost me. Like most writers, I struggle to pay the bills. I have three jobs, I have children to support and a mortgage to pay. And there are tech billionaires who are profiting from my work and the work of countless other authors as well. How can that be right?”

Meta, Coleman says, allegedly decided to obtain “what they needed cheaply and quickly”.

But financial compensation aside, she says there is a bigger issue. “It’s a threat to this profession even being able to continue to exist. We are, I think, at genuine risk of not having any books for people to actually pirate – at least not any written by humans.”

Image: Owen Cooper and Stephen Graham in Adolescence. Pic: Netflix

Coleman highlights the recent Netflix drama Adolescence, co-written by and starring Stephen Graham, which has been discussed everywhere from US talk shows to UK parliament. “We wouldn’t have that if it wasn’t for writers sitting down and working and grafting for hours.”

While JK Rowling, Stephen King and James Patterson may be worth millions, a survey in 2022 found that authors in the UK earned a median income of about £7,000.

Hannah Doyle, a romcom novelist who is about to publish her fifth novel, The Spa Break, in May, says two of her previous works appear in the LibGen search.

Like Coleman, she has other jobs to supplement her author earnings. Each book takes about a year to complete, she says.

‘It’s David and Goliath’

Image: Author Hannah Doyle is about to publish her fifth novel

“We’re kind of the little people, it’s like David and Goliath,” she says. “How do we stand up for our rights when we’re facing these tech giants worth trillions of pounds?

“This isn’t right, because it’s theft, ultimately. They’re [allegedly] stealing our work and they’re using it to better their AI systems. What’s going to happen to our careers as a result of that?”

Doyle says the situation might be different had authors been approached and offered remuneration.

“I think AI has so many benefits in certain fields,” she says. “For medical research, for example, it’s got the potential to be incredibly useful. What needs to happen is we really need to give it some boundaries before it totally takes over.”

Award-winning writer Damian Barr, whose books also appear to be featured in the database, shared a post on Instagram, writing: “Readers and viewers – because so much TV and film and theatre starts with a book – are being subjected to BILGE generated by machines… creatively and culturally and financially, AI is robbing us all.”

Image: Richard Osman. Pic: Carsten Koall/picture-alliance/dpa/AP Images

TV presenter and author Richard Osman, who has had huge success with his Thursday Murder Club series, wrote on X: “Copyright law is not complicated at all. If you want to use an author’s work you need to ask for permission. If you use it without permission you’re breaking the law. It’s so simple. It’ll be incredibly difficult for us, and for other affected industries, to take on Meta, but we’ll have a good go!”

In his article, Atlantic writer Alex Reisner, who created the LibGen search tool, gave the caveats that it is “impossible” to know exactly which parts of LibGen Meta has used and which parts it hasn’t, and the database is “constantly growing”.

His snapshot was created in January 2025, he says, more than a year after the lawsuit says LibGen was accessed by the tech giant, so some titles that appear now would not have been available to download at that point.

The SoA is urging authors in the UK to write to Meta, as well as to their local MPs.

“Rather than ask permission and pay for these copyright-protected materials, AI companies are knowingly choosing to steal them in the race to dominate the market,” chief executive Anna Ganley said in a statement.

“This is shocking behaviour by big tech that is currently being enabled by governments who are not intervening to strengthen and uphold current copyright protections.”

A Meta spokesperson told Sky News in a statement that the company “has developed transformational GenAI open source LLMs that are powering incredible innovation, productivity, and creativity for individuals and companies”.

The statement continued: “Fair use of copyrighted materials is vital to this. We disagree with plaintiffs’ assertions, and the full record tells a different story. We will continue to vigorously defend ourselves and to protect the development of GenAI for the benefit of all.”

The US lawsuit

Image: Comedian Sarah Silverman is one of the authors suing Meta in the US. Pic: AP

Authors including comedian Silverman, Richard Kadrey and Ta-Nehisi Coates filed their class-action lawsuit against Meta in California in 2023.

They have accused the tech firm of illegally downloading digital copies of their books and using them – without their consent or offering compensation – to train AI.

The controversy surrounding LibGen is part of a wider debate about AI and copyright law. In the US, the Authors Guild says legal action is under way against other AI companies, as well as Meta, for allegedly using pirated books.

The organisation has advised authors that if their books have been used by Meta, they are automatically included in the Kadrey vs Meta class action, the lawsuit involving Silverman and other authors, “without needing to take any immediate action”.

Separately in 2023, the Authors Guild and 17 authors filed a class-action suit against OpenAI in New York for alleged copyright infringement. The named plaintiffs include John Grisham, George RR Martin and Jodi Picoult.

The use of AI was also one of the driving forces behind the strikes in Hollywood in 2023. But not everyone in the creative industries is against the technology.

Last year, publisher HarperCollins reached an agreement with an unnamed technology company to allow “limited use of select non-fiction backlist titles” for training AI models.

And in 2023, award-winning crime author Ajay Chowdhury told Sky News he was embracing the technology.

AI law in the UK – what is happening?

A consultation on AI copyright law in the UK ended in February. Under the plans, an exemption to copyright would be created for training AI, so tech firms would not need a licence to use copyrighted material – and creators would need to opt out to prevent their work from being used.

A government spokesperson said at the time that the UK’s current regime for copyright and AI was “holding back the creative industries, media and AI sector from realising their full potential – and that cannot continue”.

No changes will be made “until we are absolutely confident we have a practical plan that delivers each of our objectives, including increased control for rights holders to help them easily license their content, enabling lawful access to material to train world-leading AI models in the UK, and building greater transparency over material being used”, the spokesperson said.

But plenty of authors and others in the creative industries are not convinced.

“It just leaves the door open for so much exploitation of people’s rights, people’s data and their work,” says Coleman. “I would really urge the government to think again about this and to protect what is a jewel in the crown of British cultural identity – to do the right thing.”