A group of authors have filed a lawsuit against Microsoft, alleging that the company used their books without permission to train its AI model. The lawsuit, lodged in federal court in New York, claims Microsoft relied on “Books3”, a dataset of unauthorised digital books, to train its AI model to generate text-based responses to user prompts.

Allegations against Microsoft

The authors, including Kai Bird, Jia Tolentino and Daniel Okrent, accuse Microsoft of copyright infringement, arguing that the company’s AI model was trained using stolen intellectual property.

According to the complaint, Microsoft’s Megatron model learned the syntax, style and themes of these authors’ works and is capable of mimicking their styles and voices.

The authors are seeking statutory damages of up to $150,000 per infringed work and a court order to stop Microsoft from using their material in the future.

The lawsuit also alleges that Microsoft’s use of the “Books3” dataset, a collection of nearly 200,000 pirated books, was a deliberate move to bypass licensing fees and agreements with authors and publishers.

The authors argue that this not only violates their copyrights but also encourages the use of illegal digital libraries. They claim Microsoft’s Megatron AI model, trained on this unauthorised material, can create derivative works without permission.

This lawsuit is part of a wave of complaints from authors and publishers against major tech companies such as Meta, Anthropic and OpenAI. The complaint against Microsoft was filed just one day after a decision in California, where a judge ruled that the use of copyrighted material by Anthropic could be considered “fair use” under U.S. copyright law.

The case against Microsoft, however, includes the allegation that the company used copies of the authors’ works that were obtained illegally. Microsoft has not yet commented publicly on the lawsuit, and attorneys also declined to comment on the ongoing case.