Copyright and AI: Part 2 – Infringement by machine?

Earlier this year, we wrote the first part of our series looking at copyright and generative AI, Copyright and AI: Part 1 – Teaching the machine. As we discussed then, for an AI tool (like ChatGPT or Stable Diffusion) to follow an instruction, such as to write or draw something, it needs to have been taught with information that it can later draw on.

We focused on possible copyright infringing acts at the “input” stage, where an AI is taught with unlicensed data that has perhaps been scraped from the World Wide Web. We also looked at the exceptions to infringement that may be available in the UK now, and at the possible future expansion of the text and data mining (TDM) exception, given that providing an “AI-friendly” regulatory environment while protecting rights holders is a relatively recent but pressing demand facing most governments.

In this second article, we look at possible copyright infringing acts that occur at the “output” stage, when a user uses generative AI to create a new work: infringements that, we might colloquially say, are “by machine”. We then touch on what the UK Government’s next move may be, following the latest report and recommendations on the topic by the Culture, Media and Sport Committee [1].

“Output” infringements

The situation we have in mind when considering output infringement is one where an AI tool has been used to create something, like an image or some text, under instruction by the user. 

Now, suppose that the output work resembles in some way a copyright work owned by a third party. What are the options for that third party? 

(It is worth pointing out that we are not addressing here the question of whether the output work is itself a copyright work. The truth is that, under UK law at least, it might be or it might not be. That is a separate and certainly interesting question, perhaps worth a separate article in due course. Our focus here, though, is on the possible infringement of third party works.)

Rights of owner of copyright work

In the UK, the owner of a copyright work, that is an original literary, dramatic, musical or artistic work, has various exclusive rights in relation to that work, including but not limited to the rights to copy it, issue copies of it to the public, communicate it to the public and adapt it. If others do these acts or authorise them to be done without the consent of the copyright owner, they may be infringing copyright if they cannot rely on one of the statutory exceptions. This means that, in some circumstances, infringement can occur as soon as a new work is created.

For the purpose of this article, we will focus mainly on infringement by copying. In the UK, an output work created using a generative AI tool may infringe copyright if it copies another copyright work, in other words if it reproduces that work in a material form, which is how copying is defined in the Copyright, Designs and Patents Act 1988.

For infringement to have taken place, two criteria must be satisfied: (1) there must be objective similarity between the output work and the whole or a substantial part of the original copyright work that is the author’s expression of their own intellectual creation; and (2) the output work must be derived from the original copyright work.

If the user explicitly instructs the AI to create a work and this new work closely resembles the copyright owner’s original work or a part of it, then establishing the first of the two criteria above, namely the requisite objective similarity between the two works, may be quite easy. What is likely to be much more difficult, and in some cases practically impossible, is for the copyright owner to demonstrate that the “output” work is derived from their original copyright work.

How might a copyright owner establish their claim? And how might a user of AI satisfy themselves an output work does not infringe?

Perhaps the most clear-cut kind of situation is one where the original copyright work is not within the dataset on which the AI was taught. In such a case the output work cannot be derived from the original and it cannot directly infringe it. So, if the dataset is known, there could be an effective defence to an infringement claim. 

Even in more “opaque” situations, it will often be a major challenge for a copyright owner to sustain an infringement claim in respect of such a possibly infringing output. For one thing, at the outset, the copyright owner bringing the claim will generally not know what is in the dataset and, depending on the circumstances, finding out could be very challenging. 

Another practical difficulty is that there could be a number of works in the dataset all of which are similar to the original copyright work, but not all of which have been created by the original copyright owner. This would make it harder for the copyright owner to demonstrate that it is their work that has been infringed.

By contrast, it will be easier for the copyright owner to establish their claim where they can demonstrate that their original copyright work was within the dataset that taught the AI. It will be easier still if the dataset contains no other works similar to it, or if the output work is extremely similar to their original copyright work. One can imagine cases where it is argued that an output resembles a copyright work so closely that the only reasonable explanation is that copying must have taken place.

Along the same lines, the claim will be much easier if the output work contains distinctive features that are identical or very similar to those of the original, like a watermark or signature (in the case of images), or perhaps a line of distinctive code in a software context. Such clear markers will, however, be the exception rather than the rule.

So, because often people will not know what data has been used to teach an AI tool, or because they are likely to find it difficult to effectively obtain and then search that data, both users of the tool and those who consider that their copyright may have been infringed will face challenges. 

On one hand, it is very difficult for a user generating a new output work using an AI system to be entirely sure that the new work does not infringe copyright. Unless they are confident that an exception will apply covering their creation and/or use of the work or they know exactly what was within the teaching dataset, there is always a possible infringement risk. 

On the other hand, it can also be very difficult for the copyright owner to establish their claim in ordinary circumstances, and this may have the effect of reducing the risk to the user of facing an infringement action. These challenges, which would likely lead to increased costs for both parties in litigation, help to explain why it has been predominantly larger entities or groups bringing claims or class actions, and why the claims are against well-resourced AI technology owners rather than users. Against such defendants, the chances of a substantial recovery, and of the defendant being able to pay substantial amounts, are much higher.

A final note on infringement by the outputs of generative AI tools: use of the created work could, in some circumstances, result in the infringement of various other intellectual property rights, not only copyright. For instance, if the work created was similar to a registered trade mark or a design protected by registered or unregistered design right, then some uses of the work may infringe these other rights.

Approach of generative AI businesses to infringement

In spite of the evidential challenges faced by potential claimants, there are clear risks for users who are using AI solutions to create new outputs. Help is at hand, however, and some businesses offering such AI solutions are looking to reassure their users. 

Microsoft, for example, are very clear on their position regarding their AI Copilot product, offering their Copilot Copyright Commitment:

"As customers ask whether they can use Microsoft's Copilot services and the output they generate without worrying about copyright claims, we are providing a straightforward answer: yes, you can, and if you are challenged on copyright grounds, we will assume responsibility for the potential legal risks involved." [2]

Similarly, Google offer indemnities to users of some of their AI-powered services in respect of losses they may incur for IP infringement. “To our knowledge,” they have said, “Google is the first in the industry to offer a comprehensive, two-pronged approach to indemnity” [3], with cover that provides protection in respect of both the outputs from their Google Cloud and Workspace platforms and the inputs, i.e. copyright claims in relation to the training of their systems.

Needless to say, terms and conditions will apply to these respective indemnities and reassurances and it is likely, for example, that deliberate attempts to infringe or acts that are perhaps reckless as to whether third party rights are infringed would be excluded. 

Input infringement update

On the topic of “inputs”, as we touched upon in our first article, in their January 2023 report the House of Lords’ Communications and Digital Committee criticised the UK Intellectual Property Office’s June 2022 suggestion that the text and data mining (TDM) copyright exception should be broadened to allow for the mining of all copyright works with no opt-out option for the rights holders. In its current form, that exception only allows TDM for non-commercial research by those who already have lawful access to the copyright works.

Following the Government’s confirmation that it will not expand the exception, the Culture, Media and Sport Committee published a report [1] on 30 August 2023 endorsing that decision. The Committee remarks in the report that it considers the current exception to be “an appropriate balance between innovation and creator rights”. Above, we looked at some of the inherent difficulties in bringing a claim, particularly if you are a copyright owner with limited resources. It is therefore unsurprising that in its report the Committee has also called on the Government to “consider how creatives can ensure transparency and, if necessary, recourse and redress if they suspect AI developers are wrongfully using their works in AI development”.

Clearly, the decision not to expand the TDM exception is positive for copyright owners. At the time of writing (November 2023), the UK’s Intellectual Property Office is working with users and rights holders to develop an AI and IP code of practice. One of its aims is to make licences for data mining more available, and it will seek to help surmount some of the barriers AI providers face while also protecting the interests of rights holders. The code will be voluntary, but the Government will legislate if that ultimately proves necessary. We are intrigued to see how the balance plays out between the concerns of creatives on the one hand and the desire to encourage AI innovation and development on the other.


[1] Connected tech: AI and creative technology
[2] Microsoft announces new Copilot Copyright Commitment for Customers
[3] Google to defend generative AI users from copyright claims
