Getty v Stability AI: A 'tantalising glance' of what’s to come for AI firms and creators
Yesterday, 14 January 2025, Mrs Justice Joanna Smith DBE handed down a reserved judgment in Getty Images (US) Inc and Ors v Stability AI Ltd [2025] EWHC 38 (Ch). This is the latest development in the most significant artificial intelligence (AI) related litigation currently before the UK courts.
While this judgment is procedural in nature, it offers us a tantalising glance of what is to come: the final determination of fundamental intellectual property issues in this, our new reality of generative AI (GenAI).
The litigation
In January 2023, Getty Images (US) Inc (Getty), a provider of internet-based visual and digital media content, brought legal proceedings against Stability AI. Getty claims that 12 million photographs, videos and illustrations from its website (over half of which were original copyright-protected artistic works and/or film works (the Works)) were used by Stability AI to train several versions of its deep-learning, text-to-image GenAI model, Stable Diffusion.
Getty’s principal claim is that Stability AI infringed its copyright in the Works by ‘scraping’ millions of photographs, videos and illustrations from Getty’s website without its consent in order to train and develop Stable Diffusion, and by thereafter making it available, or offering it for sale, to the public.
Getty further claims that the outputs (that is, the synthetic images generated by Stable Diffusion) are infringing as they reproduce a substantial part of its Works. Stability AI admits that “…at least some images from the Getty Images Website were used during the training of Stable Diffusion”, but it has not identified those images, a point to which we return below.
The judgment
The judgment underscores the complexities and nuances of managing large-scale copyright infringement claims involving numerous parties and the importance of robust case management and class definition in representative actions.
The judgment on 14 January 2025 arises from an application advanced by Stability AI at the Case Management Conference (CMC) held in November 2024. Stability AI sought an order that the Sixth Claimant, Thomas M Barwick Inc, not act as the representative for a class of copyright owners whose works were licensed exclusively to Getty. The court had to determine whether the Sixth Claimant had the same interest as the represented parties and whether the class could be adequately defined.
The court found that the proposed class definition was problematic as it depended on the outcome of the litigation (i.e. whether there had been copyright infringement), making it impermissible. The Claimants' alternative proposal to proceed without joinder of all exclusive licensors was also rejected on the basis that they had not provided sufficient evidence to assure the court that there would be no prejudice to Stability AI from potential future claims.
The court dismissed the representative claim, emphasising the need for the parties to find a pragmatic way forward. The court suggested that the Claimants could renew their request for a representative claim with appropriate evidence, or consider narrowing the scope of the represented parties to manage the case effectively.
Next stages
Proceedings are ongoing. It is understood that the first trial to determine liability is listed to commence on 9 June 2025.
Identification of scale and specific works
In the meantime, the court has considered how claimants might evidence the scale and identify which of their copyrighted works have allegedly been infringed where they form part of datasets utilised by AI firms for GenAI model training.
In its judgment, the court confirmed that these questions were not for determination at the CMC stage (notwithstanding their significance in connection with the proposed representative claim). Even so, the court repeatedly noted that:
- the subset of copyright protected works used to train Stable Diffusion is a matter within Stability AI’s knowledge; and
- Stability AI is yet to identify how many images from Getty’s website were in fact used to train Stable Diffusion, and which images those were.
Stability AI proposed that the questions of authorship, subsistence and infringement may be resolved by isolating and examining a sample of the Works and then extrapolating the results. At present, however, there appears to be no consensus as to how this will be achieved.
The court highlighted the need for clear proposals as to how the case would be managed at trial, including the suggested use of sampling to determine which Works were used in training.
It remains to be seen how these evidential issues will be resolved as between the parties and/or by order of the court in advance of trial.
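By way of illustration only, the sample-and-extrapolate idea described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a method proposed by either party or endorsed by the court: the function names, the sample size of 1,000 and the 550 "positive" results are all hypothetical, and only the 12 million figure comes from Getty's pleaded case. It uses a standard Wilson score interval to show how a finding on a random sample might be scaled up to the whole population with a margin of error.

```python
import math
import random


def draw_sample(population_size, sample_size, seed=0):
    """Pick a simple random sample of work indices (hypothetical helper)."""
    rng = random.Random(seed)
    return rng.sample(range(population_size), sample_size)


def estimate_proportion(flags, z=1.96):
    """Return (point estimate, lower, upper) using a Wilson score interval.

    `flags` holds one True/False result per sampled work, e.g. whether
    that work was found in a training dataset. z=1.96 gives roughly a
    95% interval.
    """
    n = len(flags)
    if n == 0:
        raise ValueError("sample must not be empty")
    p = sum(flags) / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)


def extrapolate(population_size, flags):
    """Scale the sample estimate and its interval up to the population."""
    p, low, high = estimate_proportion(flags)
    return tuple(round(x * population_size) for x in (p, low, high))


# Hypothetical figures: 1,000 sampled works, 550 of them flagged.
POPULATION = 12_000_000
sample_ids = draw_sample(POPULATION, 1_000)
flags = [True] * 550 + [False] * 450  # invented results, for illustration
estimate, low, high = extrapolate(POPULATION, flags)
print(f"Estimated works affected: {estimate:,} (range {low:,} to {high:,})")
```

The sketch also shows why the choice of sample matters: the interval narrows only as the sample grows, and the extrapolation is sound only if the sample is genuinely random across the Works in issue, which is one of the points on which the parties would need to agree.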
Technical difficulties
The court acknowledged the technical difficulties of interrogating datasets of this size.
Mrs Justice Smith DBE found persuasive the argument that:
- the enormous number of images in the training datasets; and
- the equally enormous number of copyrighted Works in issue,
meant that the practical exercise of trying to determine which Works were within those datasets would be “…wholly disproportionate and practically impossible without significant resources” [79].
Many AI businesses are, by their nature, new businesses, and so they may struggle to comply with the terms of a court order as to data interrogation and sampling, particularly in cases of this kind where claimants allege that millions of works have been infringed.
With less than five months to go until trial, the sample and extrapolation approach mooted by Stability AI and the court seems the most pragmatic course of action, albeit this is a step into unfamiliar territory for the court.
Significance
The significance of these proceedings cannot be overstated.
A final judgment on the substantive issues will likely resolve fundamental legal and ethical questions concerning the use of copyright protected material in training and developing GenAI models, particularly where consent is not obtained from rights-holders.
I also anticipate that the outcome will have a significant impact, both:
- in terms of AI firms reviewing their systems and processes when it comes to data utilisation (including Master Data Management (MDM)), and whether they wish to initiate or maintain commercial operations in England and Wales; and
- by affecting how creators will approach protecting and monetising their works in the future.