ICO responds to consultation series on generative AI

December 19, 2024

In January 2024, the ICO launched its five-part generative AI consultation series. It has now published its consultation response. The series set out to address regulatory uncertainties about how specific aspects of the UK GDPR and the DPA 2018 apply to the development and use of generative AI. It did so by setting out the ICO’s initial analysis of these areas, along with the positions it wanted to consult on.

The ICO retained its position on purpose limitation, accuracy and controllership.

It updated its position on the legitimate interests lawful basis for web scraping to train generative AI models.

It heard that data collection methods other than web scraping exist, which could potentially support the development of generative AI. An example is where publishers collect personal data directly from people and license this data in a transparent way. It remains for developers to demonstrate that web scraping is necessary to develop generative AI. The ICO will continue to engage with developers and generative AI researchers on the extent to which they can develop generative AI models without using web-scraped data.

Web scraping is a large-scale processing activity that often takes place without people being aware of it. The ICO says that this sort of invisible processing poses particular risks to people’s rights and freedoms: if someone doesn’t know their data has been processed, they can’t exercise their information rights. The ICO received minimal evidence on the availability of mitigation measures to address this risk. This means that, in practice, generative AI developers may struggle to demonstrate how their processing meets the requirements of the legitimate interests balancing test. As a first step, the ICO expects generative AI developers to significantly improve their approach to transparency. For example, they could consider what measures they can put in place to protect people’s rights, freedoms and interests, such as providing accessible and specific information that enables people and publishers to understand what personal data the developer has collected. The ICO also expects developers to test and review these measures.

The ICO received evidence that some developers are using licences and terms of use to ensure deployers are using their models in a compliant way. However, to provide this assurance, developers will need to demonstrate that these documents and agreements contain effective data protection requirements, and that these requirements are met.

The ICO updated its position on engineering individual rights into generative AI models.

The ICO says that organisations acting as controllers must design and build systems that implement the data protection principles effectively and integrate the necessary safeguards into the processing. This would put organisations in a better position to comply with the requirement to facilitate people’s information rights.

Article 11 of the UK GDPR (on processing which does not require identification) may have some relevance in the context of generative AI. However, organisations relying on it need to demonstrate that their reliance is appropriate and justified. For example, they must demonstrate that they are not able to identify people. They must also give people the opportunity to provide additional information that enables identification, so that they can exercise their rights.

The response also highlights areas where the ICO thinks further work is needed to develop and inform its thinking, and recognises that the upcoming Data (Use and Access) Bill may affect the positions set out in the paper. Once the Bill’s changes to data protection law take effect, the ICO will update and consult on its wider AI guidance to reflect them and to cover generative AI.

Its final positions will also align with its forthcoming joint statement on foundation models with the Competition and Markets Authority. This statement will touch on the interplay between data protection law and competition and consumer law in this complex area.