As part of its series of blog posts on AI, the ICO (the UK Information Commissioner's Office) has published a post on access, erasure, and rectification rights in AI systems. It describes the issues that arise for organisations seeking to comply with these individual rights under the GDPR, and where exemptions may apply.
Rights relating to training data
Organisations creating machine learning models need data to train those models: a retailer, for example, might use a model to predict purchases based on previous consumer transactions. It may be difficult to identify the individuals to whom the training data relates. However, that does not necessarily mean the data is pseudonymised or anonymised, so it must still be considered when responding to requests under the GDPR. Even if the data lacks identifiers or contact details, it may still be possible to link it to a particular individual. For example, if a customer provided a list of recent purchases, the organisation might be able to match those purchases against its training data and identify the records relating to that individual (a simple sketch of such matching follows). In such circumstances the organisation must respond to the data subject's request, subject to identity checks and there being no other applicable exemptions. If the organisation cannot identify the individual, it will not be able to fulfil the request.
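To illustrate the kind of matching described above, here is a minimal, hypothetical Python sketch that compares a requester's self-reported purchases against otherwise de-identified training records. The record structure, field names, and the overlap threshold are illustrative assumptions, not anything prescribed by the ICO.

```python
# Hypothetical sketch: linking a data subject's self-reported purchase
# history to otherwise de-identified training records. All names and the
# threshold are illustrative assumptions.
from typing import Dict, List


def find_matching_records(training_records: List[Dict],
                          claimed_purchases: List[str],
                          min_overlap: int = 3) -> List[Dict]:
    """Return training records whose purchases overlap substantially with
    those the requester says they made. A large overlap suggests the record
    relates to that individual, even without direct identifiers."""
    claimed = set(claimed_purchases)
    return [r for r in training_records
            if len(claimed & set(r["purchases"])) >= min_overlap]


# A requester lists recent purchases; records sharing at least three items
# are flagged for review before the organisation responds.
records = [
    {"record_id": "r1", "purchases": ["kettle", "toaster", "mug", "tea"]},
    {"record_id": "r2", "purchases": ["laptop", "mouse", "monitor"]},
]
print(find_matching_records(records, ["kettle", "mug", "tea", "biscuits"]))
# -> the "r1" record matches on three items
```

In practice, any candidate match would need human verification and identity checks before being treated as the requester's data.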
The blog post points out that requests for access, rectification, or erasure of training data should not be regarded as manifestly unfounded or excessive simply because they may be harder to fulfil, or because the motivation for making them may be less clear, than the access requests an organisation typically receives.
Right to rectification
The right to rectification of inaccurate data may also apply to training data. However, training data is used to teach models general patterns across large datasets, so an inaccuracy in an individual record is less likely to have any direct effect on the data subject concerned.
Right to erasure
Organisations may also receive requests to erase training data. They must comply with such requests, provided the data subject has appropriate grounds and no exemption applies. For example, if the training data is no longer needed because the model has already been trained, the organisation must fulfil the request. In some cases, however, the system may still be in development, so the organisation may still need the training data. Organisations will therefore need to consider whether they can fulfil erasure requests on a case-by-case basis.
Rights relating to personal data involved in AI systems during deployment
Often, the outputs of an AI system will be stored in an individual's profile and used to take some action in relation to that individual. If those outputs are personal data, they are subject to the rights of access, rectification, and erasure. Accuracy is critical here: decisions based on incorrect data could adversely affect the individual, so rectification is the right organisations should prioritise.
Rights relating to the model itself
Personal data might also be contained in a model itself. This could happen by design or by accident.
Where a model contains personal data by design, an organisation may be able to fulfil an access request without altering the model. However, if an individual requests rectification or erasure of the data, the organisation may have to re-train the model (either with the rectified data or without the erased data) or delete the model altogether. A sketch of re-training after erasure follows.
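As an illustration of re-training without erased data, the sketch below assumes a pandas and scikit-learn style workflow. The DataFrame layout, column names, and model choice are illustrative assumptions rather than anything drawn from the blog post.

```python
# A minimal sketch, assuming pandas and scikit-learn, of re-training a
# model after erasing one individual's records from the training set.
import pandas as pd
from sklearn.linear_model import LogisticRegression


def retrain_without_subject(training_df: pd.DataFrame, subject_id: str):
    """Drop the erased individual's rows, then fit a fresh model so the
    resulting parameters are derived only from the remaining data."""
    remaining = training_df[training_df["subject_id"] != subject_id]
    model = LogisticRegression()
    model.fit(remaining[["feature_a", "feature_b"]], remaining["target"])
    return model


# Hypothetical training set; subject "b" is erased before re-training.
df = pd.DataFrame({
    "subject_id": ["a", "b", "c", "d"],
    "feature_a":  [1.0, 2.0, 3.0, 4.0],
    "feature_b":  [0.5, 1.5, 2.5, 3.5],
    "target":     [0, 0, 1, 1],
})
model = retrain_without_subject(df, subject_id="b")
```

Note that the erased records must also be deleted from the stored training set itself; re-training only removes their influence on the model.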
It is possible for some models to 'leak' personal data by accident. In such cases, unauthorised parties may be able to recover elements of the training data, or infer who was in it, by analysing how the model behaves (attacks known as model inversion and membership inference). Fulfilling rights of access, rectification, and erasure may be difficult in these scenarios: unless the data subject has evidence that their personal data could be inferred from the model, the organisation may not be able to confirm that the request has any basis. A simplified illustration of such an inference follows.
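To make the idea of 'leakage' concrete, the sketch below implements a deliberately simplified membership inference heuristic: an overfit model tends to be more confident about records it was trained on. Real attacks (shadow-model approaches, for example) are considerably more sophisticated; the synthetic data, model, and confidence threshold here are illustrative assumptions.

```python
# Simplified membership inference sketch: flag records on which the model
# is suspiciously confident. Synthetic data and threshold are illustrative.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 5))
y_train = rng.integers(0, 2, size=200)  # random labels force memorisation

# Fully grown trees without bootstrapping memorise the training set, so the
# model reports maximal confidence on records it has already seen.
model = RandomForestClassifier(n_estimators=25, bootstrap=False)
model.fit(X_train, y_train)


def likely_member(record: np.ndarray, threshold: float = 0.99) -> bool:
    """Heuristic: very high predicted-class confidence suggests (but does
    not prove) that the record was in the training data."""
    confidence = model.predict_proba(record.reshape(1, -1)).max()
    return confidence >= threshold


print(likely_member(X_train[0]))          # training record: True
print(likely_member(rng.normal(size=5)))  # unseen record: typically False
```

A data subject would rarely have this kind of access in practice; the point is that analysing a model's behaviour alone can sometimes reveal who was in its training data.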