As part of its series of blog posts on AI, the ICO has published a post on data minimisation and privacy-preserving techniques in AI systems.
The blog post points out that AI systems generally require large amounts of data. However, organisations must comply with the minimisation principle under data protection law if using personal data. This means ensuring that any personal data is adequate, relevant and limited to what is necessary for the purposes for which it is processed.
What is adequate, relevant and necessary in relation to AI systems will be use-case specific. However, the blog post describes a number of techniques that organisations can adopt to develop AI systems which process as little personal data as possible while remaining functional.
Data minimisation requirements must be fully considered from the design phase, or as part of due diligence in the procurement process if an organisation is buying in a system rather than designing it itself. Data minimisation must also be balanced against other compliance or utility objectives, for example, making ML models more accurate and non-discriminatory (considered in the ICO’s previous blog post about trade-offs).
There are conceptual and technical similarities between data minimisation and anonymisation. In some cases, application of privacy-preserving techniques means that certain data used in ML systems is rendered pseudonymous or anonymous. The ICO’s Anonymisation Code of Practice can provide organisations with information on these concepts. The ICO is also currently developing updated guidance on anonymisation to take account of recent developments and techniques in this field.