Within the academic
community, the EU General
Data Protection Regulation (GDPR) has triggered a
lively debate regarding whether data subjects have a ‘right to explanation’ of
automated decisions made about them. At one end of the spectrum, we see
scholars arguing that no such right exists under the GDPR but rather a ‘limited
right to information’ only.[1]
Others argue that this position is based on a very narrow reading of the
relevant provisions of the GDPR, and that a contextual interpretation shows
that the GDPR does indeed provide for a right to explanation with respect to
automated decisions.[2]
We wholeheartedly agree with the latter interpretation and set out why below.
That being said, we think that all sides are missing the broader context.
Accountability requirement
Providing individuals with upfront information on automated decision-making and the underlying logic, or with an explanation of automated decisions after these are made, is one thing; under the GDPR’s accountability requirement,[3] controllers must also be able to demonstrate compliance with their material obligations under the GDPR, in particular that their processing of personal data meets the requirements of:
- lawfulness, fairness and transparency;
- data accuracy;
- purpose limitation;
- data minimisation and storage limitation;
- automated decision-making (which requires establishing appropriate safeguards); and
- performing a data protection impact assessment.[4]
Applying these requirements to automated decision-making means that controllers must be able to demonstrate that the correlations applied in the algorithm as ‘rules’ for decision-making are meaningful (eg, no over-reliance on correlations without proven causality) and unbiased (not discriminatory), and therefore provide a legitimate justification for the automated decisions about individuals. We recall that transparency to
individuals and the right of individuals to, for example, access their data
primarily serve the purpose of enabling individuals to decide whether to
exercise their (other) rights, such as objecting to profiling,[5]
requesting erasure or rectification of their profile[6]
or ‘contesting’ any automated decisions relating to them.[7]
The accountability principle requires controllers to (subsequently) demonstrate their compliance with their material
GDPR obligations. The question of whether or not the GDPR provides individuals with a right to an explanation of automated decisions relating to them therefore misses the point that, in the end, controllers must be able to show that the correlations applied in the algorithm can legitimately be used as a justification for the automated decisions.
To give a very simple example, an
explanation for the underlying logic of a decision may be that the relevant
individual is from a specific ethnic minority. The individual may then contest
this decision as being discriminatory. To be able to continue such processing, the controller will subsequently have to demonstrate that using this ‘rule’ for the relevant decision does not constitute unlawful discrimination.
Algorithmic accountability
To meet their obligations with regard to automated decision-making, controllers will need to design, develop and apply their algorithms in a transparent, predictable and verifiable manner (an approach Diakopoulos and Friedler have coined ‘algorithmic accountability’). In this sense, ‘The algorithm did it’ is not an acceptable excuse. In the words of Diakopoulos and Friedler: ‘Algorithmic accountability implies an obligation to report and justify algorithmic decision-making and to mitigate any negative social impacts or potential harms.’[8]
These concerns are
not limited to EU law. The US Federal Trade Commission has issued recommendations[9] that
promote similar principles of lawfulness and fairness when using algorithms in decision-making,
and US scholars have addressed the issue that automated decision-making in the
employment context may result in a disparate impact for protected classes,
which may violate US anti-discrimination laws.[10]
Some of these scholars argue that this requires assessing and addressing
potential disparate impact upfront as an anti-discrimination measure.[11]
In any event, for companies to be able to fend off a disparate impact claim, they
must be able to show that the disparate impact is justifiable and not unlawful.
This requires assessing and addressing potential disparate impact upfront, from the start of the development of the automated decision-making system.
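By way of illustration only, the sketch below shows what a first, crude screening for disparate impact could look like in practice, using the ‘four-fifths rule’ that is commonly applied as an initial test in US disparate impact analysis. The applicant data and column names are assumptions made for the example; a real assessment would use the controller’s own outcome data and a more rigorous statistical analysis.

```python
# A minimal sketch of a 'four-fifths rule' screening test for disparate
# impact. The data and column names are illustrative assumptions only.
import pandas as pd

outcomes = pd.DataFrame({
    "group":    ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"],
    "selected": [1,   1,   1,   0,   1,   1,   0,   0,   1,   0],
})

# Selection rate per group, and the ratio of the lowest to the highest rate.
rates = outcomes.groupby("group")["selected"].mean()
impact_ratio = rates.min() / rates.max()

# Under the four-fifths rule, a ratio below 0.8 is a conventional red flag
# that calls for justification or a redesign of the decision rules.
print(rates)
print(f"impact ratio: {impact_ratio:.2f} -> "
      f"{'potential disparate impact' if impact_ratio < 0.8 else 'no red flag'}")
```

Such a screening only flags a potential issue; whether a disparate impact identified in this way is justifiable remains a separate, substantive assessment.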
Similarly, in the EU,
individuals can dispute an automated decision relating to them as being unfair,
eg, because it is discriminatory. If the controller is unable to show that the
correlations applied in the algorithm are meaningful, unbiased and a legitimate
justification for the relevant decision, the data protection
authorities (DPAs) will likely start
an investigation into the decision rules applied by the algorithm.
In the words of the
Norwegian DPA in its report
on artificial intelligence and privacy:[12]
‘An organization must be able to explain and document, and in some
cases, demonstrate, that they process personal data in accordance with the
rules (…) If the DPA suspects that the account given by an
organisation is wrong or contains erroneous information, it can ask the
organisation to verify the details of its routines and assessments, for example
by having the organisation demonstrate how their system processes personal
data. This may be necessary when, for example, there is a suspicion that an
algorithm is using data that the organisation has no basis for processing, or
if there is a suspicion that the algorithm is correlating data that will lead
to a discriminatory result.’
What is the issue: information or explanation?
The right to information
The GDPR (Articles 13(2)(f) and 14(2)(g)) explicitly requires controllers using personal data to make automated decisions to:
1. inform the individuals upfront about the automated decision-making activities; and
2. provide the individuals with meaningful information about the logic involved, the significance of the decision-making and the envisaged consequences for those individuals.
‘Meaningful information about the logic involved’: In its Opinion on
Automated Decision-Making and Profiling, the Article 29 Working Party (WP29) acknowledges that the ‘growth
and complexity of machine-learning can make it challenging to understand how
automated decision-making process or profiling works,’[13]
but that, despite this, ‘the company should find simple ways to
tell the individual about the rationale behind, or the criteria relied on in reaching,
the decision without necessarily always attempting a complex explanation of the
algorithms used or disclosure of the full algorithm.’[14]
We note that for the controller to be able to inform the individual about the criteria relied on for automated decision-making, the controller must know what these criteria are in the first place. In other words, to that extent, the algorithm may not be a ‘black box.’
The right to an explanation
Article 22(3) GDPR
requires a controller to implement suitable safeguards when designing automated
decisions, which should include at least the right to obtain human
intervention, to express his or her point of view and to contest the decision.
Recital 71 GDPR mentions an extra safeguard: the right to an explanation of a
specific automated decision.
The authors who claim
that Article 22 GDPR does not provide the right to an explanation point out
that this right is included only in the GDPR’s preamble and not in its articles.[15]
As confirmed by the European Court of Justice, the preamble indeed has no legally
binding force.[16]
However, the ECJ explained that this does not deprive the preamble of all meaning;
it merely prohibits the use of the preamble to interpret a provision in a manner
clearly contrary to its wording.[17]
Article 22(3) GDPR
specifies the safeguards that must at least be included in the design of automated
decisions. This
wording quite clearly leaves room for adopting other safeguards, such as the
right to an explanation of a specific automated decision mentioned in Recital
71.
This view is
supported by both the WP29 in its Opinion on Automated Decision-Making and
Profiling[18]
and the Norwegian DPA in its report on Artificial Intelligence and Privacy. In
the words of the latter:[19]
‘Regardless of what the differences in
language mean [ie whether Article 22 GDPR provides the right
to an explanation or not],
the controller must provide as much information as necessary in order for the
data subject to exercise his or her rights. This means that the decision must
be explained in such a way that the data subject is able to understand the
result.
The right to an explanation does not
necessarily mean that the black box must be opened, but the explanation has to
enable the data subject to understand why a particular decision was reached, or
what needs to change in order for a different decision to be reached.’
The latter is also known as a ‘counterfactual explanation,’ as described by Wachter, Mittelstadt and Russell.[20] For an individual whose application for a loan has been denied and who wants to know why, a counterfactual explanation could be that the income statements provided by the individual show a yearly income of EUR 50,000, and that the loan would be granted with a yearly income of EUR 60,000 or more.
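To make this concrete, below is a minimal sketch of how such a counterfactual explanation could be generated, assuming, purely for illustration, a single-threshold decision rule; the EUR 60,000 threshold and the function names are our own assumptions and not taken from any real credit-scoring system.

```python
# A minimal sketch of a counterfactual explanation for an assumed
# single-threshold loan rule; the threshold and function names are
# illustrative only and not drawn from any real credit-scoring system.

INCOME_THRESHOLD_EUR = 60_000  # assumed decision rule: grant at or above this income

def decide(yearly_income_eur: float) -> bool:
    """Apply the assumed decision rule."""
    return yearly_income_eur >= INCOME_THRESHOLD_EUR

def explain(yearly_income_eur: float) -> str:
    """State the outcome and what would need to change for a different outcome."""
    if decide(yearly_income_eur):
        return "Loan granted: the yearly income meets the EUR 60,000 threshold."
    return (f"Loan denied: the income statements show a yearly income of "
            f"EUR {yearly_income_eur:,.0f}; the loan would be granted with a "
            f"yearly income of EUR {INCOME_THRESHOLD_EUR:,.0f} or more.")

print(explain(50_000))  # mirrors the example above
```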
Again,
in order for the controller to explain the decision in such a way that the
individual understands the result (and knows what to change to get a different
result), the controller needs to know what the ‘rules’ are that led to the
relevant decision (ie the algorithm may not be a ‘black box’).
Algorithmic accountability requires ‘white-box’ development
Although it is far
from set in stone what ‘white-box’ development would require, there are some
guidelines to take into account when developing algorithms for automated
decision-making (see the information on white-box development below). By
documenting these steps and assessments, the controller will also comply with
the requirement to perform a data protection impact assessment.
In the words of the WP29:[21]
‘Controllers should carry out frequent
assessments on the data sets they process to check for any bias, and develop
ways to address any prejudicial elements, including any over-reliance on
correlations.
Systems that audit algorithms and regular
reviews of the accuracy and relevance of automated decision-making including
profiling are other useful measures. Controllers should introduce appropriate
procedures and measures to prevent errors, inaccuracies or discrimination on
the basis of special category data. These measures should be used on a cyclical
basis; not only at the design stage, but also continuously, as the profiling is
applied to individuals. The outcome of such testing should feed back into the
system design.’
Conclusion: information, explanation or justification?
This article discusses the obligations of controllers with regard to automated decision-making: must they provide information, an explanation or a justification? The answer is: all three. The main underlying rationales of EU data protection laws are preventing
information inequality and information injustice. These rationales can only be
served if controllers cannot hide behind algorithms for automated individual
decision-making. Controllers will be accountable for the outcome and will
therefore have to be able to ultimately justify the criteria based on which
automated decision-making takes place. As indicated at the start, we therefore
think the academic debate on the rights of individuals alone misses the bigger
picture, with the risk that companies do the same.
Algorithmic accountability requires white-box development:
1. A clear, documented design for development at the outset (covering the
elements below).
2. Verification from the outset that the data set applied for the training of the algorithm is:
· Representative (no missing information from particular populations and verification that there are no hidden unlawful biases that are having an unintended impact on certain populations).[22]
· Accurate and up to date (data collected in another context may be up to date but still lead to inaccurate outcomes).[23]
Note that using an existing, unmodified data set is likely to result in unlawful bias, simply because current situations are rarely unbiased, and this existing bias is rarely lawful. For example, using a data set of all primary school teachers in the Netherlands will result in an unlawful bias: because women are overrepresented in the data set, the algorithm will determine that women are better qualified for this job than men. Unlawful bias can be removed from a data set by, eg (see also the sketch after this list):
· Removing data elements that indicate group membership and near proxies thereof. These data elements include direct identifiers of group membership, such as gender, race, religion and sexual orientation. Proxy identifiers may, eg, be neighbourhood (often a proxy for race) or specific job titles (nurse and navy officer).
· Deciding on the target variables before starting to select the training data. The controller needs to decide upfront which variables are considered relevant for the selection at hand. If, for example, personality traits are included in the selection for recruitment purposes, such traits must be important enough to job performance to justify their use.[24] Even if automated feature selection methods are used, the final decision to use or not use the results, as well as the choice of feature selection method and any fine-tuning of its parameters, are choices made by humans. These variables need to be documented and must ‘pass the smell test,’ ie they must be intuitively relevant and important enough to job performance to be used.[25] For example, a correlation between job applicants using browsers that did not come with the computer (like Firefox) and better job performance and retention will likely not be acceptable.[26]
· Adding or modifying elements that result in an unlawful bias. Instead of deleting group membership indicators, they can also be modified.[27] For example, in the group of primary school teachers, the gender of a specific number of teachers can be reversed to remove bias. Alternatively, if a certain minority is underrepresented in the data set, this can be compensated for by oversampling these underrepresented communities.[28]
· Repairing attributes. An example of an attribute that is often biased is the SAT score (research shows that SAT scores are often biased against women due to negative assumptions about the abilities of women, and the resulting stereotyping tends to have a real effect on the outcomes).[29] This can be remedied by splitting the scores achieved by men and by women and dividing each group into quantiles (eg, top 5%). A median score can then be calculated for each quantile and attributed to both the women and the men in that quantile.
3. Reviewing the outcome of the algorithm (and the correlations found) at set stages for unlawful bias and disparate impact and, where present, removing these:
· Justifiable correlations. Not all correlations found by an algorithm are meaningful, nor can they legitimately be used as a justification for outcomes.
4. Considering whether the algorithm can be used in ways that prevent unlawful discrimination. In the recruitment context, for example, CVs can be curated blindly, eg, by eliminating names, gender, school names and geographical information from the CV before selecting a relevant candidate pool.
5. Ensuring auditability of the algorithm.
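By way of illustration only, the sketch below shows, in pandas, what the data-set interventions described in step 2 could look like in practice: repairing a biased attribute with within-group quantile medians, oversampling an under-represented group, and dropping direct group indicators and near proxies before training. All column names, values and parameters are assumptions made for the example and are not prescribed by the GDPR or the guidance cited above.

```python
# An illustrative sketch (not a prescribed method) of three data-set
# interventions described above: repairing a biased attribute via
# within-group quantile medians, oversampling an under-represented group,
# and dropping direct group indicators and near proxies before training.
# All column names, values and parameters are assumptions for the example.
import pandas as pd

df = pd.DataFrame({
    "gender":        ["F", "F", "F", "F", "F", "F", "M", "M"],
    "neighbourhood": ["N1", "N2", "N1", "N3", "N2", "N1", "N3", "N2"],
    "test_score":    [1200, 1150, 1300, 1250, 1100, 1350, 1280, 1230],
    "hired":         [1, 0, 1, 1, 0, 1, 1, 0],
})

# 1. Repair the biased attribute: bucket scores into quantiles within each
#    group, then give every record the median score of its quantile taken
#    across all groups (the score repair described above).
n_quantiles = 2  # a real repair would use far finer buckets (eg top 5%)
df["quantile"] = df.groupby("gender")["test_score"].transform(
    lambda s: pd.qcut(s, n_quantiles, labels=False, duplicates="drop")
)
df["test_score_repaired"] = df.groupby("quantile")["test_score"].transform("median")

# 2. Oversample the under-represented group until both groups are equal in size.
counts = df["gender"].value_counts()
minority = counts.idxmin()
extra = df[df["gender"] == minority].sample(
    n=int(counts.max() - counts.min()), replace=True, random_state=0
)
balanced = pd.concat([df, extra], ignore_index=True)

# 3. Drop direct group indicators and known near proxies before training.
training_data = balanced.drop(columns=["gender", "neighbourhood", "test_score", "quantile"])
print(training_data)
```

Which of these interventions is appropriate in a given case, and whether the resulting data set and model are in fact free of unlawful bias, remains a substantive assessment that the controller must document in order to meet the accountability requirement.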
Lokke Moerel is Senior of Counsel at Morrison
& Foerster in Berlin and Professor of Global ICT Law at Tilburg University.
Marijn Storm is Associate at Morrison &
Foerster in Brussels.
This article was first published on
the Oxford Business Law Blog in its ‘Law and Autonomous System Series’
on April 27, 2018.
[1] S.
Wachter, B. Mittelstadt and L. Floridi, ‘Why a Right to Explanation of Automated
Decision-Making Does Not Exist in the General Data Protection Regulation,’ International
Data Privacy Law 2017, https://ssrn.com/abstract=2903469. See earlier: B.
Goodman and S. Flaxman, ‘European Union regulations on algorithmic
decision-making and a ‘right to explanation’’ (2016), arXiv:1606.08813.
[2] A.
Selbst and J. Powles, ‘Meaningful information and the right to explanation,’ International
Data Privacy Law 2017, https://doi.org/10.1093/idpl/ipx022.
[3] See Articles 5(2) and 22 GDPR.
[4]
See in detail on these requirements in respect of automated decision making
based on profiling: Article 29 Working Party, WP251, Guidelines on Automated
individual decision-making and Profiling for the purposes of Regulation
2016/679 (WP29 Opinion on Automated Decision-making),
http://ec.europa.eu/newsroom/article29/item-detail.cfm?item_id=612053.
[5]
Article 21 GDPR.
[6]
Article 17 GDPR; See WP29 Opinion on Automated Decision-making, p. 23: ‘the company should allow the data subject
the right to (…) in certain circumstances erase the profile or personal data
used to create it.’
[7]
See specifically Article 22(3) GDPR.
[8] N.
Diakopoulos and S. Friedler, ‘How to Hold Algorithms Accountable,’ MIT
Technology Review (2016), https://www.technologyreview.com/s/602933/how-to-hold-algorithms-accountable/.
[9] U.S.
Federal Trade Commission, ‘Big Data: A Tool For Inclusion or Exclusion?’
(2016), https://www.ftc.gov/system/files/documents/reports/big-data-tool-inclusion-or-exclusion-understanding-issues/160106big-data-rpt.pdf.
[10] See,
for example, S. Barocas and A. Selbst, ‘Big Data’s Disparate Impact,’
California Law Review (2016), DOI: http://dx.doi.org/10.15779/Z38BG31/; and I.
Ajunwa, S. Friedler, C. Scheidegger and S. Venkatasubramanian, ‘Hiring by
Algorithm: Predicting and Preventing Disparate Impact’ (2016), http://friedler.net/papers/SSRN-id2746078.pdf.
[11] Ajunwa
et al. 2016, p. 1.
[12] Datatilsynet
– The Norwegian Data Protection Authority, ‘Artificial intelligence and privacy’
(2018), https://www.datatilsynet.no/en/about-privacy/reports-on-specific-subjects/ai-and-privacy/,
p. 23.
[13] WP29 Opinion on Automated Decision-making, p. 25.
[14] Ibid.
[15] Wachter et al. 2017, p. 9.
[16] European Court of Justice, Case C-134/08, Hauptzollamt Bremen v J.E. Tyson Parketthandel GmbH hanse j. [2009], paragraph 16.
[17]
European Court of Justice, Case C-308/97, Manfredi
v Puglia [1998], paragraph 30.
[18] WP29 Opinion on Automated Decision-making, p. 19, fn 32 and p. 27.
[19] Datatilsynet
– The Norwegian Data Protection Authority 2018, pp. 21-22.
[20]
S. Wachter, B. Mittelstadt and C. Russell, ‘Counterfactual Explanations without
Opening the Black Box: Automated Decisions and the GDPR,’ Harvard Journal of
Law & Technology (2018), https://arxiv.org/pdf/1711.00399.
[21] WP29 Opinion on Automated Decision-making, p. 26.
[22] WP29 Opinion on Automated Decision-making, pp. 11-12.
[23] Ibid.,
pp. 17-18.
[24] Barocas
& Selbst 2016, p. 709.
[25] Ajunwa et al. 2016, p. 14.
[26] U.S. Federal Trade Commission 2016, pp. 10-11.
[27] T. Calders and S. Verwer, ‘Three naive Bayes approaches for discrimination-free classification,’ Data Mining and Knowledge Discovery (2010), p. 281, https://link.springer.com/article/10.1007/s10618-010-0190-x.
[28] Barocas & Selbst 2016, p. 718.
[29] Ajunwa et al. 2016, pp. 23-24.