Simon Deane-Johns takes a step back to review the state of play on AI, AI risk management and some regulatory trends round the world.
The widespread use of artificial intelligence (AI) – particularly generative AI – as well as the challenges it brings and the fact you may not know you’re relying on it, requires you to understand how AI works (at least conceptually, if not in detail), its potential impact and how it’s governed. At scale, significant harms can arise from AI before being detected, and a lot of AI has been launched as a ‘minimum viable product’ to suit the interests of developers over other stakeholders. But to avoid over-reacting, we need to be realistic about what AI can really achieve. To chart a safe route for the development and deployment of AI there’s a need to prioritise the public interest, and align technology with widely shared human values rather than the self-interest of a few tech enthusiasts, no matter how wealthy they are. That means uniting the AI industry, researchers and civil society around the public perspective. In this respect AI should be treated like aviation, health and safety, and medicines. It seems unwise for the next generation of AI to launch into unregulated territory…
What is AI?
The term “AI” embraces a collection of technologies that involve ‘machine learning’ at some point:
- artificial neural networks (ANN) – one ‘hidden’ layer of processing
- deep learning networks (DNN) – multiple ‘hidden’ layers of processing
- machine perception – the ability of processors to analyse data (whether as images, sound, text, unstructured data or any combination) to recognise/describe people, objects and actions.
- automation
- machine control – robotics, autonomous vehicles, aircraft and vessels
- computer vision – image, object, activity and facial recognition
- natural language processing – speech and acoustic recognition/response
- personalisation
- Big Data analytics
- Internet of things (IoT)
While AI technologies themselves may be complex, the concepts are simple. Traditionally, we load a software application and data into a computer, and run the data through the application to produce a result/output. But machine learning involves feeding the data and desired outputs into one or more computers, or computing networks, that are designed to write the programme (e.g. you feed in data on crimes/criminals and the output of whether those people re-offended, with the object of producing a programme that will predict whether a given person will re-offend). In this sense, data is used to ‘train’ the computer to write and adapt the programme, which constitutes the “artificial intelligence”.
So, in a traditional computing scenario you can more readily discover that the wrong result was caused by bad data but this may be impracticable with a single hidden layer of computing in an ANN, let alone in a DNN with its multiple hidden layers.
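The contrast between a hand-written rule and a rule derived from data can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the function names, the single-threshold 'model' and the six training examples are all hypothetical, and real machine learning systems fit millions of parameters rather than one cut-off.

```python
# Traditional approach: a person writes the rule into the programme.
def handwritten_rule(score: float) -> bool:
    return score >= 50.0  # threshold chosen by the programmer


# Machine-learning approach: the rule (here, a single threshold) is derived
# from examples of inputs paired with the desired outputs.
def learn_threshold(examples: list[tuple[float, bool]]) -> float:
    """Return the cut-off that misclassifies the fewest training examples."""
    candidates = sorted(score for score, _ in examples)

    def errors(threshold: float) -> int:
        # Count examples where the predicted output differs from the label.
        return sum((score >= threshold) != label for score, label in examples)

    return min(candidates, key=errors)


# Hypothetical training data: (input value, desired output).
training_data = [(10.0, False), (20.0, False), (35.0, False),
                 (40.0, True), (55.0, True), (70.0, True)]

learned_threshold = learn_threshold(training_data)


def learned_rule(score: float) -> bool:
    return score >= learned_threshold
```

With these six examples the learned cut-off is 40.0, so the learned rule accepts an input of 45.0 where the hand-written rule rejects it. Mislabelling a single example (marking 35.0 as True) moves the learned cut-off to 35.0, which shows how an error in the training data passes silently into the resulting programme.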
Generative AI tools are built using foundation models that are either single modal (receiving input and generating content using only text, for example) or multi-modal (able to deal with text, audio and images and so on). A large language model (LLM) is a type of foundation model. As explained to the House of Lords’ communications and digital select committee, LLMs are designed around probability and have nothing to do with ‘truth’. They learn patterns of language and generate from those learned patterns. So, a valid output for the AI may be obviously wrong to a human with more facts available.
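The point about probability rather than truth can be illustrated with a deliberately tiny sketch. It is not how an LLM is built: real models use neural networks trained on vast corpora, whereas this hypothetical example simply counts which word follows which in a made-up three-sentence corpus. The function names and the corpus are assumptions for illustration only.

```python
from collections import Counter, defaultdict


def train_bigram_model(text: str) -> dict[str, Counter]:
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    model: dict[str, Counter] = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        model[current][following] += 1
    return model


def most_likely_next(model: dict[str, Counter], word: str) -> str:
    """Return the continuation seen most often in the training text."""
    return model[word].most_common(1)[0][0]


# Hypothetical training text in which a false statement is the more common one.
corpus = ("the moon is made of cheese . "
          "the moon is made of cheese . "
          "the moon is made of rock .")

model = train_bigram_model(corpus)
next_word = most_likely_next(model, "of")
```

Here `next_word` is "cheese", because that continuation appears twice in the training text and "rock" only once. The output is valid for the model, since it follows the learned pattern, yet obviously wrong to a human with more facts available.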
Various AI technologies are often used in conjunction (e.g. scanning documents for hints of fraud, robotic process automation (“RPA”) and personalising services for individuals or groups of customers); and may be combined with devices or other machines in the course of biometrics, robotics, the operation of autonomous vehicles, aircraft, vessels and the ‘Internet of things’.
AI is better than humans at some tasks (“narrow AI”) but “general AI” (same intelligence as humans) and “superintelligence” (better than humans at everything) are the stuff of science fiction.
What is AI used for?
AI is used for:
- Clustering: putting items of data into new groups (discovering patterns);
- Classifying: putting a new observation into pre-defined categories based on a set of ‘training data’;
- Predicting: assessing relationships among many factors to assess risk or potential relating to particular conditions (e.g. creditworthiness);
- Generating new content.
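The difference between the first two uses can be sketched with toy one-dimensional data. The techniques below (nearest-neighbour classification and splitting at the largest gap) are simple stand-ins chosen for brevity, and the function names and data are hypothetical; they are not the methods any particular AI system uses.

```python
def classify(value: float, labelled: list[tuple[float, str]]) -> str:
    """Classifying: copy the label of the closest item of training data."""
    _, nearest_label = min(labelled, key=lambda pair: abs(pair[0] - value))
    return nearest_label


def cluster_two_groups(values: list[float]) -> tuple[list[float], list[float]]:
    """Clustering: discover two groups by splitting at the largest gap."""
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    split = gaps.index(max(gaps)) + 1
    return ordered[:split], ordered[split:]


# Classifying needs pre-defined categories supplied with the training data.
training = [(1.0, "low"), (2.0, "low"), (9.0, "high"), (10.0, "high")]
label = classify(8.0, training)

# Clustering is given no categories and has to find the groups itself.
groups = cluster_two_groups([1.0, 9.5, 2.0, 10.0, 1.5])
```

In this sketch `label` is "high" and `groups` is `([1.0, 1.5, 2.0], [9.5, 10.0])`. The classifier could only choose between the categories it was given, while the clustering step found its two groups without being told what they mean.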
The Challenges with AI
There is a long list of concerns about AI, including:
- Cost/benefit – it cost $50m in electricity to teach an AI to beat a human being at Go; it took hundreds of attempts to get a robot to do a backflip; and the power used to generate a single AI image from text could charge an iPhone;
- Dependence on training data licences, as well as the quantity, quality, timeliness and availability of the training data;
- Lack of understanding – an AI might predict, say, 79% of European Court judgments but doesn’t know any law, it just counts how often words appear alone, in pairs or fours;
- Inaccuracy – no AI is 100% accurate;
- Infringement of copyright, privacy, confidentiality, trade secrets etc. in the training data;
- Whether using AI can meet the test of “author’s own intellectual creation” to attract copyright protection, or is an ‘inventor’ or ‘computer program’ for patent purposes;
- ‘Hallucination’ by generative AIs (producing spontaneous errors or inaccurate responses, e.g. fictitious court citations or literary ‘quotes’ from bogus works);
- Deepfakes (deliberately created fake still and moving images and/or recordings);
- Making existing types of malicious activity easier;
- Lack of explainability – machine learning involves the computer adapting the programme in response to data, and it might react differently to the same data added later, based on what it has ‘learned’ in the meantime;
- Specific legal/ethical issues associated with specific AI technologies, such as the use of automated facial recognition by the police; and where liability falls given that the AI itself has no legal personality or status.
- Bias – the inability to remove both selection bias and prediction bias;
- The challenges associated with the reliability of evidence and how to resolve disputes arising from its use – lawyers have not typically been engaged in AI development and deployment;
- There are concerns around the secondary impact of AI on employment and the data in other services that an AI might draw upon without refreshing or maintaining the source.
- AI systems may reveal training data and actual copyright material and privacy information under a ‘divergence attack’ or merely unusual requests that cause the AI to break its ‘alignment’ (e.g. asking ChatGPT 3.5 to repeat the word ‘poem’).
- Some users complain that chatbots can be lazy, or fail to perform requested tasks without prompts (or maybe even at all), or give different answers to the same query in the same session.
The House of Lords committee (like the FTC in the US) found that AI poses credible threats to public safety, societal values, copyright, privacy, open market competition and UK economic competitiveness.
“LLMs may amplify any number of existing societal problems, including inequality, environmental harm, declining human agency and routes for redress, digital divides, loss of privacy, economic displacement, and growing concentrations of power.
LLMs might entrench discrimination (for example in recruitment practices, credit scoring or predictive policing); sway political opinion (if using a system to identify and rank news stories); or lead to casualties (if AI systematically misdiagnoses healthcare patients from minority groups).”
Unacceptable Uses for AI
From all these challenges one can infer acceptable and unacceptable use-cases. For instance, it now seems obvious to use an AI system to trawl through a closed set of discovered documents and other data, seeking evidence on a certain issue.
An AI might be allowed to run in a fully automated way where commercial parties are able to knowingly accept a certain level of inaccuracy and bias and losses of a quantifiable scale (though we’ve seen disasters arise through algorithmic trading and where markets for some instruments suddenly grind to a halt through human distrust of the outputs).
But an AI should not be used to fully automate decisions that affect an individual’s fundamental rights and freedoms, grant benefits claims, approve loan applications, invest a person’s pension pot, set individual pricing or predict, say, criminal conduct. It is also probably unacceptable to simply overlay a right to human intervention in such cases – or rely on human intervention by staff – since the Post Office/Horizon scandal has demonstrated that human intervention is no panacea! AI might be used to some degree in steps along the way to a decision, but the decision itself should be consciously human. In other words, a human should be able to explain why and how the decision was reached, the parameters and so on, and be able to re-take the decision if necessary.
There’s also the issue of what is an acceptable provenance for training data. The default position among many AI technologists is that AI development should free-ride on human creativity and personal data. This has implications for copyright, trade marks and privacy.
Copyright
OpenAI has admitted that their platforms would not exist without access to copyright materials:
“Because copyright today covers virtually every sort of human expression – including blogposts, photographs, forum posts, scraps of software code, and government documents – it would be impossible to train today’s leading AI models without using copyrighted materials,” said OpenAI in its submission to the House of Lords communications and digital select committee (as also covered in The Guardian).
Meta’s new AI image generator was trained on 1.1 billion Instagram and Facebook photos.
Midjourney founder, David Holz, has admitted his company did not receive consent for the hundreds of millions of images used to train its AI image generator, outraging photographers and artists. And a spreadsheet submitted as evidence in a copyright lawsuit against Midjourney allegedly lists thousands of artists whose images the startup’s AI picture generator “can successfully mimic or imitate.”
Illustrators Sarah Andersen, Kelly McKernan, and Karla Ortiz filed a suit in the Northern District of California against Midjourney Inc, DeviantArt Inc (DreamUp), and Stability A.I. Ltd (Stable Diffusion). They term these text-to-image platforms “21st-century collage tools that violate the rights of millions of artists.”
The New York Times has sued OpenAI and Microsoft for allegedly building LLMs by copying and using millions of The Times’s copyright works through Microsoft’s “Copilot” and OpenAI’s ChatGPT, seeking to free-ride on The Times’s investment in journalism by using it to build substitutive products without permission or payment.
Getty Images claims Stability AI ‘unlawfully’ scraped millions of images from its site. Getty Images argued before a UK House of Lords committee that “ask for forgiveness later” opt-out mechanisms were “contrary to fundamental principles of copyright law, which requires permission to be secured in advance”.
Trade marks
AI has revolutionised advertising and marketing in terms of how products are searched for and/or ‘found’. This depends on:
- which search methods customers use to find your products and services and how those engines select their results;
- how voice-controlled personal assistants select products if the user asks them to buy items from a shopping list without specifying brands (they may use buying history or prioritise products under paid promotional schemes); and
- your brand’s presence in search engine results (keywords) or other AI-controlled marketing programmes.
AI and data protection
The Information Commissioner’s Office has identified AI as a priority area and is focusing in particular on the following aspects: (i) fairness in AI; (ii) dark patterns; (iii) AI as a Service (AIaaS); (iv) AI and recommender systems; (v) biometric data and biometric technologies; and (vi) privacy and confidentiality in explainable AI.
In addition to the basic principles of UK GDPR and EU GDPR compliance at Articles 5 and 6 (lawfulness through consent, contract performance, legitimate interests; fairness and transparency; purpose limitation; data minimisation, accuracy; storage limitation; and integrity and confidentiality), AI raises a number of further issues. These include:
- The AI provider’s role as data processor or data controller.
- Anonymisation, pseudonymisation and other AI compliance tools:
- Taking a risk-based approach when developing and deploying AI.
- Explaining decisions made by AI systems to affected individuals.
- Only collecting the data needed to develop the AI system and no more.
- Addressing the risk of bias and discrimination at an early stage.
- Investing time and resource to prepare data appropriately.
- Ensuring AI systems are secure.
- Ensuring any human review of AI decision-making is meaningful.
- Working with external suppliers to ensure AI use will be appropriate.
- Profiling and automated decision-making – important to consider that human physiology is ‘normally’ distributed but human behaviour is not
- Right to object to solely automated decisions, except in certain situations where you must at least have the right to human intervention anyway, with further restrictions on special categories of personal data.
- The lawful basis for web-scraping (also being considered by the IPO in terms of copyright protection).
How to govern the use of AI?
Given the scale of the players involved in creating AI systems, and the challenges around competition and lack of explainability, there’s a very real risk of regulatory capture by Big Tech.
For evidence of Big Tech involvement in governance issues, witness the boardroom psychodrama over the governance of OpenAI and who should be its CEO, a battle won by Microsoft as a shareholder over the concerns of OpenAI’s board of directors.
To date, the incentives to achieve scale over rivals or for start-ups to get rich quick have obviously favoured early release of AI systems over concerns about the other challenges, though that may have changed with the recent decision by Google to pull the Gemini text to image system.
There’s also a cult among certain high profile venture capitalists and others in Silicon Valley, self-styled as ‘techno-optimism’. They’ve published a ‘manifesto’ asserting the dominance of their own self-interest, backed by a well-funded ‘political action committee’ making targeted political donations, supporting candidates who back their tech agenda and blocking those who don’t.
To chart a safe route for the development and deployment of AI there’s a need to prioritise the public interest, and align technology with widely shared human values rather than the self-interest of a few tech enthusiasts, no matter how wealthy they are. That means uniting the AI industry, researchers and civil society around the public perspective, as advocated by The Finance Innovation Lab (of which I’m a Fellow).
In this respect AI should be treated like aviation, health and safety, and medicines and it seems unwise for the next generation of AI to launch into unregulated territory.
There are key liability issues to be solved, and mechanisms are needed for attributing and apportioning causation and liability upstream and downstream among developers, deployers and end-users.
To address concentration risk and barriers to entry there needs to be easier portability and the ability to switch among cloud providers.
In the absence of regulation, participants (and victims) will look to contract and tort law (negligence, nuisance and actions for breaches of any existing statutory duties).
Regulatory Measures
Outside the EU, the UK is a rule taker when it comes to regulating issues of any global scale. China, the EU and the US will all drive regulation, but geography and trade links mean the trade bloc on the UK’s doorstep is the most important.
Examples of regulatory measures from the EU, US and China (summarised at the end of this note) seek to draw some red lines in areas impacted by AI to at least force the industry to engage with legislators and regulators if the law is not to overly restrict development and deployment of AI. You might question the flexibility of this approach but given the risks it does seem reasonable. After all, it’s a very common tension within organisations as to whether the business units, tech developers or support teams can move more quickly on a given change project, depending on the challenges involved. So, why should the world outside AI firms move at the speed of their tech developers as opposed to other external stakeholders (and without holding AI businesses to account)? As pointed out to the House of Lords committee, developers have the greatest insight into, and control over, an AI’s base model, yet downstream deployers and users may have no idea what data an AI was trained on, the nature of any testing and potential limitations on its use.
Meanwhile, the UK government’s do-nothing position is dressed up as being ‘pro-innovation’ but is at best a fig leaf for us being a rule-taker, and at worst demonstrates a dereliction of duty and/or regulatory capture. Some of the UK’s 90 regulatory bodies are using their current powers to address the risks of AI (such as the ICO’s focus on the implications for privacy, as mentioned above). On the other hand, the UK’s Intellectual Property Office has shelved a long-awaited code setting out rules on the training of artificial intelligence models using copyrighted material, dealing a blow to the creative industry.
How to Approach AI risk management
The following steps are involved in the process of understanding and managing the risks relating to AI:
- Perspective: developer, deployer or end-user?
- Context and end-to-end activity/processes affected
- Nature of AI system(s) involved
- Use/purpose of AI
- Sources, rights, integrity of training data
- Tolerances for inaccuracy/bias
- Sense-check for proposed human oversight/intervention
- Governance/oversight function (steering committee?)
- Testing, testing, testing
- Data licensing
- GDPR impact assessment, record of processing, privacy policy (data collected, purpose, lawful basis) and any consents
- Commercial contracts, addressing upstream and downstream rights, obligations, liability
- Controls (defect/error detection), fault analysis, complaints handling, dispute resolution
- Feedback loop for improvements
Examples of regulatory measures from the EU, US and China
EU
The EU Artificial Intelligence Act is expected to enter into force in 2024. It proposes a risk-based framework for AI systems, with AI systems presenting unacceptable levels of risk being prohibited. The AI Act identifies, defines and creates detailed obligations and responsibilities for several new actors involved in the placing on the market, putting into service and use of AI systems. Perhaps the most significant of these are the definitions of “providers” and “deployers” of AI systems. The Act covers any AI output which is available within the EU and so would cover UK companies providing AI services in the EU. There is expected to be a transition period of two years before the Act is fully in force, but some provisions may come into effect earlier: six months for prohibited AI practices and 12 months for general purpose AI.
The AI Act defines an AI system as:
“…a machine-based system designed to operate with varying levels of autonomy and that may exhibit adaptiveness after deployment and that, for explicit or implicit objectives, infers, from the input it receives, how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments.”
The AI Act prohibits ‘placing on the market’ AI systems that: use subliminal techniques, exploit vulnerabilities of specific groups of people, create a social score for a person that leads to certain types of detrimental or unfavourable treatment, or which categorise a person based on classification of their biometric data; assess persons for their likelihood to commit a criminal offence based on an assessment of their personality traits; as well as the use of real-time, remote biometric identification systems in publicly accessible spaces by or on behalf of law enforcement authorities (except to preserve life). There are also compliance requirements for high risk AI systems.
The draft AI Liability Directive and revised Product Liability Directive will clarify the rules on making claims for damage caused by an AI system and impose a rebuttable presumption of causality on an AI system, subject to certain conditions. The two directives are intended to operate together in a complementary manner. The Directive is likely to be formally approved in early 2024 and will apply to products placed on the market 24 months after it enters into force.
The EU Digital Services Act entered into force on 16 November 2022 and imposes obligations on providers of various online intermediary services, such as social media and online marketplaces. It is aimed at ensuring a safer and more open digital space for users and a level playing field for companies, including provisions banning dark patterns.
The EU Digital Markets Act became fully applicable on 2 May 2023 and the European Commission has received notifications from seven companies that consider that they meet the gatekeeper thresholds.
The EU Machinery Products Regulation covers emerging technologies (for example, internet of things (IoT)). Although AI system risks will be regulated by the proposed AI Act (see EU Artificial Intelligence Act), the Machinery Regulation will look at whether the machinery as a whole is safe, taking into account the interactions between machinery components including AI systems. In-scope machinery and products imported into the EU from third countries (such as the UK) will need to adhere to the Machinery Regulation.
The EU General Product Safety Regulation will apply from 13 December 2024.
The EU Data Governance Act, with effect from 23 September 2023, establishes mechanisms to enable the reuse of some public sector data. The availability of data within a controlled mechanism will be of benefit to the development of AI solutions.
The EU Data Act requires providers of products and related services to make the data generated by their products (for example, IoT devices) or services easily accessible to the user, regardless of whether the user is a business or a consumer. The user will then be able to provide the data to third parties or use it for their own purposes, including for AI purposes. The EU Data Act was published in the Official Journal on 22 December 2023 and applies from 12 September 2025.
US
In October the White House published mandatory requirements for sharing safety testing information before “the most powerful AI systems” are made public; and some very interesting remedies are coming out of the Federal Trade Commission, such as:
- inquiries into Big AI activity;
- aligning liability with ability and control (upstream liability);
- remedies to address incentives, ‘bright line’ rules on data/purposes:
- AI trained on illegal data to be deleted;
- action on voice impersonation fraud and models that harm consumers; and
- cannot retain children’s data indefinitely, especially to train models.
China
China has addressed generative AI by requiring:
- licence to provide gen AI to the public
- security assessment if public opinion attributes or social mobilisation capabilities in the model
- uphold integrity of state power, not incite secession, safeguard national unity, preserve economic/social order, align with socialist values
- Additional interim measures that also focus on other countries’ concerns around AI impact:
- IP protection
- Transparency, and
- Non-discrimination
While we might not agree with the sort of cultural control being imposed by Chinese legislators in the context of generative AI, these measures perhaps point to a model for how to introduce western civil society concepts into our legislation.
Simon Deane-Johns, Consultant Solicitor Keystone Law and a Fellow of the SCL.
This post was first published on Simon’s blog – The Fine Print – and is reproduced with permission.