Alex Read, WFD Associate
The debate around catastrophic risk from frontier AI
The UK AI Safety Summit has a focus on risks from frontier AI systems – the potential for increasingly powerful AI systems to cause catastrophic impacts on society.
The Centre for AI Safety categorises catastrophic risk into four areas, with examples:
- Malicious use, in which individuals or groups intentionally use AI systems to cause harm.
  - AI used to create bioweapons.
  - AI used to spread persuasive propaganda.
  - AI enabling censorship and mass surveillance to help concentrate power.
- AI race, in which competitive environments compel actors to deploy unsafe AI systems or cede control to them.
  - Corporate rush to replace human jobs with AI systems.
  - Release of harmful systems, such as for automated warfare.
- Organisational risks, highlighting how human factors and complex systems can increase the chances of catastrophic accidents.
  - Safety culture problems.
  - Leaked AI systems.
  - Insufficient security in AI labs.
- Rogue AIs, describing the inherent difficulty in controlling agents far more intelligent than humans.
  - AI systems pursuing a dangerous proxy goal that does not align with the intended goal.
  - AI systems seeking power and taking control of their environment.
  - AI systems deceiving humans and manipulating public discourse and politics.
What are the chances of catastrophic risk being realised? It is a clear concern among some in the AI industry. AI pioneers Geoffrey Hinton and Yoshua Bengio have argued that the timeline to AGI is shorter than they once envisaged, raising the risk of ‘rogue AI systems’ that develop their own sub-goals, manipulate people, give themselves greater control and even threaten human existence. In 2011, DeepMind’s chief scientist, Shane Legg, described the existential threat posed by AI as the “number one risk for this century, with an engineered biological pathogen coming a close second”. In a recent interview, Anthropic CEO Dario Amodei put the chance of an AI system going “catastrophically wrong on the scale of … human civilisation” at between 10% and 25%.
What might this look like? Even if asked to achieve important and beneficial goals, a sufficiently powerful and autonomous AI could pursue dangerous and damaging objectives that are not aligned with human interests. Computer science professor and author Stuart Russell paints two examples:
Let’s suppose … that we ask some future superintelligent system to pursue the noble goal of finding a cure for cancer—ideally as quickly as possible, because someone dies from cancer every 3.5 seconds. Within hours, the AI system has read the entire biomedical literature and hypothesized millions of potentially effective but previously untested chemical compounds. Within weeks, it has induced multiple tumors of different kinds in every living human being so as to carry out medical trials of these compounds, this being the fastest way to find a cure. Oops.
If you prefer solving environmental problems, you might ask the machine to counter the rapid acidification of the oceans that results from higher carbon dioxide levels. The machine develops a new catalyst that facilitates an incredibly rapid chemical reaction between ocean and atmosphere and restores the oceans’ pH levels. Unfortunately, a quarter of the oxygen in the atmosphere is used up in the process, leaving us to asphyxiate slowly and painfully. Oops.
Is catastrophic risk from frontier AI realistic? There is plenty of pushback from industry figures who argue that existential-type risks are conjecture.
- François Chollet, Google AI researcher: “There does not exist any AI model or technique that could represent an extinction risk for humanity…not even if you extrapolate capabilities far into the future via scaling laws”.
- Yann LeCun, Meta Chief AI Scientist: “until we have a basic design for even dog-level AI (let alone human level), discussing how to make it safe is premature”.
- Joelle Pineau, senior Meta AI leader, has said existential risk discussions are “unhinged” and warned that “when you put an infinite cost, you can’t have any rational discussion about any other outcomes”.
A contrasting argument, proposed by organisations such as the Distributed AI Research Institute, is that catastrophic risk is a distraction from the harms and near-term risks already being realised as AI and automated systems are introduced into society. On this view, we should not fall for AI hype: regulatory efforts should focus on ensuring transparency and accountability and on preventing exploitative labour practices in current AI systems, while AI development should concentrate on building and testing effective systems rather than pursuing AGI.