OpenAI launched GPT-4 on 14th March, and its capabilities were shocking to people within the AI community and beyond. A week later, the Future of Life Institute (FLI) published an open letter calling on the world’s leading AI labs to pause the development of even larger GPT (generative pre-trained transformer) models until their safety can be ensured. Geoff Hinton went so far as to resign from Google in order to be free to talk about the risks.
Recent episodes of the London Futurists Podcast have presented the arguments for and against this call for a moratorium. Jaan Tallinn, one of the co-founders of FLI, made the case in favour. Pedro Domingos, an eminent AI researcher, and Kenn Cukier, a senior editor at The Economist, made variants of the case against. In the most recent episode, David Wood and I, the podcast co-hosts, summarise the key points and give our own opinions. The following nine propositions and questions are a framework for that summary.
1. AGI is possible, and soon
The arrival of GPT-4 does not prove anything about how near we are to developing artificial general intelligence (AGI), an AI with all the cognitive abilities of an adult human. But it does suggest to many experts and observers that the challenge may be less difficult than previously thought. GPT-4 was trained on a vast corpus of data – most of the internet, apparently – and then fine-tuned with guidance from humans checking its answers to questions. The training took the form of an extended game of “peekaboo”, in which the system hid the next word in a passage from itself and tried to guess it from the words that came before.
The result is an enormously capable prediction machine, which predicts the most likely next word in a sentence. Many people have commented that, to some degree, this appears to be what we do when speaking.
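To make the prediction idea concrete, here is a minimal, purely illustrative sketch in Python: the toy corpus, the bigram counting and the predict_next helper are my own constructions, not anything from OpenAI. A real GPT model learns vastly richer statistics with a transformer network trained on trillions of tokens, but the underlying task, guessing a likely next word from the context, is the same in spirit.

    from collections import Counter, defaultdict

    # A toy corpus standing in for "most of the internet" (illustration only).
    corpus = "the cat sat on the mat and the cat slept on the sofa".split()

    # Count how often each word follows each preceding word (a bigram model).
    following = defaultdict(Counter)
    for prev_word, next_word in zip(corpus, corpus[1:]):
        following[prev_word][next_word] += 1

    def predict_next(word):
        """Return the word most often seen after `word` in the toy corpus."""
        candidates = following.get(word)
        if not candidates:
            return None
        return candidates.most_common(1)[0][0]

    print(predict_next("the"))  # 'cat' (it followed 'the' twice, more than any other word)
    print(predict_next("sat"))  # 'on'

Scale the corpus up by many orders of magnitude, replace the lookup table with a neural network, and add human feedback on the answers, and you have, in rough outline, the recipe described above.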
Opinion among AI researchers is divided about what is required to get us from here to AGI. Some of them think that continuing to scale up deep learning systems (including transformers) will do the trick, while others think that whole new paradigms will be needed. But the improvement from GPT-2 to GPT-3, and then to GPT-4, suggests to many that we are nearer than we previously thought, and that it is high time to start thinking about what happens if and when we get there. The latest median forecast on the forecasting platform Metaculus for the arrival of full AGI is 2032.
2. AGI is an X-risk
It is extremely unlikely that humans possess the greatest possible level of intelligence, so if and when we reach AGI, the machines will push past our level and become superintelligences. This could happen quickly, and we would soon become the second-smartest species on the planet by a significant margin. The current occupants of that position are chimpanzees, and their fate is entirely in our hands.
We don’t know whether consciousness is a by-product of sufficiently complex information processing, so we don’t know whether a superintelligence will be sentient or conscious. We also don’t know what would give rise to agency, or self-motivation. But an AI doesn’t need to be conscious or have agency in order to be an existential risk (an X-risk) for us. It just needs to be significantly smarter than us, and have goals which are problematic for us. This could happen deliberately or by accident.
People like Eliezer Yudkowsky, a co-founder of the original X-risk organisation, now called the Machine Intelligence Research Institute (MIRI), are convinced that sharing the planet with a superintelligence will turn out badly for us. I acknowledge that bad outcomes are entirely possible, but I’m not convinced they are inevitable. If we are neither a threat to a superintelligence, nor a competitor for any important resource, it might well decide that we are interesting, and worth keeping around and helping.
3. Four Cs
The following four scenarios capture the possible outcomes.
• Cease: we stop developing advanced AIs, so the threat from superintelligence never materialises. We also miss out on the enormous potential upsides.
• Control: we figure out a way to set up advanced AIs so that their goals are aligned with ours, and they never decide to alter them. Or we figure out how to control entities much smarter than ourselves. Forever.
• Consent: the superintelligence likes us, and understands us better than we understand ourselves. It allows us to continue living our lives, and even helps us to flourish more than ever.
• Catastrophe: either deliberately or inadvertently, the superintelligence wipes us out. I won’t get into torture porn, but extinction isn’t the worst possible outcome.
4. Pause is possible
I used to think that relinquishment – pausing or stopping the development of advanced AIs – was impossible, because possessing a more powerful AI will increasingly confer success in any competition, and no company or army will be content with continuous failure. But I get the sense that most people outside the AI bubble would impose a moratorium if it were their choice. It isn’t clear that FLI has got quite enough momentum this time round, but maybe the next big product launch will spark a surge of pressure. Given enough media attention, public opinion in the US and Europe could drive politicians to enforce a moratorium, and most of the action in advanced AI is taking place in the US.
5. China catching up is not a risk
One of the most common arguments against the FLI’s call for a moratorium is that it would simply enable China to close the gap between its AIs and those of the USA. In fact, the Chinese Communist Party has a horror of powerful minds appearing in its territory that are outside its control. It also dislikes its citizens having tools which could rapidly spread what it sees as unhelpful ideas. So it has already instructed its tech giants to slow down the development of large language models, especially consumer-oriented ones.
6. Pause or stop?
The FLI letter calls for a pause of at least six months, and when pressed, some advocates admit that six months will not be long enough to achieve provable, permanent AI alignment or control. Worthwhile things could be achieved in that time, such as a large increase in the resources dedicated to AI alignment, and perhaps a consensus about how to regulate the development of advanced AI. But the most likely outcome of a six-month pause is an indefinite one: a pause long enough to make real progress towards permanent, provable alignment. It could take years, or decades, to determine whether that is even possible.
7. Is AI Safety achievable?
I’m reluctant to admit it, but I am sceptical about the feasibility of the AI alignment project. There is a fundamental difficulty with the attempt by one entity to control the behaviour of another entity which is much smarter. Even if a superintelligence is not conscious and has no agency, it will have goals, and it will require resources to fulfil those goals. This could bring it into conflict with us, and if it is, say, a thousand times smarter than us, then the chances of us prevailing are slim.
There are probably a few hundred people working on the problem now, and the call for a pause may help increase this number substantially. That is to be welcomed: human ingenuity can achieve surprising results.
8. Bad actors
In a world where the US and Chinese governments were obliging their companies and academics to adhere to a moratorium, it would still be possible for other actors to flout it. It is hard to imagine President Putin observing it, for instance, or Kim Jong Un. There are organised crime networks with enormous resources, and there are also billionaires. Probably none of these people or organisations could close the gap between today’s AI and AGI at the moment, but as Moore’s Law (or something like it) continues, their job will become easier. AI safety researchers talk about the “overhang” problem: a future time when the compute power available in the world is sufficient to create AGI, and the techniques are available, but nobody realises it for a while. The idea of superintelligence arriving in the world under the control of bad actors is terrifying.
9. Tragic loss of upsides
DeepMind, one of the leading AI labs, has a two-step mission statement: step one is to solve intelligence – i.e., to create a superintelligence; step two is to use that to solve every other problem we have, including war, poverty, and even death. Intelligence is humanity’s superpower, even if the way we deploy it is often perverse. If we could greatly multiply the intelligence available to us, there would perhaps be no limit to what we could achieve. To forgo this in order to mitigate a risk – however real and grave that risk – would be tragic if the mitigation turned out to be impossible anyway.
Optimism and pessimism
Nick Bostrom, another leader of the X-risk community, points out that both optimism and pessimism are forms of bias, and therefore, strictly speaking, to be avoided. But optimism is both more fun and more productive than pessimism, and both David and I are optimists. David thinks that AI safety may be achievable, at least to some degree. I fear that it is not, but I am hopeful that Consent is the most likely outcome.