Alessandro Curioni of IBM

AI is firmly back in the news

OpenAI’s ChatGPT and picture-generating AI systems like Midjourney and Stable Diffusion have got a lot more people interested in advanced AI and talking about it, which is a good thing. It will not be pretty if the transformative changes that will happen in the next two or three decades take most of us by surprise.

One company that has been pioneering advanced AI for longer than most is IBM. One of IBM’s most senior executives, Alessandro Curioni, joined the London Futurists Podcast to discuss IBM’s current projects in AI, quantum computing, and related areas. Alessandro has been with IBM for 25 years. He is an IBM Fellow, Director of IBM Research, and Vice President for Europe and Africa.

IBM’s grand challenges

IBM has been inventing the future of computing for 70 years. It has staged a series of impressive grand challenges, like Deep Blue beating Garry Kasparov at chess in 1997, and Watson beating Ken Jennings at the TV quiz Jeopardy! in 2011. More recently, in 2018, the company developed a machine capable of holding its own in debates with a world debating champion.

The first of these machines were rules-based, and the later ones use deep learning, which creates models trained on large amounts of data. Another paradigm shift is happening now, with the arrival of large language models (LLMs), or foundation models, which use a technique called self-supervision to do the training. The system takes a vast number of sentences – hundreds of billions of them – from the web, randomly masks one word in each sentence, and tries to guess what the word is. Over time the system builds a model of which words go into which sentences. This automation of the training process is a significant advance, and has been made possible by the huge amounts of data and compute power that are available today.
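The masking step described above can be sketched in a few lines of Python. This is only an illustration of how training pairs might be produced from raw text, not how IBM or OpenAI actually prepare their data; the function name `make_masked_examples` and the `[MASK]` token are assumptions made for the example.

```python
import random

def make_masked_examples(sentences, mask_token="[MASK]", seed=0):
    """Turn raw sentences into (masked_sentence, hidden_word) training pairs."""
    rng = random.Random(seed)
    examples = []
    for sentence in sentences:
        words = sentence.split()
        if not words:
            continue  # skip empty lines
        position = rng.randrange(len(words))  # pick one word at random
        target = words[position]
        masked = words[:position] + [mask_token] + words[position + 1:]
        examples.append((" ".join(masked), target))
    return examples

corpus = [
    "Deep Blue played chess against a world champion",
    "Watson answered quiz questions on television",
]
for masked, target in make_masked_examples(corpus):
    # A model would be trained to predict `target` given `masked`.
    print(masked, "->", target)
```

No human has to label anything here: the hidden word is its own label, which is why the process scales to hundreds of billions of sentences.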

It turns out that this methodology is not restricted to text. It can be used on any kind of structured data, including images, video, or computer code. Or data streams generated by industrial processes. Or the language of science: molecules translated into symbols.

Narrower focus

IBM is building large language models, but for particular applications rather than for general-purpose use, as ChatGPT is. For instance, it is building systems that specialise in organic chemistry, and in business. The weakness of general-purpose systems is that they are shallow. They can answer most questions at a high level, but if you go deeper, they get lost. More specialised machines can go deeper, and are less brittle. Being specialised often means you can obtain better quality data, and you can remove bias more easily.

One of the reasons why ChatGPT performs better than GPT-3 is Reinforcement Learning from Human Feedback (RLHF). OpenAI, which created these systems, hired large numbers of people to comment on the system’s output, and label biased or offensive passages accordingly. This does prompt the quip that AI stands not for artificial intelligence, but for affordable Indians, but the humans are being used during the training, not in operation.

IBM hopes to prove that it can develop a large model in a particular domain, which can then be trained on the proprietary data of client organisations within that domain. This would be a major improvement in terms of cost and sustainability over the old approach, which involved developing a new model for each application.

More efficient chip designs

Another field in which IBM is looking to improve the efficiency and sustainability of AI and computing is chip design. Large language models are approaching the scale of computation that goes on inside the human brain, but they use as much energy as a small town, whereas the brain uses as much energy as a lightbulb.

Curioni says that IBM is taking three steps to reduce the power demand of advanced AI systems. The first step is neuromorphic chips, such as IBM’s TrueNorth and Intel’s Loihi, which are modelled more closely on human neurons than traditional chip designs. Their calculations are less precise, and more analogue.

The second step is memristors, where processing and memory storage take place on the same chip, which reduces the energy spent on retrieving and re-storing data in between calculations.

The third step is spiking neural networks, in which each neuron transmits information only when its particular function is required, whereas in conventional neural networks, each neuron transmits information all the time.
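To make the idea of "transmitting only when required" concrete, here is a minimal sketch of a leaky integrate-and-fire neuron, a textbook model of a spiking neuron. It is not a description of IBM's hardware; the function name `simulate_lif` and the threshold and leak values are assumptions chosen for illustration.

```python
def simulate_lif(input_current, threshold=1.0, leak=0.9):
    """Simulate one leaky integrate-and-fire neuron over discrete time steps.

    Returns a list of 0/1 values: 1 where the neuron emitted a spike.
    """
    potential = 0.0
    spikes = []
    for current in input_current:
        potential = leak * potential + current  # decay, then integrate input
        if potential >= threshold:
            spikes.append(1)   # communicate only when the threshold is crossed
            potential = 0.0    # reset after firing
        else:
            spikes.append(0)   # stay silent: nothing is transmitted
    return spikes

inputs = [0.0, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0, 1.5]
spikes = simulate_lif(inputs)
print(spikes)
```

Most time steps produce no output at all, and on suitable hardware a silent neuron costs very little energy. That sparsity is where the efficiency gain is expected to come from.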

Together, these three steps can improve energy efficiency by two to four orders of magnitude.

Breakthrough in quantum computing

IBM may not currently be seen as the global leader in AI, but a field where it is generally acknowledged to be in the front rank is quantum computing, alongside Google and Microsoft. It has just announced a breakthrough in quantum-safe cryptography which will enable data being transmitted today to remain secure, even when quantum computers are built that can break today’s encryption. Quantum computers running Shor’s algorithm can factorise numbers efficiently, and when they scale they will be able to factorise very large numbers, which classical machines cannot do within reasonable amounts of time.
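Why does fast factorisation matter? Widely used public-key schemes such as RSA rely on the difficulty of factorising a large number. The toy sketch below uses a deliberately tiny RSA key and classical trial division (not Shor's algorithm) to show that anyone who can factorise the public modulus can rebuild the private key. The key values and the helper name `factorise` are assumptions made for the example.

```python
from math import isqrt

def factorise(n):
    """Find a factor pair of n by trial division (classical and slow)."""
    for p in range(2, isqrt(n) + 1):
        if n % p == 0:
            return p, n // p
    raise ValueError("n has no nontrivial factors")

# Toy public key: real keys use moduli thousands of bits long.
n = 61 * 53          # public modulus; its two prime factors are secret
e = 17               # public exponent
message = 65
ciphertext = pow(message, e, n)

# An attacker who sees only (n, e, ciphertext) factorises n...
p_found, q_found = factorise(n)
phi = (p_found - 1) * (q_found - 1)
d = pow(e, -1, phi)  # ...and derives the private exponent (Python 3.8+)
recovered = pow(ciphertext, d, n)
print(recovered == message)
```

Trial division takes time that grows exponentially with the number of digits, which is what keeps real keys safe today. Shor's algorithm would remove that barrier on a sufficiently large quantum computer.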

What IBM and a number of academic partners have done is to develop a new type of encryption called quantum safe crypto. It is based on high-dimensional lattice cryptography, and it is believed that it cannot be broken by quantum computers. Over the last decade a large research programme was conducted to assess many potential types of quantum safe crypto, and last July, four algorithms emerged as the strongest. Three of those four were developed in Curioni’s lab in Zurich, and the winner has just been selected.
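To give a flavour of lattice-based encryption, here is a toy single-bit scheme in the style of "learning with errors", a hard problem that underlies much lattice cryptography. It is a classroom sketch under assumed parameters, not one of the selected algorithms, and it is far too small to be secure. The names `keygen`, `encrypt`, `decrypt` and the constants `Q`, `N`, `M` are all assumptions for illustration.

```python
import random

Q = 97   # small modulus (toy value)
N = 8    # dimension of the secret vector
M = 16   # number of noisy samples in the public key

def keygen(rng):
    """Secret vector plus M noisy inner products with random vectors."""
    secret = [rng.randrange(Q) for _ in range(N)]
    public = []
    for _ in range(M):
        a = [rng.randrange(Q) for _ in range(N)]
        noise = rng.choice([-1, 0, 1])  # the small error that hides the secret
        b = (sum(x * s for x, s in zip(a, secret)) + noise) % Q
        public.append((a, b))
    return secret, public

def encrypt(public, bit, rng):
    """Add up a random subset of samples and shift by Q//2 if bit is 1."""
    subset = [sample for sample in public if rng.random() < 0.5]
    a_sum = [sum(a[i] for a, _ in subset) % Q for i in range(N)]
    b_sum = (sum(b for _, b in subset) + bit * (Q // 2)) % Q
    return a_sum, b_sum

def decrypt(secret, ciphertext):
    """Remove the secret's contribution; what remains is noise or noise + Q//2."""
    a_sum, b_sum = ciphertext
    d = (b_sum - sum(x * s for x, s in zip(a_sum, secret))) % Q
    return 1 if Q // 4 < d < 3 * Q // 4 else 0

rng = random.Random(1)
secret, public = keygen(rng)
print([decrypt(secret, encrypt(public, bit, rng)) for bit in (0, 1, 1, 0)])
```

The total noise is at most 16 in absolute value, well under a quarter of `Q`, so decryption always recovers the bit. Without the secret, an attacker has to separate signal from noise in a high-dimensional lattice, a problem for which no efficient quantum algorithm is known.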

The next step is to migrate data from old forms of encryption to this new form. This task is becoming urgent. There was a scare in December 2022 when a team of Chinese researchers announced they had already worked out how to breach today’s encryption technologies. Their paper was nicknamed the “quantum apocalypse” paper. It was quickly realised that they were not all the way there, but it might not be long before someone does achieve it – maybe as soon as two or three years. The US government has directed that all its agencies must be quantum safe by 2025, and other governments and companies are doing the same. IBM’s breakthrough may have come just in time.
