Online science dialogue in Spanish "El hardware de la Inteligencia Artificial: perspectivas de futuro".

Jesús del Álamo, Mario Lanza and Juncal Arbelaiz.

The Rafael del Pino Foundation, the Regional Ministry of Education, Universities, Science and Spokesperson of the Community of Madrid, the Ramón Areces Foundation, the Spanish Language Office, RAICEX and the Scientists' Club of the Rafael del Pino Excellence Scholars Association organised, on 30 November at 6 p.m., a new edition of the Science Dialogues in Spanish, broadcast via www.frdelpino.es/canalfrp

The event took place according to the following programme:

Keynote lecture:

The hardware of Artificial Intelligence: prospects for the future

Jesús del Álamo, Professor of Electrical Engineering at MIT.

He was joined at the round table by:

Mario Lanza, Associate Professor of Materials Science and Engineering at King Abdullah University of Science and Technology.

Moderator: Juncal Arbelaiz, PhD in Applied Mathematics from MIT and Schmidt Science Postdoctoral Fellow at Princeton University. Rafael del Pino Scientists' Club.

Jesús del Álamo is a graduate of the Polytechnic University of Madrid (Telecommunications Engineering, 1980) and Stanford University (MS EE, 1983 and PhD EE, 1985). From 1977 to 1981 he was at the Solar Energy Institute of the Polytechnic University of Madrid, researching silicon photovoltaics. From 1981 to 1985, he did his PhD thesis at Stanford University on the transport of minority carriers in heavily doped silicon. From 1985 to 1988, he was a research engineer at NTT LSI Laboratories in Atsugi, Japan, where he conducted research on III-V heterostructure field-effect transistors. He joined MIT in 1988.

Mario Lanza's research focuses on the integration of two-dimensional (2D) materials into solid-state nano/microelectronic devices and circuits, with special emphasis on memristive crossbar arrays and their use to build artificial neural networks. He makes special efforts to use industry-compatible fabrication methods and to characterise the performance and variability of large numbers of devices in a statistical way. He is one of the world leaders in the study of hexagonal boron nitride (h-BN) as well as many other ultra-thin dielectrics for electronic devices (SiO2, HfO2, Al2O3). He is renowned for his expertise in the nanoelectronic characterisation of multiple materials and devices using scanning probe microscopy (SPM).

Juncal Arbelaiz obtained her bachelor's degree (2014) and master's degree (2016) in Industrial Technologies Engineering from the University of Navarra as a Santander Bank Excellence Scholarship holder. After graduating at the top of her class, she received the Kutxa Banking Foundation's End of Degree Award, the Extraordinary End of Degree Award from the University of Navarra and the National End of Degree Award from the Spanish Government. Her interest in the use of advanced analytical and computational techniques in the field of operations research led her to begin, in September 2016, her PhD in Applied Mathematics at the Massachusetts Institute of Technology (M.I.T.) in Cambridge, USA. Her PhD thesis focuses on the design of algorithms for control, optimisation and decision making in complex, autonomous and large-scale dynamic systems and operations. During her doctoral studies, she received grants from several companies and foundations, including the M.I.T. Presidential Fellowship (sponsored by Akamai Technologies), the La Caixa Foundation Graduate Fellowship and the Rafael del Pino Foundation Excellence Grant. While studying engineering, she quickly realised the enormous potential of advanced analytical methods, algorithms and computation to improve our daily lives. In 2022, she received the prestigious Schmidt Science Fellowship at Princeton University to study biological intelligence and then reproduce it in robots, a postdoctoral fellowship that only 29 people in the world have received.

Summary:

The Rafael del Pino Foundation, the Regional Ministry of Education, Universities, Science and Spokesperson of the Community of Madrid, the Ramón Areces Foundation, the Spanish Language Office, RAICEX and the Scientists' Club of the Rafael del Pino Excellence Scholars Association organised, on 30 November, a new edition of the Diálogos de Ciencia en Español (Science Dialogues in Spanish), dedicated to "El hardware de la inteligencia artificial: perspectivas de futuro" (The hardware of artificial intelligence: future perspectives). The event was attended by Jesús del Álamo, Professor of Electrical Engineering at MIT, and Mario Lanza, Associate Professor of Materials Science and Engineering at King Abdullah University of Science and Technology.

The event began with a talk by Jesús del Álamo, who pointed out that there are four pillars on which the recent explosion of artificial intelligence is based. They are algorithms, the huge databases that have been created, human talent and improved hardware.

Until 2012, the processing power of computers dedicated to artificial intelligence doubled every year. This is the pace of Moore's Law, with which all these technologies are implemented. Since 2012, however, growth has accelerated dramatically, and processing capacity now doubles every two months. What happened in 2012 is that the GPU, or graphics processing unit, was introduced into artificial intelligence computing systems. This point of disruption shows the importance of using hardware designed specifically for artificial intelligence.
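The difference between these two doubling rates can be illustrated with a quick calculation (a hypothetical sketch; the function name and the one-year horizon are ours, not from the talk):

```python
# Compare the two growth regimes mentioned above: doubling every 12 months
# (the Moore's-Law pace before 2012) versus doubling every 2 months
# (the post-2012 pace for AI compute cited in the talk).

def growth_factor(months_elapsed: float, doubling_period_months: float) -> float:
    """Capacity multiplier after a given time, for a given doubling period."""
    return 2 ** (months_elapsed / doubling_period_months)

one_year_moore = growth_factor(12, 12)  # pre-2012 pace
one_year_ai = growth_factor(12, 2)      # post-2012 pace

print(one_year_moore)  # 2.0
print(one_year_ai)     # 64.0
```

Doubling every two months compounds to a 64-fold increase over a single year, against a mere doubling at the pre-2012 pace, which conveys how abrupt the 2012 inflection was.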

If we look in more detail at the last few years, the dollar cost of training these models, which is essentially the cost of the energy they consume, is approaching ten million dollars. This cost is also growing exponentially.

Training GPT-3, a dedicated language-processing model, cost around ten million dollars. AlphaGo cost around one million dollars in 40 days of electricity expenditure, equivalent to the emission of 97 tonnes of CO2 into the atmosphere, the annual consumption of 23 US households, or a thousand hours of air travel.

The problem is that training such complex models requires a processor with very high computing power, and such processors consume a great deal of electrical energy. There is a direct relationship between computing power and energy consumption.

The Cerebras WSE-2 is a processor of enormous size, with 2.6 trillion transistors and 40 GB of on-chip memory. It is an extraordinary chip that is used in supercomputers dedicated to training artificial intelligence models.

The trend is a trajectory towards saturation: to obtain more capable models that can process much more data, we will have to be willing to consume super-linearly increasing amounts of energy. At some point this will become a barrier to further progress, and that is where new innovation will be needed. This is one of the big problems we face today: artificial intelligence can do great things for humans, but we have to be careful with the planet.

What all these systems have in common is that they are based on the same conventional silicon CMOS technology, which limits computing efficiency to around 10^12 operations per watt. What we want is to increase the computing power while decreasing the energy consumed, which requires a real technological revolution. Two paths are being investigated these days to move in that direction. The first is to solve the memory bottleneck; the second is to invent new models of computing.

The bottleneck refers to the fact that, in modern computer systems, the chips responsible for computation and those responsible for memory are different and are mounted on the same board at some distance from each other, because the technologies used to manufacture them are very different. Data therefore has to be transferred from memory to processor and back to memory. In artificial intelligence applications, which use large databases and involve a great deal of data movement, this transfer consumes a lot of energy and takes a lot of time. That is the bottleneck. The solution is to integrate memory and computation very intimately, with very short and very fine connections, so that data can be transferred very quickly and with very little energy expenditure. The difficulty is that the basic technologies behind computation and memory are very different.
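The scale of this bottleneck can be sketched with a toy energy model. The per-operation figures below are illustrative assumptions of ours, not numbers from the talk, chosen only to reflect the commonly cited observation that an off-chip memory access costs orders of magnitude more energy than an arithmetic operation:

```python
# Toy model of the memory bottleneck: energy of arithmetic vs. energy of
# moving operands off-chip. Both figures are assumed, order-of-magnitude
# values for illustration only.

ENERGY_MAC_PJ = 1.0          # one multiply-accumulate, picojoules (assumed)
ENERGY_DRAM_WORD_PJ = 640.0  # fetching one word from off-chip DRAM (assumed)

def step_energy_pj(num_macs: int, words_moved: int) -> float:
    """Total energy in picojoules: compute plus off-chip data movement."""
    return num_macs * ENERGY_MAC_PJ + words_moved * ENERGY_DRAM_WORD_PJ

compute_only = step_energy_pj(1_000, 0)           # ideal: data already on-chip
with_movement = step_energy_pj(1_000, 1_000)      # every operand fetched off-chip

print(with_movement / compute_only)  # 641.0
```

Under these assumptions, fetching every operand from off-chip memory makes the workload some 640 times more energy-hungry than the arithmetic alone, which is why integrating memory and computation intimately pays off so dramatically.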

One possibility being investigated these days to solve the problem is what is known as "embedded non-volatile memory", which would make it possible to place high-density memory in close, fast communication with the computational part. A second emerging line for integrating systems more intimately, with reduced parasitic effects, is what is called "heterogeneous integration of chiplets", which are small chips. Today, because everything sits on a single chip, all functions, including communications, are forced to use the same technology. As computing is the priority function, this forces the adoption of a very advanced technology, and all the other functions end up being implemented with it, even though that very advanced technology is often neither necessary nor desirable for them. The end result is a very expensive chip that takes a long time to develop.

The new concept consists of the integration of small chips, or chiplets, each dedicated to implementing a specific function in a highly optimised way. They are designed and manufactured with the optimal technology for the function each chiplet has to perform, and some are already manufactured and can be chosen from a catalogue. Assembling such a system in a modular way yields a design that gets to market much faster, is more economical and has better energy efficiency. The challenge is to find the technology that allows these chiplets to be connected very intimately, with very dense connections, very few parasitics and very little power.

Chiplet-based products of this kind already exist on the market, e.g. the Apple M1 Ultra, which consists of two M1 Max chips joined together, resulting in twice the computing power. The chip contains a total of 114 billion transistors. The key is the communication between the two chips. The advantage of doing it this way is much higher performance at a much lower cost.

Another example is Ponte Vecchio, Intel's artificial intelligence accelerator, consisting of an assembly of forty-seven chiplets totalling over 100 billion transistors.

This is a truly revolutionary technology that will demand, and therefore offer great opportunities for, innovation in many fields, such as materials, design tools and thermal management.

The other major line of research is new computing paradigms. One of them is the implementation of artificial neural networks in analogue, rather than digital, form. The advantage of the analogue approach is that the basic operations behind matrix multiplication, the key operation of neural networks, are multiplication and addition, which can be carried out by exploiting Ohm's Law and Kirchhoff's Law. That way they can be done very efficiently, with very small devices and much lower power consumption. The problem is that, at the moment, such a device does not exist.
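The idea of computing a matrix-vector product with Ohm's Law and Kirchhoff's Law can be sketched in a few lines; the conductance and voltage values below are made up for illustration:

```python
# Sketch of an analogue matrix-vector multiply on a memristive crossbar.
# Each weight is stored as a conductance G[i][j] (siemens); inputs are
# applied as voltages V[j] (volts). By Ohm's Law each cell contributes a
# current G[i][j] * V[j], and by Kirchhoff's current law the currents on
# each row wire simply add up, so the row currents form the product I = G V.

G = [[1.0e-6, 2.0e-6],   # conductances in siemens (assumed values)
     [3.0e-6, 4.0e-6]]
V = [0.5, 1.0]           # input voltages in volts (assumed values)

def crossbar_mvm(conductances, voltages):
    """Row currents of the crossbar: Ohm's Law per cell, Kirchhoff per row."""
    return [sum(g * v for g, v in zip(row, voltages)) for row in conductances]

I = crossbar_mvm(G, V)
print(I)  # row currents in amperes, encoding the matrix-vector product
```

The multiply and the add happen "for free" in the physics of the circuit, in a single step and in parallel across all rows, which is the source of the speed and energy advantages described above.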

Several lines of research are being pursued to address this problem. One is to intercalate ions in a controlled manner into a metal oxide channel in order to control its conductivity. Another involves creating conductive filaments within a metal oxide matrix. Other lines are trying to exploit the ferroelectric effect, two-dimensional materials or passive linear optics to implement neural networks.

This was followed by a dialogue between Jesús del Álamo and Mario Lanza. First, they talked about neuromorphic computing. According to Jesús del Álamo, it involves mimicking how synapses, the connections between neurons in the brain, work: they can be strengthened or weakened based on the pulses reaching both sides of the synapse. People working on these issues retain some scepticism about whether computation can be done using the synchronism of pulses arriving at these synapses.

Mario Lanza said that there are two sides to this field. One is to try to copy nature, but this is rather complex because not even biologists understand very well how the brain works. The other is more pragmatic: to understand this computation as a set of mathematical algorithms. This is where matrix multiplication comes in, a key operation for performing these algorithms. The advantages will be considerable in terms of speed and lower power consumption. The biggest challenge will be in the development of this type of circuit: not the operation itself, but matching the robustness of the transistor, a device that has withstood miniaturisation, the introduction of new materials and so on. Making these circuits as reliable as the transistor will be the most difficult thing.

When it comes to industry-university collaboration, Jesús del Álamo sees many advantages in it. Engineers not only want to work on something new; they also want it to be relevant, to improve society, to improve the world. That is where industry provides fundamental guidance on which problems are relevant. These relationships have to be thought of internationally, not at the country level; Spain in particular has to think in European terms and be part of the European microelectronics ecosystem, entering into collaboration with other universities and companies.

In Mario Lanza's case, he recalled that many universities are looking for such collaboration, but it is harder to find, especially in the case of modern microchips, which are the most sophisticated products ever made. Scaling up and commercialising something is easier with software than with hardware, where special laboratories are needed. What you try to do at university is what industry will be doing in ten years' time. There is a niche here where an engineer has a real chance to make an impact; that is where the opportunity lies. But it is difficult terrain, because no one can compete with industry head-on, so it is about solving problems that industry cannot yet get to.

On the opportunities for Spain in the field of microelectronics, Jesús del Álamo sees many, but believes that the focus should be on Europe and not Spain, because the microelectronics world has split into four parts: the United States, Europe, China, and Taiwan and Japan. Contributing at the country level is very difficult, but as part of a very rich European ecosystem there are fantastic opportunities to build. When it comes to re-launching microchip manufacturing, one of the big bottlenecks is the number of graduates, of skilled engineers who can push this industry forward; there is a worldwide shortage. Spain has a very strong engineering education and can make a very important contribution by preparing very well-trained engineers who can help push this industry forward across the whole world. Thinking about how to strengthen engineering education in Spain should be a high priority: all classes should be in English, all professors should have international and industry experience, and there should be many collaborative programmes with industry. The other point of view is that Spain is a very attractive country for international talent. Today, we have learned to work online with teams dispersed all over the world, which means it is possible to set up design and research centres there and, with that special attraction that Spain has, bring very interesting activities to the country.

Mario Lanza recalled that, in the United States, semiconductor education is being taken very seriously. There are many talented people whom we have to try to attract to where they are useful. In Spain, there are many young people who do not know what opportunities are out there. The experience of going abroad and progressing there can be very good: moving around opens many doors and lets you progress faster.

The Rafael del Pino Foundation is not responsible for the comments, opinions or statements made by the people who participate in its activities and which are expressed as a result of their inalienable right to freedom of expression and under their sole responsibility. The contents included in the summary of this conference, written for the Rafael del Pino Foundation by Professor Emilio González, are the result of the debates held at the meeting held for this purpose at the Foundation and are the responsibility of the authors.
