Original interview with Professor Lu Zhiwu of Renmin University of China: AI's next breakthrough may come from “Wenlan”


“I propose to consider the question, ‘Can machines think?’”


In the autumn of 1950, Alan Turing, later hailed as the “father of artificial intelligence,” opened his paper “Computing Machinery and Intelligence” with this seemingly whimsical question. In the same paper, Turing proposed a concept that predates even the term “artificial intelligence (AI)”: the Turing test. It opened the prelude to humanity's long and arduous exploration of AI.

Time flies. Over the past 70 years, AI has gone through three waves of development and is quietly entering ordinary people's daily lives in many forms. Thanks to the steady maturing of AI technology, applications such as facial recognition, assisted driving and intelligent medical imaging are gradually becoming a routine part of human society. Behind this lies sustained investment in AI research and development by academia, industry and national governments. AI R&D around the world has gradually taken on the character of a “race.” Yet no country has so far managed to truly pass the “Turing Test.”

On June 1, at the 2021 Beijing Zhiyuan Conference, the ultra-large-scale intelligent model “Enlightenment 2.0” (Wu Dao 2.0) was officially released. With 1.75 trillion parameters, it set the record for the world's largest pre-trained language model and demonstrated China's technical strength in AI to the world. “Enlightenment 2.0” is reported to consist of four pre-trained models: Wenyuan, Wenlan, Wenhui, and Wenshuo.

Among these, “Wenlan,” which excels at semantic understanding and visual-language retrieval, aroused great interest at Meike.com. Wenlan's understanding of semantic information is reported to have reached a world-leading level, which can be regarded as a breakthrough in the global AI field. Its capabilities are highly scalable and can be applied in a variety of scenarios. Through the research on Wenlan, humanity has moved one step closer to the seemingly out-of-reach “Turing Test.” The Wenlan R&D team is led by Professor Wen Jirong, Executive Dean of the Gaoling School of Artificial Intelligence at Renmin University of China, and works closely with the Beijing Zhiyuan Artificial Intelligence Research Institute.

Picture | Professor Lu Zhiwu

After considerable effort, we had the honor of interviewing the head of the model group of the Wenlan R&D team, Professor Lu Zhiwu of the Gaoling School of Artificial Intelligence, Renmin University of China, and talked with him about the future of AI and the story behind Wenlan.

AI development is gradually hitting a bottleneck, and multi-modal pre-trained models led by Wenlan may become the key to breaking the deadlock

As we all know, the ultimate goal of artificial intelligence is to give machines the same ability to understand and think as humans. But more than 70 years on, that goal is still a long way off.

In the academic community's view, although many AI technologies have a positive impact on human life, the general trend suggests that AI research and development is approaching a “bottleneck.” Both academia and industry need to find a new technical “hot spot” to spur the whole AI industry to keep developing by leaps and bounds.

It was against this background that “Wenlan” was born.

Professor Lu Zhiwu told reporters: “In the end, any AI model is actually a neural network. In the past, the industry usually trained AI on a single modality, either pure text or pure images. But it now seems that this approach is not particularly effective.”

As research progressed, the Wenlan team began to focus on pre-training with image-text pairs, hoping to tap new potential in AI. Before this, however, there had been no successful case in this direction.

To obtain better results, the training data was scaled up from 30 million unlabeled image-text pairs for Wenlan 1.0 to 650 million for Wenlan 2.0. Such a huge amount of data makes model training very difficult, but it also lays the foundation for Wenlan's strong visual-language retrieval ability and a degree of common-sense understanding.

In terms of training methods, the Wenlan R&D team adopted an efficient distributed multi-modal pre-training framework and proposed a DeepSpeed-based multi-modal pre-training algorithm that makes maximal use of GPUs and CPUs and optimally supports cross-modal contrastive learning.
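
To make “cross-modal contrastive learning” concrete, here is a minimal sketch in Python with NumPy. It is not Wenlan's training code: the symmetric InfoNCE-style loss, the function names and the temperature value of 0.07 are assumptions chosen purely for illustration. The idea is that an image encoder and a text encoder each produce an embedding, and the loss pulls matching image-text pairs together while pushing apart the mismatched pairs in the same batch.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss (assumed form, for illustration only).

    Row i of image_emb and row i of text_emb are treated as a matching pair;
    every other row in the batch serves as a negative example.
    """
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    logits = img @ txt.T / temperature          # pairwise similarity matrix
    idx = np.arange(logits.shape[0])
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()  # image -> text
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()  # text -> image
    return 0.5 * (loss_i2t + loss_t2i)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(8, 16))              # stand-in embeddings
    print("aligned pairs :", contrastive_loss(emb, emb))
    print("shuffled pairs:", contrastive_loss(emb, np.roll(emb, 1, axis=0)))
```

When the pairs line up, the loss is low; shuffling the texts so that the pairs no longer match drives it up, which is the signal the two encoders would be trained on.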

At present, Google and OpenAI, the top foreign AI R&D institutions, are also pursuing the same research direction as the Wenlan team, with projects named Google ALIGN and OpenAI CLIP respectively. In comparison, Wenlan is clearly better; it can be said that Wenlan has reached the top level in the world in image-text cross-retrieval and semantic understanding.

So where can Wenlan be applied? Professor Lu Zhiwu told reporters that the current Wenlan is like a “brain”: highly adaptable and usable in many scenarios. Take its strong “retrieval and recommendation” capability as an example: Wenlan can be readily applied to common business scenarios across sub-industries such as e-commerce, games and video.

Professor Lu Zhiwu said: “If the AI we knew in the past was only a child, Wenlan is now getting closer and closer to an adult.”

Exploring AI's “subconscious”: a brighter dawn for the “Turing Test”

Wenlan's ability is beyond doubt. But for the Wenlan development team, a very tempting question remained: after training on massive image-text data, has Wenlan really learned semantic information, and how strong is its understanding?

To find out, Wenlan's R&D team decided to test Wenlan by means of “neuron visualization.” You can think of this simply as a “painting on a given theme” test: we give Wenlan a meaningful sentence and let it report its understanding of the sentence in the form of a picture.

Note, however, that the picture produced here is by no means the best match retrieved from Wenlan's existing image data, nor is it an imitation of specific training data, as with some AI painting models.

Here Wenlan is more like an “ordinary person”: drawing on its own knowledge, it tries to understand new information coming from the outside world and uses a picture to make its understanding “concrete.” What the picture reflects is what objectively exists in Wenlan's “mind.”

Professor Lu Zhiwu said: “(In this way) we visualize Wenlan's ‘subconscious,’ which is the most primitive imagination and understanding of a sentence in her mind.”

How does Wenlan paint? Put simply, a picture on a computer is made up of pixels, and changing the color of each pixel is how one paints on a computer. Given the text, Wenlan uses this method to create an “original painting” that expresses its understanding of the sentence. Wenlan can be compared to a balance whose two pans hold the image and the text; what Wenlan has to do is keep the meanings of the two “equal.” It is worth noting that during neuron visualization all of Wenlan's model parameters are fixed; only the input, an initial noise image, is modified.
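
The procedure just described, freezing the model and changing only the pixels of a noise image until the image “balances” the text, can be sketched as gradient descent on the input. The following Python/NumPy toy is an illustration under loud assumptions, not Wenlan's method: a fixed random matrix stands in for the frozen image encoder, a random vector stands in for the text embedding, and the squared distance between the two embeddings is the assumed objective. All names are hypothetical.

```python
import numpy as np


def visualize_text_embedding(encoder_weights, text_emb, steps=200, seed=0):
    """Optimize an input "image" so its embedding matches a text embedding.

    encoder_weights: frozen stand-in for an image encoder (a linear map here,
                     an assumption; a real encoder would be a deep network).
    text_emb:        target embedding of the sentence to be "painted".
    Only the image pixels change; encoder_weights is never modified.
    """
    rng = np.random.default_rng(seed)
    # Start from a noise image, stored as a flat vector of pixel values.
    image = rng.normal(size=encoder_weights.shape[1])
    # Step size from the spectral norm keeps gradient descent stable
    # for this quadratic objective.
    lr = 1.0 / (2.0 * np.linalg.norm(encoder_weights, 2) ** 2)
    losses = []
    for _ in range(steps):
        residual = encoder_weights @ image - text_emb   # image vs. text
        losses.append(float(residual @ residual))       # squared distance
        grad = 2.0 * encoder_weights.T @ residual       # d(loss)/d(image)
        image = image - lr * grad                       # update pixels only
    return image, losses


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    frozen_encoder = rng.normal(size=(8, 64))   # 64 "pixels" -> 8-dim embedding
    sentence_emb = rng.normal(size=8)           # stand-in text embedding
    picture, history = visualize_text_embedding(frozen_encoder, sentence_emb)
    print("first loss:", history[0], "last loss:", history[-1])
```

The resulting pixels are not copied from any stored picture; they are whatever input makes the frozen encoder's output agree with the text, which is why the article describes the result as a view into the model's own “understanding.”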

The Wenlan R&D team said: “In this way, we can get a glimpse of Wenlan's ‘inner world.’ That is, we lift all evaluation and application constraints on Wenlan so that she can show the most primitive, true and unique understanding of the input text in her ‘subconscious.’”

Judging from Wenlan's “paintings,” its ability to understand semantics currently ranks among the best in the world. Beyond everyday language, Wenlan can also understand classical poems and even convey a certain “artistic conception.”

The following are some examples from Wenlan's actual tests (provided by the Wenlan R&D team):

Statement to Wenlan: Make a wish on the birthday cake

(Interpretation: The cake is clearly rendered, with a candle and scattered dots on top. The overall mood is the cheerful atmosphere of a birthday party.)

Statement to Wenlan: The white sun sinks behind the mountains, and the Yellow River flows into the sea

(Interpretation: The mountain peaks in the distance obscure the setting sun but not its afterglow, and the nearby mountains rush toward us like a yellow river.)

Statement to Wenlan: The moon sets, crows cry and frost fills the sky; facing river maples and fishing lights, I lie awake in sorrow

(Interpretation: a red fire on the river and an awning boat nearby.)

Statement to Wenlan: South of the Yangtze one can pick lotus; how lush the lotus leaves are

(Interpretation: a budding lotus at the upper left, a lotus at the middle right, lotus leaves, and green throughout.)

Statement to Wenlan: The bright moon rises over the sea; from the ends of the earth we share this moment

(Interpretation: The wavy sea below, and the moon rising over it. Although the verse originally refers to a full moon, the words do not literally say so. The largely abstract background may be Wenlan's interpretation of “from the ends of the earth we share this moment.”)

Foresight and persistence let Wenlan burst onto the scene; diversity and cross-disciplinary work will be the new starting point of the AI wave

In scientific research, sound judgment and persistence are sometimes more important than diligence and hard work. Professor Lu became visibly emotional when talking about Wenlan's research and development.

Since September last year, the Wenlan team has been working on multi-modal pre-training. Recalling that period, Professor Lu said: “We were completely groping in the dark, and the multi-modal pre-training model was very hard to build, but I still went down this road decisively (weak image-text correlation + two-tower model).”

But exploration and persistence carry risk. During this period, Professor Lu and his PhD students devoted themselves to the project and therefore published no papers for a long time. If the direction had been wrong, or the model had not trained well, they would have had nothing to show for it. The pressure on the whole team can be imagined.

At almost the same time, the front-runners of the foreign AI industry, Google and OpenAI, were doing similar things. In January of this year, OpenAI released two models in a direction similar to Wenlan's: DALL-E and CLIP. They shocked the industry and also showed that the direction chosen by Professor Lu's team was correct and forward-looking.

However, in terms of academic research at Chinese universities, institutions such as Tsinghua University and Peking University would seem to have the edge in AI. Why was it Renmin University that made a breakthrough in AI this time?

Professor Lu Zhiwu believes that Renmin University's advantage lies in its liberal academic atmosphere and rich humanistic ideas.

“Dean Wen Jirong of the Gaoling School of Artificial Intelligence is very supportive of these valuable explorations, so our overall academic atmosphere is very relaxed and open.”

In addition, as a higher education institution specializing in the humanities and social sciences, Renmin University has its own way of understanding AI. In a sense, where instrumental rationality prevails elsewhere, Renmin University leans more toward value rationality. This is also one of the reasons Wenlan's team was able to accept the risk of ending up with nothing and insist on completing the research.

In the view of Meike.com, beyond Renmin University's unique advantages, Wenlan's success is inseparable from Professor Lu Zhiwu's forward-looking vision of AI development and the excellent capabilities of the entire Wenlan R&D team.

On the long journey of AI exploration, “breaking” and “building” are eternal themes. Although Wenlan has achieved a breakthrough, Professor Lu Zhiwu modestly says that, on the whole, the future development of AI still requires joint progress in related disciplines such as brain science and neuroscience. Still, though the road is long and hard, those who keep walking will arrive. We believe that, spurred by Wenlan's success, more “Wenlans” will emerge in China, so that the “Turing Test,” the crown of AI, can be won a step sooner.