
A local government spends a modest sum to build a "big supercomputer", and the experts it invites are dumbfounded


In the story told here, no one is named, the location is not identified, and the exact dates are not given. Any resemblance to real people or events is purely coincidental.

The story begins with cities making a wish for the "digital economy": drawn to the name of the "smart city", they set out one after another to build "urban brains". This is, in itself, a good thing. Who wouldn't want a share of the computing-power economy? If local governments built the platforms in a down-to-earth way and vendors deployed their products honestly, there would be nothing to write about. But a few odd phenomena have ruffled this pool of spring water. Pay attention, something is fishy here.

For example, a little over half a month ago, a city's artificial intelligence computing center was completed and put into service, providing public computing power and other services to the artificial intelligence (AI) industry. A fine thing in itself. Yet the media coverage of the center carried this strange claim: "The first phase of the center is built to a scale of 100 PFLOPS of artificial intelligence computing power... Its computing power is equivalent to 50,000 high-performance computers." So read the reports carried by "XX Release", "Investment XX", "Yidian Zixun XX", and other outlets. "Artificial intelligence computing power equivalent to 50,000 high-performance computers" is a textbook piece of misdirection.

Coincidentally, two years ago, when a large-scale intelligent-computing project was launched in another city, the figure released to the public was likewise "100 PFLOPS", and the story was given the forced angle that "domestic AI training clusters have entered the realm of supercomputing". An AI-dedicated intelligent computer became an all-round supercomputer overnight. Wonderful! Calling the US dollar a "knife" is harmless slang; calling an intelligent computer a supercomputer is another matter. Who is playing dumb here?

Is it really all the media's doing? Not necessarily. Whoever's fault it is, the unhealthy trend keeps growing. In some places, in the rush to get computing-power projects launched, nobody bothers to ask what kind of computing power it is, as long as the number looks impressive enough. Perhaps the locality did not spend a particularly large sum, built its so-called tens or hundreds of "P" of computing power, and believed it could stand alongside the national supercomputing centers and set off a wave of computing-power-driven growth. Then, unexpectedly, when supercomputing experts were invited in and asked, things did not seem to be that way at all.

Why? With computing power, you get what you pay for. In the rush to launch projects, intelligent computers and supercomputers were muddled together, and people were even less clear about what inference performance is and what training performance is. The result: a place believes it has spent a small sum to build a world-class supercomputer, when it may in fact have spent a great deal of money to build a machine that delivers only inference performance. As the saying goes, "Every gift of fate has already been secretly priced." Today we will unpack this mystery.

Intelligent computer vs. supercomputer: can you tell them apart?

First of all, supercomputing is supercomputing. Calling machines dedicated to AI computing "supercomputing" is questionable. The Linpack benchmark, which the industry currently uses to rank supercomputers, measures a machine's double-precision floating-point capability, that is, arithmetic on 64-bit floating-point numbers (FP64), a form of high-precision numerical computation.
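How big is the gap between the two kinds of "PFLOPS"? Here is a minimal back-of-the-envelope sketch in Python. It uses NVIDIA's published peak figures for the A100 GPU (9.7 TFLOPS at double precision, 312 TFLOPS at half precision on Tensor Cores) purely as a representative reference point; the article names no specific chip, and sustained performance on real workloads is lower than peak.

    # Rough illustration: "AI PFLOPS" (half precision) vs. the double-precision
    # FLOPS that Linpack actually measures. A100 peak figures are used only as
    # a representative example; they are an assumption, not data from the article.
    PFLOPS = 1e15

    a100_fp16_tensor = 312e12   # peak half-precision (FP16) Tensor Core throughput, op/s
    a100_fp64 = 9.7e12          # peak double-precision (FP64) throughput, FLOP/s

    headline_ai_power = 100 * PFLOPS                     # "100P" of AI computing power
    gpus_implied = headline_ai_power / a100_fp16_tensor  # GPUs needed to reach the headline
    fp64_capability = gpus_implied * a100_fp64           # same cluster, measured the Linpack way

    print(f"GPUs implied by the headline figure: {gpus_implied:.0f}")
    print(f"Double-precision capability of the same cluster: {fp64_capability / PFLOPS:.1f} PFLOPS")
    # -> roughly 3 PFLOPS of FP64, about 32 times smaller than the "100P" headline

In other words, hardware that advertises "100P" of AI computing power would score only a few PFLOPS on the yardstick actually used to rank supercomputers.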
Among the number formats expressed in binary, besides double precision there are single precision (32-bit, FP32), half precision (16-bit, FP16), and integer types (such as INT8 and INT4). The more bits a format has, the wider the range of values it can represent and the finer the changes it can capture, and therefore the more accurate the calculation.

Unlike much scientific computing, the computing AI needs does not have to be very high-precision. Many AI applications process speech, images, or video, and their inference or training can be completed with low-precision or even integer arithmetic. Computers specialized for such AI workloads are fast and energy-efficient; this follows from the nature of the workload itself.

To sum up, the distinction can be drawn as follows. An intelligent computer is a form of dedicated computing power: it is good at intelligent workloads such as inference and training, but because AI inference and training generally use only single precision, half precision, or integer arithmetic, most intelligent computers lack the ability to do high-precision numerical computation. That also limits their use in application scenarios outside AI. A supercomputer, by contrast, is a form of general-purpose computing power: its design goal is to provide complete, complex computing capability, it is stronger at high-precision computing, and it serves a far wider range of applications. Scientists routinely use supercomputers for planetary simulation, new-materials development, molecular drug design, genome analysis, and other scientific computing and big-data workloads.

Chen Zuoning, an academician of the Chinese Academy of Engineering, once vividly described using supercomputers for AI workloads as "a big horse pulling a small cart": supercomputing may be a decathlete, but it was not tailor-made for AI, and intelligent computers emerged to do that job cheaply. The "fusion of AI and supercomputing" that was hyped for a while likewise refers to high-performance computers improved through "AI specialization"; strictly speaking, they no longer belong to supercomputing as our traditional discourse understands it.

Today, whether a supercomputing center or an intelligent computing center is inaugurated, the announced computing power is quoted in "FLOPS". That unit means floating-point operations per second, whereas the proper unit for some intelligent computers is actually "OPS", operations per second. Without the distinction, it is easy for everyone to assume the same calculation precision and the same computing capability. This is how some places come to believe they have spent a small sum to build the world's top "big supercomputing" and gotten a bargain, only for everyone to be dumbfounded when the project is launched and presented to the supercomputing community.

Use new indicators to guide the industry's healthy development

To avoid misleading claims and chaos in the industry, the concepts of intelligent computer and supercomputer must be clearly distinguished. And as mentioned earlier, there is another source of confusion in the industry: blurring the inference performance and the training performance of intelligent computers. Compared with inference, training usually demands higher calculation precision, such as 32-bit or even 64-bit.
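To make the inference/training split concrete, here is a minimal PyTorch sketch of how the two are typically run today: training keeps FP32 master weights and uses half precision only for the heavy arithmetic (mixed precision), while inference can often run entirely in FP16 or lower. It is an illustrative sketch that assumes a CUDA-capable GPU, not a description of any particular center's setup.

    import torch
    import torch.nn as nn

    model = nn.Linear(1024, 1024).cuda()   # weights stored in FP32
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    scaler = torch.cuda.amp.GradScaler()   # rescales the loss so FP16 gradients do not underflow

    x = torch.randn(64, 1024, device="cuda")
    target = torch.randn(64, 1024, device="cuda")

    # Training step: low-precision compute inside autocast, FP32 weight updates outside.
    with torch.cuda.amp.autocast():
        loss = nn.functional.mse_loss(model(x), target)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()

    # Inference: the whole model can usually be cast down to half precision.
    model_fp16 = model.half().eval()
    with torch.no_grad():
        y = model_fp16(x.half())

The same asymmetry explains why a chip rated in INT8 or FP16 OPS for inference says little about how well it trains large models, let alone how it fares on FP64 workloads.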
Most AI chips with "dazzling" headline performance are quoting their inference performance, and often only a theoretical peak at that. For AI computing, training performance is frequently what matters more, since many intelligent models depend on it.

If you sketch the computing power AI requires as a pyramid, "inference" sits at the bottom: half-precision (FP16) or integer (such as INT8) computing power is enough to meet its needs. Above it sits "training", which generally calls for single precision (FP32) or half precision (FP16). The most demanding layer is brain-like "simulation", which requires double-precision computing power (FP64) and low-precision computing power to be supported at the same time.

To provide better guidance, a simple and effective indicator is needed to judge a system's AI computing power and the state of development of the whole high-performance AI field. In November 2020, Zhang Yunquan, a researcher at the Institute of Computing Technology of the Chinese Academy of Sciences and secretary-general of the CCF Technical Committee on High Performance Computing, together with Professor Chen Wenguang of Tsinghua University, Pavan Balaji, a researcher at Argonne National Laboratory, Professor Torsten Hoefler of ETH Zurich, and the ACM SIGHPC China committee, jointly launched the "International Artificial Intelligence Performance Computing Power 500 Ranking" (AIPerf500), based on the AIPerf large-scale AI computing power benchmark. Those who are interested can look through the list; the computing power of the machines on it is given in OPS.

"Supercomputing and AI computing each need their own yardstick. A new scale is required to guide the AI computing industry onto a path of healthy development," Zhang Yunquan said.

No way around Nvidia? Domestic AI chips have yet to show their strength

Computing power starts with the chip. On the AI-chip track, China has chip design companies such as Huawei Ascend, Baidu Kunlun, and Suiyuan (Enflame), yet even so, domestic intelligent computers rarely manage to bypass the American GPU giant Nvidia. It is an uncomfortable reality that Nvidia is a tangible beneficiary of the many intelligent computing centers being launched in China.

An AI chip specialized for intelligent computing can deliver speed and low-precision performance several orders of magnitude higher, so long as it has enough cores and a high enough clock frequency. But if a computing cluster needs both high-precision and low-precision calculation, the demands on the chip rise sharply. That is precisely Nvidia GPUs' trump card: their computing power is outstanding and well balanced across all precisions (with AI computing power stronger still, of course), to say nothing of Nvidia's superior software stack and application ecosystem. This is one reason most domestic AI chips find it hard to compete head-to-head with Nvidia GPUs. And if Nvidia ultimately succeeds in acquiring Arm, it will become more formidable still.

Even so, domestic AI chips are not without opportunities. First, China's computing-power infrastructure currently shows a strong willingness to localize.
Even with giants such as Nvidia and Intel out in front, the tide of localization is hard to stop, driven by overall cost, ecosystem considerations, and other factors.

Second, at the current stage of AI development, scenarios, data, models, and computing power are all indispensable, which means China will be one of the places where global AI computing power is concentrated in the future. As a core requirement, AI chips cannot be monopolized by a single form or a single ecosystem, so outstanding domestic AI chips such as Cambricon and Ascend still have enormous room to grow.

Furthermore, although the chip is the main source of computing power and its most fundamental material basis, producing, aggregating, scheduling, and releasing computing power is an end-to-end process; turning it into "effective computing power" takes the coordinated hardware and software ecosystem of a complex system. We should therefore pay attention not only to a chip's standalone performance figures but also to the software and application ecosystem built on top of it.

Is it impossible to develop AI without huge computing power?

Finally, a word on where AI is heading. Behind the muddled concepts of computing power is AI's demand for compute, galloping ahead like a wild horse. This is not surprising: the amount of training computation is driven by the number of parameters, and what is the parameter scale of today's models? Hundreds of billions, even trillions.

OpenAI, the artificial-intelligence organization founded as a non-profit by a group of Silicon Valley heavyweights, released its new-generation unsupervised language model GPT-3 in May 2020. It has 175 billion parameters and was trained on 45 TB of data (roughly one trillion words). GPT-3 has made major advances in semantic search, text generation, content understanding, and machine translation. Its greatest value lies in demonstrating a machine's capacity for unsupervised self-learning, and in verifying that performance can be improved purely by scaling up.

Trillion-parameter models are already on the way. In early June, the Beijing Academy of Artificial Intelligence (Zhiyuan) released WuDao 2.0 ("Enlightenment 2.0"), claiming 1.75 trillion parameters, surpassing the Switch Transformer previously released by Google to become the world's largest pre-trained model. Rapid growth in parameter counts means correspondingly higher demands on compute; some models may need thousands of GPUs to supply the necessary computing power (a rough estimate is sketched at the end of this section). For giant models like GPT, the appetite for computing power is no joke.

So, is it impossible to develop AI without huge computing power? Experts believe that at AI's current stage (perceptual intelligence and cognitive intelligence), computing power still comes first. In principle AI can advance either through more computing power or through an algorithmic revolution, but at the present "capital-driven" stage, compared with the uncertainty of an algorithmic breakthrough, piling on computing power is the easier choice.

It must be pointed out, however, that brute-force computing power is not the only direction for artificial intelligence. Giant models such as GPT-3 have their shortcomings, a lack of common sense among them; exploring the mysterious mechanisms of the human brain to achieve learning from small data and transfer learning is another important avenue.
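The "thousands of GPUs" mentioned above is easy to sanity-check with a widely used rule of thumb: training a dense model costs roughly 6 x parameters x training tokens floating-point operations. The token count for GPT-3 (about 300 billion) and the GPU throughput and utilization assumed below are illustrative assumptions, not figures from the article.

    # Back-of-the-envelope estimate of GPT-3's training compute using the common
    # "6 * N * D" rule of thumb (N = parameters, D = training tokens).
    params = 175e9              # GPT-3 parameter count
    tokens = 300e9              # approximate training tokens (assumption)
    train_flops = 6 * params * tokens             # ~3.2e23 FLOPs

    a100_fp16_peak = 312e12     # peak FP16 throughput of one modern GPU, FLOP/s (assumption)
    utilization = 0.3           # assumed sustained fraction of peak
    gpu_seconds = train_flops / (a100_fp16_peak * utilization)
    gpu_days = gpu_seconds / 86400

    print(f"Total training compute: {train_flops:.2e} FLOPs")
    print(f"Roughly {gpu_days:,.0f} GPU-days, i.e. about {gpu_days / 1000:.0f} days on a 1,000-GPU cluster")

Seen against compute budgets of that order, the appeal of the brain's efficiency is obvious.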
The human brain, after all, consumes only about 20 watts; achieving intelligent systems at comparably low energy cost may be an even more important direction of effort. (This article was written with the generous guidance of Zhang Yunquan, a senior member of the China Computer Federation, and other experts in the supercomputing field, to whom the author extends thanks!)