Tue. Oct 8th, 2024

Rapid progress in AI research and development faces hurdles

Major Chinese tech companies last week announced extensive price cuts for their large language model (LLM) products used for generative artificial intelligence, a move experts say will speed up the adoption of AI models in the domestic market and in research.

Price wars are not uncommon in the world of LLMs, but the cuts come as Chinese developers have made steady progress in commercialising AI technology in recent months.

The latest statistics from China’s Cyberspace Administration show that, as of March this year, a total of 117 generative AI services were registered in China. The number of LLMs in the country is now estimated to exceed 200, of which more than 100 have over one billion parameters (the variables a model learns during training).

But some say the country still has catching up to do before it can boast world-class LLMs, as its AI industry grapples with bottlenecks in computing power, localisation, and the broader AI ecosystem.

Other experts have pointed out flaws in research and development (R&D), with research efforts scattered rather than unified across different entities, including universities, research institutes and technology companies.

“Although outstanding young scientists in the field of artificial intelligence in top universities have the potential to be world-class players, they suffer from a lack of computing power resources and cannot fully participate in the mainstream research of large models,” said Wang Yanfeng, deputy dean of the School of Artificial Intelligence at Shanghai Jiao Tong University.

Wang recently led a group of 15 Shanghai city representatives in presenting a proposal on LLM innovation to the Shanghai municipal government, calling for “strategic adjustments” in the way R&D is done. They proposed accelerating transformation of research, strengthening industry-academia cooperation, and integrating key high quality R&D elements.

China could also learn from the United States, Wang noted, where scientists move freely through an industry-academia “revolving door” to solve scientific problems whose answers have immediate application in society and real-world situations, including applications geared to Chinese contexts.

Need for localised models

While many industry players have adopted localisation strategies, training and providing LLMs customised for Chinese users, this is hampered by a shortage of Chinese text to train the models.

In OpenAI’s ChatGPT, Chinese-language data makes up less than 0.01% of the training data, while English-language data accounts for more than 92.6%. Recent studies have found that the limited Chinese data available for training means models developed in the West may lack exposure to Chinese concepts and terminology, limiting their application in fields such as traditional Chinese medicine, for example.

Constrained by the scarcity of high-quality Chinese-language resources, many research institutions and companies rely on foreign-language annotated datasets, open-source datasets, or crawled web data to train LLMs.

Filling the data gap

Lin Yonghua, deputy director and chief engineer of the Beijing Academy of Artificial Intelligence (BAAI), a leading non-profit in AI research and development, said the “bottleneck” around training data is looming large.

“The research organisation Epoch AI believes the demand for data has increased so dramatically that high quality text available for training may be exhausted by 2026,” Lin said in a speech on 24 May.

Government-sponsored entities like BAAI are spearheading efforts to fill the widening gap.

On 26 April, the Academy officially released Chinese Corpora Internet (CCI) 2.0, following the first version made public in November last year. The dataset is about 500GB in size and covers 125 million web pages, the result of cleaning, filtering, and processing eight terabytes of raw data, according to the Academy.

BAAI has also proposed solutions for increasing the quantity and quality of Chinese-language training data. Apart from facilitating the download of 2.4 terabytes of open-source datasets, it offers institutions incentives to share data, including the creation of a “co-building and co-sharing” platform.

The platform allows institutions to earn points, which can be exchanged, based on the quantity and quality of datasets they contribute or download.
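The article does not describe how the points scheme is computed, but the exchange mechanic it outlines, earn points by contributing datasets, spend them to download others', can be sketched as a simple ledger. The class, the scoring formula, and all names below are hypothetical illustrations, not BAAI's design.

```python
class DataExchange:
    """Hypothetical sketch of a 'co-building and co-sharing' points scheme:
    institutions earn points for datasets they contribute and spend points
    to download datasets contributed by others."""

    def __init__(self):
        self.points: dict[str, int] = {}      # institution -> point balance
        self.catalogue: dict[str, int] = {}   # dataset -> download cost

    def contribute(self, org: str, dataset: str,
                   quality_score: int, size_gb: int) -> int:
        # Toy formula: points scale with both quality and quantity,
        # mirroring the article's "quantity and quality" criterion.
        earned = quality_score * size_gb
        self.points[org] = self.points.get(org, 0) + earned
        self.catalogue[dataset] = earned
        return earned

    def download(self, org: str, dataset: str) -> bool:
        cost = self.catalogue[dataset]
        if self.points.get(org, 0) < cost:
            return False                      # not enough points yet
        self.points[org] -= cost
        return True
```

The design choice worth noting is that pricing downloads in earned points makes contribution the only way in, which is presumably the point of such a scheme: institutions hoarding data gain nothing until they share.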

The Academy also stressed the importance of data security, saying the processing and training of high-quality, copyrighted data should take place in the same security domain as the computing platform, and that the use of such data must be strictly “controllable”.

The GPU bottleneck

Industry discussions also focus on the strong demand for computing power. Lack of computing power inhibits the implementation of LLMs and delays product deployment, they say.

According to data from GPU Utils, which tracks the global market for graphics processing units (GPUs), the shortfall in global supply of the high-end chips produced by US company NVIDIA that are required to power LLMs has reached 430,000 chips.

Apart from buying and hoarding NVIDIA chips, companies and research institutions say more needs to be done to address the shortage, in part caused by US restrictions on sales to China.

In addition, the overall utilisation rate of some computing power centres in China is less than 30%, a serious waste of resources that sharpens the conflict between supply and demand.

A new startup backed by Tsinghua University in Beijing is seeking to fill the widening gap in computing power.

In March Wuwen Xinqiong, an AI infrastructure company, released its Infini-AI large model development and service platform, which uses GPU inference and acceleration technology to provide a complete tool chain for LLM development, training, operation and application.

Initiated by Professor Wang Yu, dean of the department of electronic engineering at Tsinghua, and founded by his PhD student Xia Lixue, Wuwen Xinqiong’s aim is to address the mismatch between GPU demand and supply.

The platform aims to effectively integrate and optimise computing resources, hoping to alleviate the computing power shortage for LLM enterprises, the company said. A year after its founding, the company is now backed by tech heavyweights including Baidu, Alibaba and Tencent.

Wuwen Xinqiong occupies the middle ground between LLMs and chips, according to Xia. “Wuwen Xinqiong can be seen as equivalent to Taobao [the online shopping platform] in the field of large model computing power. Downstream LLM manufacturers and application developers can purchase useful and efficient computing power with just one click,” he told Chinese tech media in April.

However, Xia said US moves to curb China’s AI sector meant anxiety over computing power may be here to stay.

In January, US Commerce Secretary Gina Raimondo said Washington had begun the process of asking American cloud companies to report foreign entities that use their computing power to train LLMs. Although no countries were named, the move is widely believed to target China.

Nonetheless, LLMs are predicted to enter a stage of rapid development in China, fuelled by strong backing from the government at all levels, a surge in user demand, and the push of tech companies and research institutions.

“More and more innovative AI application scenarios and product forms are expected to be seen in 2024,” said Wu Hequan, an academician at the Chinese Academy of Engineering.

By admin
