Characters, data, stable power may give China the edge over US in AI stakes
By Xin Zhou, Ben Armour  |  May 23, 2024
China and the US are locked head-to-head in a struggle for primacy in global AI. Many factors will bear on the outcome, but China’s intrinsic advantages in data, energy, and the nature of the Chinese language itself may prove decisive, two editors of The Yuan argue.

HONG KONG/LONDON - China and the United States are the unquestioned top dogs in the global artificial intelligence (AI) arena, and the scrap between them grows fiercer by the day. Seemingly innocuous forces are, however, quietly tipping the scales. The idiosyncrasies of the Chinese language and China’s edge in data and unlimited, stable power resources are slowly but steadily forging the sword for the country’s eventual triumph in the AI melee.

These three elements - the first an heirloom of cultural wisdom, the other two a fertile bed to spur the growth of AI - are undoubtedly mighty weapons for China to wield to win the future AI contest.

China’s data hoard: unearthing cultural gems

China has the world’s largest cohort of internet users, and the data traffic they generate each day is a river in full spate, much like the vast streams whose hydropower makes up almost one-fifth (and counting) of total national energy generation. Together these rich sources - endless data and boundless power - present a veritable cornucopia to nourish AI development. The richness and complexity of Chinese information provide a huge space for AI learning unrivaled in its length and breadth. 

Chinese, an antediluvian language with profound cultural connotations and a unique evolutionary trajectory, also holds signal advantages in natural language processing:

“Text normalization is a method for standardizing text to prepare it for the tokenization, vectorization and classification steps. With [English], the first step would be to convert all text to lowercase. Because Chinese characters are not capitalized to begin with, there’s no need for that data cleaning step. Next comes stemming or lemmatization. Compared to English, there is also no concept of a stem in Chinese. Therefor

The content herein is subject to copyright by The Yuan. All rights reserved. The content of the services is owned or licensed to The Yuan. Such content from The Yuan may be shared and reprinted but must clearly identify The Yuan as its original source. Content from a third-party copyright holder identified in the copyright notice contained in such third party’s content appearing in The Yuan must likewise be clearly labeled as such.