China’s Machine Translation Robots Have Become English Teachers

The increasing reliance on machine translation in the early stages of English education is one of the most disturbing recent trends in the training of Chinese-English translators. Even at young ages, students are turning to machine translation to try to understand texts, and consequently memorizing machine-generated English. Today, “translation pens” that allow you to highlight a text and receive an instant translation are also readily available. Translators looking for a way to use a specific Chinese expression in English are now learning how to do so directly from robots.

One of the dark secrets of neural machine translation relevant to Chinese-English translation is the utter distrust of neural machine translation found among major technology company employees in China. This is not unlike scientists working at a tobacco company being totally unwilling to use tobacco. These big Chinese technology companies are some of the biggest proponents of human translation, especially high-quality native English translation, and were even the first adopters of American standards for high-quality translation into English. Take, for instance, ByteDance, a company that relies on its AI technology to deliver suitable recommendations and write news articles on Toutiao to generate revenue. Now, consider that the company also has some of the best-translated copy and just about the best international performance of any Chinese company. This seems like a paradox: a company specializing in artificial intelligence is one of the most insistent on using high-quality, native English translation and copy for its prominent, high-traffic, or high-risk content.

Typically, machine translation power users are those companies with little or no understanding of artificial intelligence, as indicated by the many Chinese financial and legal firms adorning their web pages and podcasts with translations provided by none other than Google Translate. To understand why artificial intelligence experts have such a strong pro-human translation policy, you need to know a thing or two about machine translation.

How Machine Translation Works

Leveraging deep learning models, Neural Machine Translation (NMT) captures and replicates subtle nuances in language that other systems would likely miss. This results in translations that are not only more accurate and fluent but also more natural-sounding than those generated by alternative systems. For businesses, NMT is faster and cheaper than human translation. Moreover, users can train and fine-tune NMT models to produce accurate and natural-sounding translations on a large scale, requiring less human intervention and saving businesses money on hiring and training translators.

Deep learning, itself a subset of machine learning, uses artificial neural networks to learn from data. Neural networks are loosely modeled on the human brain: they are composed of interconnected nodes that process input data and generate output, and each network is made up of several layers. The first layer captures input data, and the subsequent layers “learn” to detect patterns and make predictions. Training these layers on extensive datasets improves their accuracy and makes them more efficient at recognizing patterns and generating accurate translations.
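To make the idea of layers concrete, here is a minimal sketch of data passing through a tiny network. It is only an illustration: the layer sizes, the random weights, and the helper names (`dense`, `make_layer`, `forward`) are assumptions made for this example, and a real NMT model is vastly larger and has trained rather than random weights.

```python
import math
import random

def dense(inputs, weights, biases):
    """One fully connected layer: each output node is a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Simple non-linearity: negative values become zero."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def make_layer(n_in, n_out, rng):
    """Hypothetical helper: random weights stand in for weights learned in training."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(inputs, layers):
    """Pass the input through each hidden layer, then score the possible outputs."""
    activations = inputs
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(activations, weights, biases))

rng = random.Random(0)
layers = [make_layer(4, 8, rng), make_layer(8, 3, rng)]  # 4 inputs -> 8 hidden -> 3 outputs
probs = forward([0.2, -0.5, 1.0, 0.3], layers)
print(probs)  # three probabilities, one per possible output
```

Nothing in this pipeline represents what the input is about. The network only turns numbers into other numbers, which is the limitation that matters for translators.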

At this point, the Chinese-English human translator should take a moment to pause and consider what the technology here can’t do: it can’t identify what the text it’s translating is actually about. All it can do is apply rules and recognize patterns in languages.

NMT works by first analyzing the input data and then generating a translation. The translation is evaluated for accuracy and fluency, and the model is adjusted accordingly. This process is repeated until the model can generate accurate and natural-sounding translations. In practice, leading neural machine translation companies hire freelance translators for about $40/hour to select the most appropriate translation from among several possible suggestions produced by the system. The system then records the human-selected choice as the preferred output of the neural network.
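The selection step described above can be sketched roughly as follows. This is a simplified illustration, not how any particular vendor does it: the names `LabeledExample`, `collect_preference`, `update_preferences`, and `best_translation` are hypothetical, the candidate translations are invented, and a real system would use the recorded choices to adjust model weights rather than keep a simple vote table.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass

@dataclass
class LabeledExample:
    source: str        # the Chinese source text
    candidates: list   # suggestions produced by the engine
    preferred: str     # the suggestion the human reviewer picked

def collect_preference(source, candidates, choose):
    """Ask a reviewer (the `choose` callable) to pick the best candidate by index."""
    index = choose(source, candidates)
    if not 0 <= index < len(candidates):
        raise ValueError("reviewer must pick one of the candidates")
    return LabeledExample(source, list(candidates), candidates[index])

def update_preferences(preferences, example):
    """Record the human choice. A vote table stands in for a real training update."""
    preferences[example.source][example.preferred] += 1

def best_translation(source, candidates, preferences):
    """Return the candidate humans picked most often; ties fall back to the earliest candidate."""
    votes = preferences[source]
    return max(candidates, key=lambda c: votes[c])

# Invented example: an idiom with one literal and one idiomatic rendering.
source = "胸有成竹"
candidates = [
    "have bamboo in the chest",
    "have a well-thought-out plan",
    "chest has bamboo",
]
reviewer = lambda src, cands: 1  # stand-in for a paid human translator

preferences = defaultdict(Counter)
example = collect_preference(source, candidates, reviewer)
update_preferences(preferences, example)
chosen = best_translation(source, candidates, preferences)
print(chosen)
```

The quality of the result depends entirely on who plays the role of `reviewer`, which is the point the following sections develop.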

Data Labeling Is a Huge NMT Weakness

Data labeling is a process used to train artificial intelligence systems to recognize patterns in data sets. It’s essentially the process of assigning labels or tags to data points so that AI systems, including NMT systems, can learn to identify patterns in data. Thus, labels are applied to NMT outcomes to help predict future human translator decisions, making it a critical step in training NMT systems.

The more accurately the data sets are labeled, the more accurate and reliable the NMT system’s predictions and decisions will be. This is a huge problem for neural machine translation because high-accuracy data labeling is incredibly expensive. As a result, NMT engines instead rely on substandard labeling to produce results that are technically passable, despite being inaccurate.
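A toy simulation can illustrate why label quality puts a ceiling on output quality. Everything here is an assumption made for illustration: the "model" simply memorizes the label it was given for each item, the four labels are placeholders, and the error rates of 5% and 40% are made-up stand-ins for careful and cheap labeling, not measured figures.

```python
import random

def noisy_labels(true_labels, noise_rate, label_set, rng):
    """Simulate labelers who pick a wrong label with probability `noise_rate`."""
    labeled = []
    for truth in true_labels:
        if rng.random() < noise_rate:
            wrong = [label for label in label_set if label != truth]
            labeled.append(rng.choice(wrong))
        else:
            labeled.append(truth)
    return labeled

def accuracy_after_training(noise_rate, n_items=2000, seed=0):
    """Train a memorizing toy model on noisy labels and score it against the truth."""
    rng = random.Random(seed)
    label_set = ["A", "B", "C", "D"]
    true_labels = [rng.choice(label_set) for _ in range(n_items)]
    training_labels = noisy_labels(true_labels, noise_rate, label_set, rng)
    model = dict(enumerate(training_labels))  # the model reproduces what it was taught
    correct = sum(1 for i, truth in enumerate(true_labels) if model[i] == truth)
    return correct / n_items

careful = accuracy_after_training(0.05)  # assumed error rate for careful labeling
cheap = accuracy_after_training(0.40)    # assumed error rate for cheap labeling
print(f"careful labels: {careful:.2f}, cheap labels: {cheap:.2f}")
```

In this sketch the model can never be more accurate than the labels it was trained on, so cheaper labeling translates directly into worse output.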

A good case study is where an Amazon manager was developing the company’s machine translation engine and decided to hire OneHourTranslation to work with them. OHT, however, is one of a tiny handful of companies sanctioned by the ATA for ethics violations, so it raised a big question: Why was this MT creator partnering with people who have ethics sanctions against them to promote their MT engine? The reason MT companies flock to ethics-sanctioned companies is simply that these dishonest companies are known for producing MT-like results despite claiming to hire humans, making it appear as if the MT is as good as human quality. What they did here was basically find a way to outsmart the machine translation companies rather than accept faked answers from them. This is a slight improvement over what used to happen: Google Translate initially learned by aligning translated versions of websites and using existing translations found across the web to develop a predictive model of how to translate texts.

Thus, I’ve never heard anyone in the Chinese-English translation industry say that Amazon AWS has a good translation engine. They are obviously using $5 translators. On the other hand, I hear a lot of good comments about DeepL being an effective engine, and from what I understand DeepL is using the $40/hour translators. The bottom line is that the very best machine translation engines are essentially just repackaging human translations as adaptable machine translations. To replace human translators without making the leap to artificial general intelligence, we’d need to increase the number of ATA-certified level translators to about 500 times their current numbers and have them do nothing but AI data labeling. That is to say, there are not enough qualified human translators to develop an effective AI system without resorting to artificial general intelligence.

What Is Artificial General Intelligence?

So far, we know that machine translation cannot replace human translators. AI lacks human sentience, and Google, the developer of Google Translate, has a policy forbidding any of its people from claiming its AI has that ability. Such an ability, nonetheless, would be required for any machine translation system to function reliably.

Today, we’re used to the narrow AI used in applications like autonomous vehicles, facial recognition, and natural language processing. The appeal of artificial general intelligence (AGI), however, lies in its potential to move beyond the capabilities of narrow AI and create machines more similar to humans in terms of cognitive abilities. To achieve this, AGI systems need to be able to learn from past experiences and use that knowledge to adapt to new situations and make decisions in the present.

As such, AGI would exhibit traits such as self-awareness, creativity, planning, learning, natural language processing, and social intelligence, which all require the development of machines that can understand and solve complex tasks with minimal or no human intervention. It’s an ambitious goal, and current AI technology is far from achieving this level of capability.

Robots Are Quietly Brainwashing China’s Talent

Above, I pointed out the disturbing trend of students learning Chinese-English skills directly from machine translators, that is, actively learning how to be a robot. The problem is, at the rate robots are paid, a translator, or really any white-collar worker, would never be able to make more than a dollar a day by emulating these same robots. Instead, a different pattern emerges: virtually any white-collar worker, student, or even professional translator tasked with completing a translation immediately turns to a machine translation engine, copies and pastes the result into the answer box, and sends the machine-generated output back to the boss. The answer is, understandably, totally wrong and unreliable, but they don’t know this, because everything they learned originally came from the very robot they are, in theory, supposed to be supervising.

However, I was not the first person to discover this issue. It was Professor Tekwa at Guangzhou Foreign Studies University who discovered it years ago during his research on human-AI interaction. There are even a handful of researchers looking specifically at what happens when Chinese trainees given Chinese-to-English translation tasks work with artificial intelligence. Uniformly, everyone is simply turning their entire job over to the robot. Moreover, if tested, the robot is only about 15-20% accurate, meaning that 80-85% of the results on professional topics are basically misleading. For the machine translation companies, this is great news: everyone is saying “Wow! Look how great this machine translation is!” The problem, however, is that the human here has been brainwashed by robots to think like a robot. This is a particularly significant problem in China and one which needs to be carefully avoided.

In addition to never actively using a machine translation tool as a reference, to look something up, or to understand something better, Chinese-English translators need to diligently avoid encountering machine-translated information appearing in their references. In particular, consider how Youdao has an online dictionary and Baidu has an online encyclopedia, both of which contain a lot of tips on how to say things in English. Both Youdao and Baidu develop machine translation engines and, if you look closely, a lot of the suggestions and recommendations on either platform come directly from their machine translation engines. Unsurprisingly, many are outrageously wrong. White-collar workers and professional translators working with Chinese-English translation can avoid being brainwashed and stupefied into unemployment by simply learning from genuine human sources and avoiding machine-generated content.
