Legal Chinese Translation Word Counts: Two Deceptive Practices

Western law firm clients often fall victim to deceptive Chinese translation pricing. The confusion arises because Chinese does not have words as we know them in English. Instead, linguists count “bigrams” (cihui), which are composed of “characters,” the base unit of the written language. Mandarin has a distinct term for “words,” danci, but it does not apply to the Chinese language itself. It also has several different words for “characters.” Moreover, computers have no way of distinguishing a Chinese character from a bigram or idiom. Many translation companies exploit these subtle differences to overcharge clients.

 

Deceptive Practice 1: Characters (no spaces) pricing

Pricing based on the number of Chinese characters in the original document is generally the most reliable and accurate way to price Chinese translations and is a common practice in China. However, the English word “characters” corresponds to two different Chinese words, zifu (typographical characters) and hanzi (Chinese characters), and many Mandarin translation companies deliberately choose the one for typographical characters. This inflates counts, often resulting in prices three or four times higher than an accurate count would produce. To add to the confusion, Microsoft Word reports Chinese characters as “words,” whereas its “characters” count adds in all typographical symbols, letters, and numbers. Mandarin translation companies therefore often charge their English-speaking clients for the latter in order to triple their rates.

I first learned of this tactic several years ago from a friend in Hong Kong, the president of a trading company acting as an intermediary between mainland China and the United States and Europe. He told me that his company had ordered a Chinese-to-English translation, communicating in English with a translation company in mainland China that promised good quality at great rates. The order covered trade-related documents that included long numerical tables, originally produced in Microsoft Excel and embedded in Microsoft Word documents. The documents contained approximately 20,000 Chinese characters and translated to roughly 17,000 English words.

When the Hong Kong businessman received the bill, he was shocked: the translation company had billed him for over 200,000 characters or about 10 times the number it had originally promised. The translation company went on to explain that it charged per “character” using the “characters (no spaces)” count in Microsoft Word and that most Chinese companies used this method. In fact, the applicable national People’s Republic of China standard defining translation pricing defined this as the ‘standard’ pricing method! There is only one problem with this: each character within a serial number or English word would count as a character for pricing purposes. Therefore, while the row under item description and serial number “Capacitator 3945Q-ZA740” contained just two Chinese characters, pricing under this method would run 22 characters. Moreover, this was enforceable, and ultimately, the Hong Kong-based company wound up footing the bill.

This method is deceptive because when you think of a Chinese “character,” you are not thinking of each digit of a serial number. You are thinking of whole words. Serial numbers only need to be copied and pasted and cost the translator very little time. The core danger of this method is that you could send someone a document that contains a list of serial numbers, or any other kind of data that does not need to be translated, and pay an inflated bill without realizing it.
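To make the gap between the two meanings of “character” concrete, here is a minimal Python sketch that counts the table row from the example above in both ways. It rests on some assumptions: hanzi are approximated by two Unicode ranges (CJK Unified Ideographs plus Extension A), the function names are made up for illustration, and the two-character Chinese label 电容 is a hypothetical stand-in because the original row is not reproduced in this article.

```python
import re

# Assumption: "hanzi" are approximated by the CJK Unified Ideographs block
# plus Extension A; rarer extension blocks are ignored in this sketch.
HANZI = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff]")


def count_hanzi(text: str) -> int:
    """Count only Chinese characters (hanzi)."""
    return len(HANZI.findall(text))


def count_chars_no_spaces(text: str) -> int:
    """Count every non-whitespace character, as a 'characters (no spaces)' figure does."""
    return sum(1 for ch in text if not ch.isspace())


row = "Capacitator 3945Q-ZA740"   # the item description quoted above
with_label = "电容 " + row          # hypothetical two-character Chinese label

print(count_chars_no_spaces(row))         # 22
print(count_hanzi(with_label))            # 2
print(count_chars_no_spaces(with_label))  # 24
```

The hanzi count reflects what actually needs translating, while the no-spaces count is dominated by the serial number.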

 

Deceptive Practice 2: Bloated English Writing

The regulatory compliance practice of a large Washington-based law firm once asked me to review translations of Chinese corporate law documents and records. The translator boasted strong credentials, such as being the “FBI’s top-rated translator.” The translations delivered, however, were written extremely poorly in order to take advantage of pricing based on the total number of English words in the translated output. To understand how this was done, you need to understand a bit about the structure of the Chinese language and how it differs from English.

Chinese simply lacks many of the words found in the English language, such as prepositions, adverbs, and articles. Moreover, Chinese writers will often omit the subject of a sentence, which is just implied. The parts of speech used in English are quite different from what appears in an equivalent Chinese sentence, and there are a lot of components of a Chinese sentence, such as classifiers and modal particles, that simply do not appear in their English equivalents. What this means for translation pricing is that a translator could directly translate unique Chinese grammatical components into their closest English equivalents, while also structuring English sentences in a way so as to maximize the number of purely structural elements such as prepositions and articles, thus maximizing word counts.

Consequently, someone in a board meeting may use two Chinese characters at the start of a sentence to say, “We suggest…” (jian yi), but every time this word appeared in the transcript, the translator would write, “It would be suggested that…” As a result, the translator billed the client for five words where the original Chinese contained two characters and my standard English translation, “We suggest,” contains two words. Every time the two-word phrase “board meeting” appeared, the translator would write “meeting of the board of directors,” thus tripling the word count. Consider a flabby sentence compared side by side with an economical one:

Inflated: “It would be suggested that we ought to record that at this instance of the meeting of the board of directors, the headcount at the meeting of the board of directors does not satisfy the requirements for a quorum as stipulated in the Articles of Incorporation for an instance of the meeting of the board of directors.”

Corrected: “I suggest we record that this board meeting hasn’t reached the quorum requirements provided by the Articles of Association.”

One little suggestion of about 20 words got bloated out to nearly 60 English words. Overall, roughly 40% of the word count in these files was simply filler added to pad out the bill. Moreover, the reviewing attorney would have a hard time figuring out exactly what had been said in the sentences most vulnerable to bloating, which wastes a great deal of time and makes mistakes more likely.
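The arithmetic behind this comparison can be checked with a few lines of Python. This is only a sketch: it assumes a “word” is any whitespace-separated token, which is simpler than what a word processor does, and the function name is illustrative.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated tokens (a simplifying assumption)."""
    return len(text.split())


inflated = (
    "It would be suggested that we ought to record that at this instance of "
    "the meeting of the board of directors, the headcount at the meeting of "
    "the board of directors does not satisfy the requirements for a quorum "
    "as stipulated in the Articles of Incorporation for an instance of the "
    "meeting of the board of directors."
)
corrected = (
    "I suggest we record that this board meeting hasn’t reached the quorum "
    "requirements provided by the Articles of Association."
)

print(word_count(inflated))                          # 57
print(word_count(corrected))                         # 19
print(word_count(inflated) / word_count(corrected))  # 3.0
```

Under that assumption the two versions come to 57 and 19 words, so the padded sentence bills at three times the economical one.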

 

Honest Pricing

Fortunately, there are ways to price Chinese documents for translation transparently and honestly. I advocate two general principles: first, the results should be as consistent as possible and as closely tied as possible to the actual time needed to complete the translation; second, a translation buyer with limited knowledge of translation or of the Chinese language should be able to verify the word count of the file to be translated.

Microsoft Word, a universally recognized program, attempts to count the words in a document accurately in any language and provides relatively consistent results. A document written in German or English produces a word count, and a comparable word count is produced when a document written in Chinese is opened in Microsoft Word. Microsoft Word is therefore a commonly used way to count the actual number of Chinese words and characters in a document. Moreover, clients can easily verify the count themselves.
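As a rough illustration of this kind of count, the Python sketch below treats each Chinese character as one word and each run of Latin letters or digits (including hyphenated codes) as one word. It is an approximation written for the examples in this article, not Microsoft Word's actual algorithm; the Unicode ranges and the function name approx_word_count are assumptions.

```python
import re

# Assumption: one Han character = one "word"; one run of Latin letters or
# digits, optionally joined by hyphens or apostrophes, = one "word".
# Punctuation and spaces are not counted.
TOKEN = re.compile(
    r"[\u3400-\u4dbf\u4e00-\u9fff]|[A-Za-z0-9]+(?:[-'’][A-Za-z0-9]+)*"
)


def approx_word_count(text: str) -> int:
    """Approximate a word-processor style word count for mixed Chinese/English text."""
    return len(TOKEN.findall(text))


print(approx_word_count("建议"))                       # 2
print(approx_word_count("Capacitator 3945Q-ZA740"))   # 2
print(approx_word_count("建议, board meeting"))        # 4
```

Counted this way, a serial number adds one unit to the bill no matter how many digits it contains.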

Using Microsoft Word for word counts produces a very consistent relationship between the source and target versions of the same text. In fact, the results are very similar to what you would see when comparing Spanish and English translations. The ratio of the Spanish word count to the English word count is usually about 1.2:1, which allows for smooth and consistent pricing. The ratio of the Chinese word count to the English word count is also consistently around 1.2:1. Pricing therefore tracks the amount of actual work done more closely, without occasional extremely inflated word counts and without room for deliberate manipulation by the translator.
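A buyer can turn that ratio into a quick sanity check. The sketch below is illustrative only: the function names are hypothetical, the 1.2:1 ratio is the approximate figure given above, and 200,000 is used as a round stand-in for the “over 200,000” characters billed in the Hong Kong example.

```python
def expected_english_words(source_count: int, ratio: float = 1.2) -> int:
    """Estimate the English word count from a Chinese source count,
    assuming a source-to-English ratio of about 1.2:1."""
    return round(source_count / ratio)


def inflation_factor(billed_count: int, source_count: int) -> float:
    """How many times larger the billed count is than the verified source count."""
    return billed_count / source_count


source = 20_000  # Chinese characters in the Hong Kong example

print(expected_english_words(source))     # 16667
print(inflation_factor(200_000, source))  # 10.0
```

The estimate of about 16,700 English words is in line with the roughly 17,000 words that translation actually produced, while the billed figure is off by a factor of ten.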

 

How to Count Words in Scanned Documents

Scanned documents are more difficult to run word counts on because their content is generally not stored as copyable text. Nonetheless, this problem is easily surmounted. Optical Character Recognition (OCR) software has made rapid progress in the past few years, and several software platforms can now count the number of Chinese characters in a document with reasonable accuracy. In particular, I have found ABBYY FineReader effective for word counts. While it may not read the document content with full accuracy, FineReader will typically substitute the most similar-looking Chinese character for any character it misreads, so the count itself is largely unaffected. This is possible because Chinese typography uses even character spacing, so OCR software can place Chinese characters on a grid and make accurate assumptions about where each character sits. That geometric assumption would not be viable for languages using the Roman alphabet, where word spacing is very uneven.

Thus, even if the original document is a scan of an executed contract, we can still determine the number of characters in the original with reasonable reliability for most kinds of legal documents. So long as employees are directed to scan documents using a high-resolution flatbed scanner, and not with a flawed method such as taking cell phone pictures, which causes serious document distortion, we can easily and reliably count the number of characters in the source document. Signatures and other handwritten comments are usually transcribed into the document. At CBL Translations, we typically have our office staff correct text recognition errors to make the word counts more accurate.

The one main exception to the above is certificates and similar documents that are printed on Chinese forms, which are notoriously difficult for optical character recognition programs to interpret. These certificates might include certificates of incorporation or marriage certificates. The root of the difficulty often lies in how the text is arranged on these forms, especially when many seals are applied in a circular pattern. Certificates are best priced on a flat, per-certificate basis since the number of words on a certificate page does not vary much from one certificate to another. As with the above, I do not recommend that these be priced based on the number of English words that appear in the translated version of the document.

 

Conclusion

Translation companies have long taken advantage of translation buyers’ ignorance of the Chinese language when pricing translations from Chinese to English. In particular, dishonest translators have become adept at charging for non-word characters that do not need to be translated. And when pricing is based on the word count of the translated document, most Chinese translators actually spend time finding ways to bloat the word count meaninglessly in order to charge inflated prices.

There is no need to overpay for Chinese legal translations. Software tools, particularly Microsoft Word and ABBYY FineReader, make it easy to determine the number of real words in a document and ensure that you, the translation buyer, can verify the word count of the document you are sending off for translation. While computers may not yet be able to produce sophisticated Chinese word counts, they can produce reliable and accurate word count statistics that closely track the actual time needed to translate a document.

 
