For 20 years, a department within the University of Macau has been a pioneer in creating computer systems for Portuguese and Chinese translation used by government departments, law firms, schools, and translation companies.
The university’s Natural Language Processing and Portuguese-Chinese Machine Translation Laboratory (NLP2CT) recently developed and tested its next-generation online Chinese-Portuguese-English translation platform.
“A computer will never replace a human translator,” said Prof Derek F Wong, associate professor and head of the laboratory. “A professional interpreter is still far ahead. What our systems do is the basic translation, leaving people to do the more difficult and challenging work. We save their time and energy.”
Their platform functions as a “translation management system,” enabling users to monitor the progression of current translations as well as access a history and glossary of what has previously been translated.
Where some languages, through their close history, make for relatively straightforward translation, Portuguese and Chinese could not be more different.
Portuguese is a Latin-based language written in the Roman alphabet. As with most modern Romance languages – French, Italian, etc – nouns are masculine or feminine and verbs change according to the tense and subject. These particularities can present a challenge to non-native speakers.
On the other hand, Chinese is based on characters which do not change their form. Nouns have no gender; verbs do not change – the time element is expressed by adding a character like ‘past’ or ‘future’. The problem for students is that each character has to be learned from scratch; only Japanese and Korean speakers have a head start, because their languages use many of the same characters.
What Wong and his team at NLP2CT have done is create a machine translation programme that converts a text from the source language into the target language. Their neural machine translation (NMT) is a cutting-edge model that uses deep-learning algorithms to learn linguistic rules and statistical patterns on its own from parallel bilingual text corpora, and then applies them to translate text.
The previous rule-based (RBMT) and statistical machine translation (SMT) methods operated on symbolic representations, which made it impossible for them to abstract the meaning of a text. By contrast, NMT uses distributed vectors to represent words or sentences, which is a form of representation in semantic space. This allows NMT to generalise well across the similarities between words and the dependency relations among them.
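To make the idea of a semantic space concrete, here is a minimal sketch in Python. It is not the laboratory's code: the three-dimensional vectors are hand-picked for illustration, and the `cosine_similarity` helper is our own. In a real NMT system the vectors have hundreds of dimensions and are learned from data.

```python
import math

# Hypothetical 3-dimensional word vectors, hand-picked for illustration only.
# A real NMT system learns much larger vectors from parallel text.
TOY_VECTORS = {
    "gato": [0.9, 0.8, 0.1],  # Portuguese: cat
    "cão": [0.8, 0.9, 0.2],   # Portuguese: dog
    "lei": [0.1, 0.2, 0.9],   # Portuguese: law
}

def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

sim_animals = cosine_similarity(TOY_VECTORS["gato"], TOY_VECTORS["cão"])
sim_mixed = cosine_similarity(TOY_VECTORS["gato"], TOY_VECTORS["lei"])
print(f"gato vs cão: {sim_animals:.3f}")
print(f"gato vs lei: {sim_mixed:.3f}")
```

With these toy values, the two animal words score closer to each other than either does to the word for law. That kind of graded similarity is something a purely symbolic representation cannot express.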
The translation of a sentence is carried out in two steps. First, a neural network transforms the source sentence into a vector representation. Then another neural network uses this sentence representation to recover the sentence in the target language. The process is known as encoding and decoding. Instead of evaluating the translations of words or phrases as its predecessors did, NMT considers the entire sentence in translation. This makes the translations produced by NMT more accurate and fluent.
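The two steps can be sketched in a few lines of Python. This is an illustration of the data flow only, not the NLP2CT system: the vocabularies, the layer size and the names `encode` and `decode` are assumptions made for the example, and the weights are random rather than trained, so the words it emits are meaningless.

```python
import math
import random

HIDDEN = 4  # size of the sentence vector; real systems use hundreds of units
SRC_VOCAB = ["eu", "falo", "português"]         # toy Portuguese vocabulary
TGT_VOCAB = ["<s>", "</s>", "我", "说", "葡语"]  # toy Chinese vocabulary

rng = random.Random(0)  # fixed seed so the untrained weights are reproducible

def rand_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

# Untrained parameters (assumption: training would fit these to parallel text).
src_embed = {w: [rng.uniform(-1, 1) for _ in range(HIDDEN)] for w in SRC_VOCAB}
tgt_embed = {w: [rng.uniform(-1, 1) for _ in range(HIDDEN)] for w in TGT_VOCAB}
W_enc = rand_matrix(HIDDEN, HIDDEN)
W_dec = rand_matrix(HIDDEN, HIDDEN)
W_out = rand_matrix(len(TGT_VOCAB), HIDDEN)

def encode(tokens):
    """Step 1: fold the whole source sentence into one fixed-size vector."""
    state = [0.0] * HIDDEN
    for tok in tokens:
        mixed = matvec(W_enc, state)
        state = [math.tanh(m + e) for m, e in zip(mixed, src_embed[tok])]
    return state

def decode(sentence_vector, max_len=5):
    """Step 2: emit target words one at a time, starting from that vector."""
    state, prev, output = sentence_vector, "<s>", []
    for _ in range(max_len):
        mixed = matvec(W_dec, state)
        state = [math.tanh(m + e) for m, e in zip(mixed, tgt_embed[prev])]
        scores = matvec(W_out, state)
        prev = TGT_VOCAB[scores.index(max(scores))]  # greedy: take the top score
        if prev == "</s>":
            break
        output.append(prev)
    return output

vector = encode(["eu", "falo", "português"])
translation = decode(vector)
print(vector)
print(translation)
```

The point to notice is that `decode` never sees the individual source words, only the single vector produced by `encode`. That is what it means for the model to consider the entire sentence at once.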
Needs of the handover
Ahead of Macao’s handover to China in 1999, the government found itself with a large volume of Portuguese documents to translate but no tools to do it. There were systems to translate English into Chinese and vice versa, but not for Portuguese.
That’s where Wong’s team came in, developing the PCT (Portuguese-Chinese Translation) Dictionary for release in 1999. “It was the first such dictionary with pronunciation in Mandarin, Cantonese, and Portuguese; it was cutting edge,” Wong noted. “Before, Macao had no specialists in this area. Our university has developed outstanding students. Some have gone to work in large firms in the mainland, like Alibaba, or set up their own firms.”
The university registered the PCT system as a trademark; it is now in its third version.
Government interest in their research has grown in recent years, and the lab now receives support from the SAR government. In 2016, the Macao Polytechnic Institute built on its own history of training translators by setting up a similar laboratory, which Wong sees as “a good development.”
Wong’s team has worked with top universities in Portugal, Brazil and mainland China, such as Tsinghua University, to deepen its research and transform research results into marketable products for the Chinese-Portuguese translation market in Macao and the broader Greater Bay Area (GBA).
Their second product, PCT Assistente, debuted in 2003 as the first Chinese-Portuguese machine-aided translation system. While the first iteration of PCT Assistente was embedded directly in the Microsoft Word environment, Um2T, another product from the lab, offers an online interactive Chinese-Portuguese machine translation system.
Accessible to translators around the world, Um2T employs state-of-the-art neural machine translation architecture and technology. To help train language professionals, the university developed a Portuguese verb auto-analysis and generation system, as well as an online Portuguese language learning platform; both are accessed over 1,000 times a day.
“This is work that requires cooperation between different departments of the university,” Wong explained. “We work with teachers of Portuguese and specialists in language. We need these in addition to computer skills.”
Wong himself learned Portuguese in secondary school in Macao and studied for one year in Portugal. He and Dr Lidia S Chao lead a team of ten: five studying for PhDs, four for master’s degrees, and one research assistant.
The team at the laboratory has won numerous prizes. In 2012, it took a second prize at the first Macau Science and Technology Awards, in the Science and Technology Progress Award category, for the technologies and applications of its Portuguese-Chinese machine translation system.
Five years later, in 2017, the neural-based machine translation systems developed by the laboratory won several prizes in a competition organised under the 13th China Workshop on Machine Translation.
The team receives funding from the University of Macau and from the Science and Technology Development Fund (FDCT) in the form of grants for research projects. Less time has been devoted to selling its individual products in the market.
Meeting market demand
Demand is growing rapidly for high-quality machine translation systems in government, business and culture. One reason is the implementation of the Belt and Road Initiative, which has increased contacts in all sectors between China and Portuguese-speaking countries (PSC).
Another is Macao’s unique role as a bridge between China and the PSC, as well as its aim to become a ‘smart city’. Macao has the largest number of native Portuguese-speakers of any city in China, between 6,000 and 7,000. It also has an abundant supply of language professionals, with local universities offering courses in a number of languages.
In 2013, each government department translated on average 690,000 words, at an average cost of MOP1.1 million (US$123,655), and used about 101 hours of oral interpretation.
No wonder then that, according to official figures issued in 2014, 51 out of 55 SAR government departments surveyed said that they needed translators. They projected that, by 2020, government departments would need an additional 164 Chinese-Portuguese and 54 Chinese-English interpreters, an increase of 71 per cent over 2013.
While the primary language of the SAR government is Chinese, it must translate laws and major administrative announcements into Portuguese.
There is also a boom in the study of Portuguese throughout China. Now 45 universities in China, including those in Macao, offer courses in the language, up from just a handful 20 years ago. All this is a result of globalisation and the rapid increase in the exchange of goods and people, especially from China. Chinese companies are aggressively investing and trading in Brazil and other PSC.
According to Common Sense Advisory Research (CSA Research), the market for translation services in China will reach US$3.5 billion in 2020, up from US$2.8 billion in 2016.
They found that, in 2015, Europe accounted for the majority (53.9 per cent) of the world’s translation market, followed by North America (34.82 per cent), Asia (10.49 per cent), and Africa (0.11 per cent). While the US remains the largest single market for translation services – followed closely by Europe – Asia is the largest growth area for the industry.
A personal touch
For all of the advances made in machine learning, when heads of state meet each other, they are accompanied by interpreters – not a robot or computer.
This is because communication isn’t limited to words. Gestures, facial expressions, tonality, and even the words left unsaid play a role in these interactions. Many languages use wording that is intentionally ambiguous, and subtle elements can make all the difference.
When the stakes are this high – trade deals, alliances, armed conflicts – there’s no room for error.
That’s where a human touch is still imperative, to navigate the multitude of inputs that even Prof Wong’s high-quality machines still miss. For the rest of us, eager to explore and interact in this ever-globalising environment, tools like those developed at the University of Macau make understanding possible.
Text Ou Nian-le
Photos António Sanmarful