Machine Translation Unlikely Substitute For Human Decision-Making

Image: Unsplash

Are we finally on the brink of machine translation catching up to human translators? Recent coverage of neural machine translation (NMT) seems to suggest so. Beyond the justified skepticism about what machine translation (MT) can achieve, this attitude overlooks the choices we make in translations.

As Good As Human Translation?

The recently developed Google Neural Machine Translation has been hailed as producing superior results to (currently available) phrase-based machine translation and even approaching human translation quality. The translation company Systran has announced its own version of neural machine translation. However, as Kirti Vashee points out, Google’s method of scoring translations ends up overstating the actual improvements in the output. Experts interviewed by Slator also questioned the methodology used to assess the progress.

Many of the claims centered on whether neural machine translation was “nearly indistinguishable from human translation.” In fact, the basis for scoring Google’s translation was comparing machine-translated excerpts from a variety of texts with their human-translated counterparts. However, there is little discussion of what makes a good human translation or a good translation overall.

Automation and Decision-Making

At this stage of technological development, we would likely not brand a company with a computer-designed logo or publish computer-written fiction without having a human direct or edit the output (advances in computer-generated journalism notwithstanding). It is thought that when technology automates some of the menial jobs, humans will still be needed for the more creative tasks.

In my view, what makes these jobs hard to automate — at the current state of artificial intelligence — is the decision-making process involved. While a car being assembled has exact specifications of the final product, a logo or a marketing text is ostensibly a more open-ended task, where the final product isn’t obvious at the beginning of the process.

A “Perfect” Translation?

Image: Unsplash

Yet translation is somehow treated differently. It is tempting to discount the infinite-possibility decision-making process involved. After all, the source text has already been written, and it would seem that all decisions have been made. All that’s left to do is recast them in the other language. Indeed, Google limits its criteria of a “perfect” translation to “the meaning of the translation [being] completely consistent with the source, and the grammar [being] correct.”

This approach implies there is one correct translation, and the task of both human and machine translators is to arrive at it. However, there is arguably more than one acceptable output, depending on the purpose and target audience. A functional approach to translation postulates that

[It] is not the source text as such, or its effects on the source-text recipient, or the function assigned to it by the author, that determines the translation process, … but the prospective function or skopos of the target text as determined by the initiator’s, i.e. client’s, needs.

For example, the same public health brochure may justifiably have to be translated differently for a Russian-speaking population in the US than for a Russia-based target audience. The first group is more likely to be familiar with US-specific healthcare concepts, such as “co-pay” or “nurse practitioner,” whereas the second group will need an explanation or adaptation. The same is true for cases when an accurate translation evokes negative connotations.

Making the Choice

We see that most utterances in the source language allow for several adequate translations. Does that mean that machine translation that produces any of these potentially acceptable translations at random has fulfilled its purpose?

While I am not qualified to comment on the programming behind neural machine translation, published research indicates that the final choice takes into account how probable a given translation is in the set the NMT system was “trained on.” In other words, in the best-case scenario, NMT will pick a reasonable, grammatically correct, most likely translation based on its training dataset.

For many text types, this may be quite satisfactory. But is the most likely or the most common choice always the most appropriate one? Even after machine translation has surmounted the challenges of grammar and syntax, which is no small feat, I believe many clients and authors who care about their message will still rely on the judgment of the human translator — if only to make sure the machine made the right choice.
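To make the distinction concrete, here is a toy sketch in Python. It is not how an NMT system actually works (real systems score sequences of words, not whole ready-made renderings), and every name, candidate, and probability in it is made up for illustration. It only contrasts picking the most likely candidate with picking the candidate that suits a given audience.

```python
# Toy illustration only: the candidates and probabilities are invented.
candidates = {
    "literal rendering": 0.55,
    "rendering with a short explanation": 0.30,
    "culturally adapted rendering": 0.15,
}

# Hypothetical preferences a human translator might hold for each audience.
preferred_by_audience = {
    "Russian speakers in the US": "literal rendering",
    "readers in Russia": "rendering with a short explanation",
}


def most_likely(options):
    """Return the option with the highest probability."""
    return max(options, key=options.get)


def choose_for_audience(options, audience):
    """Return the rendering preferred for the audience, if it is among the
    acceptable options; otherwise fall back to the most likely option."""
    preferred = preferred_by_audience.get(audience)
    return preferred if preferred in options else most_likely(options)


machine_pick = most_likely(candidates)
human_pick = choose_for_audience(candidates, "readers in Russia")
print(machine_pick)
print(human_pick)
```

In this sketch both picks come from the same set of acceptable translations, but they differ for the Russia-based audience, which is exactly the kind of choice a client may want a human to confirm.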

Three False Assumptions About Loanwords in Russian

hi-tech gadgets
Image by Marco Bonomo

We know that languages borrow words for new technology or occupations. We know that a lot of these words come from English. It is easy to assume, then, that all cutting-edge technology must have originated in the English-speaking world and was exported everywhere else, along with its nomenclature.

While this is partly true in the case of Russian, which has borrowed many words to describe new devices, workflows, and professions that flourished in Russia in the 90s, there are important exceptions to the general trend.

1. Modern Technology Must Be Described in Loanwords

I noticed a curious thing in my interactions with Russian-speaking immigrants in Israel and the US: they would often use English loanwords for technology that everyone in Russia describes with a native word. For example, they would say “си-ти” (si-ti) for a CT scan, a medical procedure widely known and available in Russia under the name of КТ or компьютерная томография (kompyuternaya tomografiya). Similarly, elderly immigrants would say “месседж” (messedzh) when asking a caller to leave a voice mail message (the conventional Russian word is “сообщение,” soobscheniye).

This likely happened because these people left the USSR before the technologies they described, such as answering machines or CT scans, became widespread. They never learned the Russian words for these things and had to resort to the words they heard: the English ones. The takeaway here is to check whether there is an established native term before you settle for the borrowing.

2. All Loanwords Come From English

Image by Marc Chouinard

Another example I’ve run into is expatriate Russians saying “таблетка” (tabletka) for “tablet.” While this is the appropriate equivalent for the medicine you take, the touchscreen device is normally called “планшет” (planshet), a word borrowed from French. This word used to refer to a board for mounting maps and, later, a graphics tablet. When tablet computers appeared on the market, Russian simply expanded the definition of that word rather than borrowing a new one.

3. Every Loanword Has A Native Equivalent

What I have written so far seems to suggest there is always a native word for any new gadget, and you just need to look hard enough. However, many loanwords have long been accepted as the only official names for certain devices, such as принтер (printer) or сканер (scanner). These words are used in official documents such as the GOST certificates needed to sell a device in Russia.

Moreover, some recent borrowings have taken on a specific, narrower meaning that is not inherent in the native word. One example is менеджер (menedzher), roughly equivalent to the English “manager” but mostly reserved for management roles in the new types of companies introduced in the last 20 years. You could argue that the Russian word управляющий (upravlyayuschiy) describes the same occupation. However, that word evokes Chekhov’s plays and a male housekeeper left to look after an estate while the owners are abroad.

As with anything else in language, careful research is needed to make sure you are neither happily accepting any trendy borrowings nor ignoring long-standing, standard ones in your authoring or translation.

February Issue: Naïve Comments About Russian

Russian-Hebrew-English Keyboard
Image by aleazzurro

Having been exposed to several languages for most of my life (even though I was far from fluent in most of them), I am sometimes amused by the questions I get about Russian. I do not mean to be judgmental: I fully understand that a person who has never been exposed to a non-Latin-script language or, in the case of the US, to any language but English cannot be expected to understand how multi-script input and communication work. So I thought it was worth going over some of the questions I’ve been asked and providing a short explanation.

“How do you type in Russian?”

(I have actually been asked that)

If you have Windows on your computer, you can install additional input keyboards in the “Region and Language Options” menu in the Control Panel. Once you have several languages installed, you should see a language bar. You can switch languages by choosing one from the Language Bar list. However, I normally toggle languages by pressing Alt+Shift. You get used to doing that if you run a non-English Windows because you still need to switch to English to input URLs and email addresses.

“I take it you’re Greek”

(librarian checking out Russian books to me)

I think the reason for this one is that people recognize some of the letters they’ve seen in math problems or on fraternity houses. It is true that the Cyrillic alphabet is partly based on the Greek alphabet, as is the Latin alphabet. So, I can’t really blame people who aren’t regularly exposed to non-Latin alphabets for mistaking one for another.

Do you ever get naïve comments about your working language?