Think Human Translators Will Be Replaced By Machines? Not So Fast!

In line with the previous piece about corporate narratives discouraging cultural exploration and language learning, there is a corollary that I hear more often and sadly some people whom I respect very deeply still believe it:

Namely, the idea that translation, along with many other jobs, will be replaced entirely by machines (again, a lot of misinformation that I’m going to get into momentarily).

My father went so far as to say that my translation job wouldn’t be around in a few years’ time.

Iso an Jekob

I don’t blame him. He’s just misinformed by op-eds and journalists that seek to further an agenda of continued income inequality rather than actually looking at how faulty machine translation is. After all, fewer people believing that learning languages is lucrative means that fewer people learn languages, right? And money is the sole value of any human being, right?

I am grateful for machine translation, but I see it as a glorified dictionary.

But right now even the most advanced machine translation in the world faces hurdles that not only haven’t been overcome, but haven’t even been ADDRESSED.

I will mention this: if machine translation does end up reaching perfection, it will almost certainly be with very politically powerful languages very similar to English first. (The “Duolingo Five” of Spanish, French, Italian, German and Portuguese would be first in line. Other Germanic Languages, with the possible exceptions of Icelandic and Faroese, would be next.)

If the craft “dies” in part, it will be in this sector first (given that it is the “front line”). Even then, I deem it doubtful (although machine translation reaching perfection from English -> Italian is a thousand times more likely than it reaching perfection from English -> Vietnamese). But with most languages in the world, translators need not fear in the slightest that their jobs will be replaced by machines.

Because the less powerful you get and the further you get away from English, the more flaws show up in machine translation.

Let’s hop in:

 

  • Cultural References

 

Take a look at lyricstranslate.com (in which using machine translation is absolutely and completely forbidden). You’ll notice that a significant amount of the song texts come with asterisks, usually ones explaining cultural phenomena that would be familiar to a Russian- or a Finnish-speaker but not to a speaker of the target language. Rap music throughout the world relies heavily on many layers of meaning to a degree in which human translators need to rely on notes. Machine translation doesn’t even DO notes or asterisks.

Also, there’s the case in which names of places or people may be familiar to people who speak one language but not those who speak another. I remember in Stockholm’s Medieval Museum that the English translation rendered the Swedish word “Åbo” (a city known in English and most other languages by its Finnish name “Turku”) as “Turku, a city in southern Finland” (obviously the fluent readers of Scandinavian Languages needed no such clarification).

And then there are the references to religious texts, well-known literature, Internet memes and beyond. In Hebrew and in Modern Greek references to or quotes from ancient texts are common (especially in the political sphere) but machine translation doesn’t pick up on it!

When I put hip-hop song lyrics or a political speech into Google Translate and start to see a significant amount of asterisks and footnotes, then I’ll believe that machine translation is on the verge of taking over. Until then, this is a hole that hasn’t been addressed and anyone who works in translation of cultural texts is aware of it.

 

  • Gendered Speech

In Spanish, adjectives referring to yourself are different depending on your gender. In Hebrew and Arabic, you use different present-tense verb forms depending on your gender as well. In languages like Vietnamese, Burmese, and Japanese different forms of “I” and “you” contain gendered information and plenty of other coded information besides.

What happens with machine translation instead is that there are sexist implications (e.g. when translating from languages with a gender-neutral “he/she” pronoun, such as Turkic or Finno-Ugric Languages, it is more likely to assume that doctors are male and secretaries are female).

Machine Translation doesn’t have a gender-meter at all (e.g. one to pick whether “I” am a man, woman or other), so why would I trust it to take jobs away from human translators again?

On that topic, there’s also an issue with…

 

  • Formality (Pronouns)

 

Ah, yes, the pronouns that you use towards kids or the other pronouns you use towards emperors and monks. Welcome to East Asia!

A language like Japanese or Khmer has many pronouns and modes of address depending on where you stand relative to the person or crowd to whom you are speaking.

Use the wrong one and interesting things can happen.

I just went on Google Translate and, as I expected, they boiled down these systems into a pinhead. (Although to their credit, there is a set of “safe” pronouns that can more readily be used, especially as a foreign speaker [students are usually taught one of these to “stick to”, especially if they look non-Asian]).

If I expect a machine to take away a human job, it has to do at least as well. And it seems to have an active knowledge of pronouns in languages like these the way a first-year student would, not like a professional translator with deep knowledge of the language.

A “formality meter” for machine translation would help. And it would also be useful for…

 

  • Formality (Verb Forms)

 

In Finnish the verb “to be” will conjugate differently if you want to speak colloquially (puhekieli). In addition to that, pronouns will also change significantly (and will become shorter). There was this one time I encountered a student who had read Finnish grammar books at length and had a great knowledge of the formal language but NONE of the informal language that’s regularly used in Finnish-Language vlogging and popular music.

Sometimes it goes well beyond the verbs. Samoan and Fijian have different modes of speaking as well (and usually one is used for foreigners and one for insiders). There’s Samoan in Google Translate (and Samoan has an exclusive and inclusive “we” and Google Translate does as well with that as you would expect). I’m not studying Samoan at the moment, nor have I even begun, but let me know if you have any knowledge of Samoan and if it manages to straddle the various forms of the language in a way that would be useful for an outsider. I’ll be waiting…

 

  • Difficult Transliterations

 

One Hebrew word without vowels can be vowelized in many different ways and with different meanings. Burmese transliteration is not user-friendly in the slightest. Persian and Urdu don’t even have it.

If I expect a machine to take my job, I expect it to render one alphabet to another. Without issues.

 

  • Translation Databases Rely on User Input

 

This obviously favors the politically powerful languages, especially those from Europe. Google Translate’s machine learning relies on input from the translator community. I’ve seen even extremely strange phrases approved by the community in a language like Spanish. While I’ve seen approved phrases in languages like Yiddish or Lao, they’re sparse (and even for the most basic words or small essential phrases).

In order for machine translation to be good, you need lots of people putting phrases into the machine. The people who are putting phrases into the machine are those with access to computers, not ones who make $2 a day.

In San Francisco speakers of many languages throughout Asia are in demand as interpreters. A lot of these languages come from poor regions that can’t send a bunch of people to Silicon Valley to submit phrases into Google Translate.

What’s more, there’s the issue of government support (e.g. Wales put its governmental bilingual documents into Google Translate, resulting in Welsh being better off with machine translation than Irish. The Nordic Countries want to preserve their languages and have been investing everything technological to keep them safe. Authoritarian regimes might not have the time or the energy to promote their languages on a global scale. Then again, you also get authoritarian regimes like Vietnam with huge communities of expatriates that make tech support of the language readily available in a way that would make thousands of languages throughout the world jealous).

 

  • Developing World Languages Are Not as Developed in Machine Translation

 

Solomon Islands Pijin would probably be easier to manage in machine translation than Spanish, but it hasn’t even been touched (as far as I know). A lot of languages are behind, and these are languages spoken in poor rural areas in which translators and interpreters are necessary (my parents worked in refugee camps in Sudan, and you have NO IDEA how much interpreters of Tigre were sought after! So much so that charlatans became “improvisational interpreters”. You can guess how long that lasted.)

Yes, English may be the official language of a lot of countries in Africa and in the Pacific (not to mention India) but huge swathes of people living there have weak command of English or, sometimes, no command.

The Peace Corps in particular has tons of resources for learning languages that it equips its volunteers with. Missionaries have similar programs as well. Suffice it to say that these organizations are doing work with languages (spanning all continents) on a very deep level where machine translation hasn’t even VENTURED!

 

  • A Good Deal of Languages Haven’t Been Touched with Machine Translation At All

 

And some of this may also be in part due to the fact that some of them have no written format, or no standardized written format (e.g. Jamaican Patois).

 

  • Text-To-Speech Underdeveloped in Most Languages

 

I’m fairly impressed by Thai’s Text-to-Speech functionality in Google Translate, not to mention those of the various European Languages that have them (did you know that if you put an English text into Dutch Google Translate and have it read out loud, it will read you English with a Dutch accent? No, really!)

 

And then you have Irish, which has three different modes of pronunciation in addition to a hodge-podge “standard” that is mostly taught in schools and in apps. There is text-to-speech Irish out there, developed in Trinity College Dublin. It comes in multiple “flavors” depending on whether you want Connacht, Ulster or Munster Irish. While that technology exists, it hasn’t been integrated into Google Translate, in part because I think customization options are scary for ordinary users (although more of them may come in the future, can’t say I know because I’m not on the development team).

 

For Lao, Persian, and a lot of Indian regional languages (among many others), text-to-speech hasn’t even been tried. In order to fully replace interpreters, machine translation NEEDS that and needs it PERFECTLY. (And here I am stuck with a Google Translate that routinely struggles with Hebrew vowelization…)

 

  • Parts of Speech Commonly Omitted in Comparison to Other Languages

 

Some languages, like Burmese or Japanese, often form sentences without any variety of pronoun in the most natural way of speech. Instead of saying “I understand” in Burmese, you would literally say “ear go-around present-tense-marker” (no “I”, although you could add a version of “I” and it would still make sense). In context, I could use that EXACT same phrase about the ear going around to indicate “you understand”, “we understand” or “the person behind the counter understands”.

In English, except in the very informal registers (“got it!”), we usually need to include a pronoun. But if machine translation is to be good enough to use in sworn interviews and in legal proceedings, it should be able to manage when to use pronouns and when not to. Even in a language like Spanish, adding “yo” (I) versus omitting it is another delicate game to play, as is the case with most languages in which person-information is coded into the verb (yo soy – I am, but soy alone also means “I am”).

Now take a language like Rapa Nui (“Easter Island Language”). Conjunctions usually aren’t used (their “but” comes from Spanish as a loan word! [pero]). Now let’s say a machine has to translate from Rapa Nui into English. How will the “and”s and “but”s be rendered in a way that is natural to an English speaker?

 

Maybe the future will prove me wrong and machine translation will be used in courts instead of human beings. But I’ll come closer to believing it when these ten points are done away with SQUARELY. Until then, I’ll be very skeptical and assure the translators of the world that they are safe in their profession.

 

 


The Tongan Way

It was my life’s dream to begin learning languages of the Pacific since I was a kid. The fact that I haven’t done so for decades is confusing to me, but perhaps one reason I started it this late was because I needed to hone my techniques and confidence, both of which are required in greater depth if you want to take on languages that virtually no one you know of is likely to learn or speak.

Granted, I have been speaking Tok Pisin since 2014 and Bislama and Solomon Islands Pijin since 2016. What I mean when I say “Pacific Languages” are those truly indigenous to the region, and this year brought me into the arms of three of them specifically: Palauan, Tongan and Kiribati / Gilbertese. I’ve been extremely fascinated by all of them (and I would say at this juncture that Palauan is the hardest and Kiribati the easiest).

But today I’m going to talk about Tonga, because today is Tonga National Day.

Tonga is a country that continues to hold very strongly to its traditions, being an absolute monarchy even today, as well as one in which Christian identity is taken very seriously. What’s more I would venture that most people learning the Tongan Language might be doing so because they are missionaries.

The language itself is fascinating on every level and works unlike any other language I’ve seen. Let’s hop in!

The pronunciation, like that of many Austronesian languages (or “Languages of the Southern Islands” which stretch all the way from Madagascar to Easter Island / Rapa Nui), is extremely straightforward. You have a, e, i, o and u, pronounced virtually the EXACT same way as they would be pronounced in Spanish. If you see a line over any of these vowels, hold it a little bit longer. This principle, thanks to Finnish (which employs lengthened vowels very similarly but uses “aa” instead of “ā”), was not foreign to me.

Tongan also has a glottal stop, noted as the “ ‘ “ character. This is trickier, and it is pronounced with something like the breathing sound in the middle of “uh-oh”. (In singing this is extremely difficult to hear!)

Now let’s introduce you to what is probably the most commonly known Tongan word abroad, an interjection that literally serves ALL purposes (surprise, joy, anger, excitement), “ ‘oiaue!” Yup, all of the vowels and a glottal stop. Also super fun to say!

The consonants will not be slurred and, much like in a language like Hebrew, always have the exact same pronunciation! Given how similar they are to English that’s not something you need to worry about.

One aspect in which Tongan has really caused me to enter a world of mental gymnastics is the fact that, instead of indicating a tense by changing a verb, you use a “tense marker”.

Let’s give an example. “ ‘oku” is a present tense marker, so if you see it, a verb that follows it will be in the present tense. And PRONOUNS also change with tenses accordingly!

 

Ou -> present “I”

Ku -> past “I”

U -> future “I”

 

‘oku ou -> I am (lit. present-marker I-present)

Na’a ku -> I was (lit. past-marker I-past)

Te u -> I will be (lit. future-marker I-future)

 

You put verbs afterwards.
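As a rough sketch of the pattern above (my own illustration in Python, not a grammar reference), the marker and the matching “I” can be looked up by tense and joined together, with a verb placed afterwards. The names `TENSE_MARKERS`, `FIRST_PERSON` and `first_person_clause` are made up for this example, and it only covers the three forms listed above.

```python
# Tense markers and the tense-matched forms of "I", copied from the
# list above (markers written in lowercase). Nothing beyond the
# first person singular is covered here.
TENSE_MARKERS = {"present": "‘oku", "past": "na’a", "future": "te"}
FIRST_PERSON = {"present": "ou", "past": "ku", "future": "u"}


def first_person_clause(tense, verb=""):
    """Join tense marker + tense-matched "I" + an optional verb."""
    parts = [TENSE_MARKERS[tense], FIRST_PERSON[tense]]
    if verb:
        # The verb simply follows the marker and the pronoun.
        parts.append(verb)
    return " ".join(parts)


if __name__ == "__main__":
    for tense in ("present", "past", "future"):
        print(tense, "->", first_person_clause(tense))
```

The point of the two lookup tables is that the tense lives in the marker and the pronoun, not in the verb, so the verb string is passed through unchanged.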

Now there’s yet ANOTHER version of “I”, one that is utilized when it goes at the end of the sentence after a verb: “au”.

 

‘Oku ‘alo ki kolo ‘a au. = I am going to town.

Present-marker go all-purpose-preposition town verb marker I-post-verb-version

 

The various pronouns in Tongan are all calibrated in various ways accordingly, complete with exclusive and inclusive versions of “we” as well as a singular / dual / plural system.

 

Much like some other languages, Tongan also has an interesting way of asking “what”: you ask “Ko e hā e meʻa” = “what is the thing …”

It can be set up in other ways:

 

Ko e hā e lea faka-Tonga ki he _____?

What is the Tongan Word for _____?

 

The word “ko” is translated in so many ways and used in so many constructs that it’s dizzying to even think about how I would begin to describe it on paper. After all, I just got into Tongan a few months ago.

But you probably noticed something about “faka-Tonga”, which translates to “the Tongan Way”, something that is, obviously, at the center of the country’s national identity. Lea faka-Tonga, speaking in the Tongan Way, refers to the Tongan Language (of course).

Faka- as a prefix can also turn any noun into an adjective of the noun. Tonga is, of course, the Kingdom we all know and love (or, at least, we NOW know it and love it!). faka-Tonga turns it into an adjective. Not just Tongan, but an adjectival word that encapsulates everything that is the Tongan Way.

Another thing I didn’t find on the internet so far was how to say “why” in Tongan. That would be “ko e hā … ai” (it is a sentence construction, with the thing you are asking “why” about going in the area with the three dots).

In summary:

  • Pronunciation is extremely easy
  • Verbs don’t really change but you note the tense of a sentence with a word that indicates tense.
  • Pronouns also change for tense, and they can also change depending on sentence structure (there are four forms of the word “I” covered above, suited for different situations)
  • I didn’t touch upon it here, but possessives come in two classes. Read more about it on Wikipedia here: https://en.wikipedia.org/wiki/Tongan_language#Possessive_pronouns
  • Idiomatic differences and learning what constitutes a “natural” construct will be your biggest obstacle in learning Tongan (more than anything else would be).

 

What’s more, Tongan also (unsurprisingly) has a lot of English Loan Words, and much like in Japanese, they will be adapted to local spelling conventions. This includes NAMES sometimes!

 

David – Tēvita

Mary – Mele

Science / Scientist – Saienisi

 

Tonga has great music that makes you feel as though you’re on an island (no, REALLY!) as well as a history that features it on center stage locally many times (it has been described as “expansionist”). Pieces of Tongan culture have been featured on the global stage: the name of the national drink, Kava, COMES from Tongan; costumes from Disney’s Moana / Vaiana were modelled after the dress of the Tongan Royal Family; and Tonga has been a playable mini-civilization in a mode of Civilization V (No joke!!!)

Polynesia as a whole (not to mention the Pacific Island cultures as a group) has been featured in many aspects of both American and Japanese popular culture (it was the colonial frontier for both of them!) As a result, many aspects of any of these cultures will be oddly familiar to you, given how both American and Japanese popular culture have impacted the world.

I have a long way to go with speaking the Tongan way, but it’s been SUPER fun (as well as challenging!) and I can’t wait to see how well I speak Tongan in a year’s time!

And now a song that will get stuck in your head!

Happy Tonga Day, world!

LEA FAKA TONGA