Everything You Know About “How Many Languages Can a Human Learn and Maintain?” is WRONG. Here’s Why…

Possibly one of the most emotionally charged topics in the language learning world (and one that no one has good answers to, myself included) is the question of how many languages a human being can learn.

We will never know the answer to that question for way too many reasons. Here are some of them:

  • While most language enthusiasts haven’t thought about it (or have been put in a position to think about it), the language vs. dialect debate is getting increasingly muddy. Should the Caribbean English Creoles count as separate languages? The ISO 639-3 codes seem to think so. But would governments think so? How about universities? And obviously different areas where this question is more relevant will approach it differently (such as Jamaica and Italy, two completely different countries).

 

  • There is no definite way to quantify or even qualify proficiencies (except for, maybe, extended interviews on tape or eyewitness accounts of polyglots at conferences or gatherings). Even test results aren’t safe, given how many people may pass them and proceed to forget everything. (And if people can forget their native language, this is certainly also a possibility).

 

  • Human history and, by extension, history of human languages, is too long and too varied to take all the variables into account. I may have said this before in another one of my articles, but in some places like Western Africa or Melanesia, speaking ten languages is seen as normal. In many areas of the west, especially former British colonies, ten languages is seen as nearly superhuman if not in fact outright disbelieved by some people. This is despite the fact that there is no dearth of polyglot videos on the internet.

 

  • In addition to that, different areas of the world and different time periods would measure fluency differently. Mezzofanti, considered by some the greatest polyglot of all time, obviously had no use for words pertaining to computers in any of his languages, given that they did not exist when he was alive. He probably didn’t need to discuss complicated matters of science, either. Also (and this is another thing a lot of language gurus don’t even realize because of the languages they tend to choose) not all languages on the planet have that vocabulary. (In the event that you would talk about it, you would possibly use loanwords, primarily from a colonial language, or even switch into English or another colonial language periodically. And yes, some languages do have that vocabulary even though you might think they don’t.)

 

We will never know the answer to how many languages a human being can possibly know, and I highly encourage you to distrust ANYONE who tries to come up with an answer to the question. Because in attacking the question, they get the methodology wrong for all sorts of reasons.

 

Here are some of them:

 

  • Only taking into account their language experiences and those of their friend circle, which tend to be overwhelmingly skewed towards politically powerful languages of Europe and sometimes Asia. Dialect continuums are not accounted for. If you think that Italian and Spanish are the closest languages there are, give it some more thought. The Persian Languages are even closer, as are the “BCS” languages (Bosnian, Croatian and Serbian), not to mention my own pet languages, the Melanesian Creoles (Tok Pisin, Pijin and Bislama). Not all language counts are created equal, and this point alone would be capable of disqualifying the question altogether, but I’ll go on.

 

  • Not realizing that technology has changed and will continue to change. Mezzofanti didn’t have Memrise and many of the memory tools that I use on a daily basis. Technology has the capability of turning us into superhuman versions of our ancestors. An average person who has trained with contemporary first-person shooter games (which I never play, by the way) would have significantly better reflexes and hand-eye coordination than pretty much ANY soldier that fought in the Second World War. They would be considered SUPER SOLDIERS back then (this was a factoid I picked up from the 2016 Games for Change Conference). But for some reason almost no one considers that a similar thing is also happening for language learning and skill acquisition.

 

  • Using Ziad Fazah’s “Viva el Lunes” performance in order to automatically disqualify anyone who claims to speak 50+ languages. For those unaware, I’ll summarize it in one sentence. Liberian/Lebanese polyglot who won the Guinness Book of World Records title for most multilingual person goes on Chilean television, is tested and struggles even with basic sentences in most of his languages. But to dismiss any claims of that nature just because of ONE incident is a logical fallacy, and while I haven’t met anyone who has convincingly pulled off that number, I wouldn’t automatically revert to skepticism. The fact that one person may have overestimated his abilities doesn’t mean that we as a species should hold ourselves back. Who knows? There may be someone who actually speaks 59+ languages and who actually CAN show the skills. You never know!

 

I get it. A lot of people have deep insecurities, including many in the polyglot community. The temptation to knock others down or be dismissive only shows defensiveness and maybe a poor attempt to hide your own imposter syndrome. This is why I’m willing to consider anyone’s language proficiency based on claims alone (note I said “CONSIDER” not “definitively judge”, because there is no way to really do that.)

 

  • Using data about famous polyglots that have been dead for centuries (or even those that are STILL ALIVE) in order to draw conclusions as to what human beings in the 21st century can do. Really? In the case of the ones that have been dead for hundreds of years, they’re not relevant to our brains and our technology and our learning abilities NOW. Maybe they could be used in order to speculate about limits before the technological revolutions that happened during my lifetime, but we’re changing now and most people who answer the “how many languages is it possible to know?” question don’t acknowledge how contemporary technology sets our time period apart.

 

  • Different vocabulary thresholds for different languages. One person whose opinion I very much value said that a vocabulary of about 16,000 words was required to reach a C2 level (the highest possible level, considered equivalent to a highly educated native speaker) in a language. But here’s the thing: in Bislama (an English Creole that is the primary language of Vanuatu), there are literally about 4,000 words (excluding proper nouns, which would bring the count up to 7,000) IN THE ENTIRE LANGUAGE. So with some languages, one-quarter of that amount gets you a near-native vocabulary, an advantage not afforded to learners of languages like French and Swedish with significantly larger vocabulary lists (Swedish’s list of loan words from English ALONE is likely larger than the comprehensive vocabularies of the Melanesian Creoles COMBINED). And before you say “well, that’s just concerning Creole languages”, the same variation in comprehensive word counts can also be found the further away you get from the developed world AND the further you delve into languages without as much political support.

 

If there is a definitive limit for the number of languages learned, even to a high level, we will never know what it is, in part because of all of the factors that I lay out here.

 

It’s an interesting mental exercise that, let’s be honest, is usually used to discourage people and create skepticism so that some people can have their egos buttressed, but it’s one with no definitive answer. (In the Talmud, we end such debates with the word “teyku”, meaning “let it remain unresolved”. And that’s what we’re going to have to do with this debate as well.)

 

What do I intend to do? Well, for one, I’m going to try my best and learn many languages, some to fluency, others to degrees of curiosity, and fulfill MY vision. Because if you constantly live in fear of the judgment of others, you’ll never live your full life.

 

And that’s something you deserve to do! Don’t let ANY discouragement get you down!

come back when you can put up a fight

I really need to start using new pictures of myself.

The Biggest Mistake People Make at Language Social Events


I have been going to language exchange events for years now (although I’ve been showing up at them less frequently in 2018 due to reasons I cannot disclose quite yet). In some respects they actually teach me more about human psychology than they do about languages in general.

(It reminds me of the fact that, when I play Interactive Online / .io games, I actually learn more about human psychology rather than strategy as well. I will also never forget the time that someone named his/her character “press ctrl-w to go faster”.)

I’m sorry to have to say this but it really needs to be said: more often than not, seeing people interact at Language Exchange events makes me understand that most people don’t really know how to learn languages very well, for multiple reasons. I’ll go into why shortly.

If you attend a language exchange social event, the odds are heavily stacked in your favor if you want to learn (1) the local language (e.g. if you’re in Iceland, you’ll have many opportunities to learn Icelandic with natives, given that they’ll be the most commonly represented demographic) and (2) English (even if it isn’t the local language).

But what about someone who wants to learn Mandarin or French and speaks only a little bit of that and nothing else but English? You’re going to need to read this…because otherwise you may leave that event broken and discouraged, not to mention demotivated from ever returning.

Now, you’ve come here for the biggest mistake, so here it is:

The biggest mistake that people make at Language Social Events is not seeking to make gains with their languages when they interact with native speakers.

And EVEN if there are no native speakers of the language you want to speak present, feel free to bring some small books along that you can use to play “show and tell”. I did this most recently at an event aimed primarily at learners of Asian Languages (I turned out, not surprisingly, to be the only learner of Southeast Asian Languages there. But hey, maybe a Burmese or Lao enthusiast would show up and I needed to account for that chance. Besides, I could easily learn about other people’s cultures or even pick up words from languages I haven’t been actively learning).

I had some books on my person and one of them was a Jamaican Patois book. One of my friends who was a Mandarin native speaker didn’t speak Patois and didn’t have any interest in it, but I told him that Chinese languages influenced Jamaican culture in general, showed him the book, read him a few phrases and showed him pictures of Jamaica. That way, I made gains with a language that NO ONE there spoke. I also met someone at a party who was learning Malagasy and HE did very much the same thing to me (despite having no book). I really appreciated it because I have to say I don’t know much about Madagascar at all!

But if you meet native speakers of a language you are actively learning, let me tell you what I most often see versus what you should be doing:

What you should be doing: even if you’re not fluent, ask them to help you put together sentences or even form sentences in your target language while they “feed you words” (they’ll be happy to do this, I’ve done it with English and even with other languages I’m fluent in like Norwegian with other learners). Also ask them to provide details about their language as well as sentences or cultural tidbits that are likely to impress the NEXT native-speaker you meet.

What a lot of people do instead: ask small talk questions only using English. Use a handful of pre-programmed sentences in their target language(s) and spend most of the time using English instead. Use language exchange events as a means to flirt rather than to actually rehearse languages.

The primary key is that you leave having gained something. That something could be cultural know-how, phrases that will help you put together sentences better, or tips on improving your accent. You can even make gains with languages you aren’t actively learning! (I know because I’ve done this with languages like Japanese that I’m not learning at the moment nor do I have any plans to in the immediate future. I’ve also taught people basic phrases in languages like Burmese and Norwegian that they may never see themselves learning at all).

And now one thing I would consider: even if you intend to focus only on one language, I would recommend learning at least a LITTLE bit of a variety of other languages (feel free to do this even if you have no intention to learn them to fluency). This way, you’ll actually be able to start conversations more easily.

If you’re the only one who knows any Khmer, Oromo or Danish, you’ll have people asking you about it even if they have no intention to learn the language themselves. Even if you speak only a LITTLE bit, you can actually be the “local authority” on that language (as I’ve done WAAAAY too often).

You can even use this as a means to learn how to “teach” through an L2 you’ve been working on (and you may discover vocabulary gaps along the way). Most people who show up to these events are curious people and this is even MORE true if it’s a paid event.

A lot of people use English (or English + their native language) five-sixths of the time at language exchange events and wonder why they’re not making gains and why other learners are overtaking them. It isn’t about raw intelligence, it’s about the fact that language learners who put more in get more out. And you have to put effort in from EVERYWHERE in EVERY area of your life if you want the coveted prize of “near-native fluency” or even anything close to it.

Don’t enter without a plan as to what you want and how you’ll get it. Yes, I know you can’t control who will show up (maybe that Finnish speaker will be there, or maybe there won’t be anyone with whom to practice! Who knows?). But you should prepare for a wide range of situations based on what you’ve read about the event series and how you’ve experienced it in the past.

For most language exchange events in New York City, I’ll expect to use the Romance Languages with regularity. Speakers of Chinese languages, especially Mandarin and Cantonese, will be present with consistency, alongside speakers of Russian, Japanese, Korean, Turkish, languages from throughout South Asia and Arabic dialects that will usually lean towards Egypt and the rest of North Africa. Somewhat rarer than that but still frequent are Hebrew, Polish, Ukrainian, Yiddish and Persian Languages. Rarer still but showing up about once every two months or so are speakers of Nordic Languages, Turkic Languages of Central Asia (such as Kazakh and Uyghur) and languages of Southeast Asia. The rarest that I’ve encountered are speakers of African Languages, usually from South Africa and Ethiopia. Only once or twice have I encountered speakers of native languages of the Americas. I have never encountered anyone from Oceania at any language exchange event to date.

So think about who you encounter frequently and develop plans for what languages you KNOW you will practice there, what languages you are LIKELY to, and which languages you will probably NOT practice, but would LIKE TO.

Tl;dr always make gains with your L2 whenever you speak to a native speaker. Even if you’re not fluent, you can make those gains. The key is to get SOME progress on your language-learning, and you can always do that.

Have a good weekend!

My First Post of 2018: Looking Inside My Soul (+Happy Birthday, Slovakia!)

HAPPY NEW YEAR!

Let’s just do the lazy thing and get the list of goals for 2018 over with. Yes, it’s large, but I set very high standards for myself. Even if I don’t make them, I’ll ensure that I’ll still do very, very well!

  • Master Hungarian, Lao and Greenlandic (B2 or higher)
  • Get the Scandinavian Languages to C2 (understanding virtually EVERYTHING written or spoken)
  • Make significant gains with Hebrew, Finnish, French, Breton, Icelandic, Jamaican Patois and Sierra Leone Creole.
  • Gilbertese and Uyghur at B1 or higher
  • Learn Comorian to A1 at least.
  • Vincentian and Antiguan Creoles at C1 or higher
  • Brush off Russian, Irish, Cornish and Ukrainian (B2 in them would be great!)
  • Tongan, Palauan, Mossi, Welsh, Persian and several Indian languages to A2 or higher.
  • Learn Swahili, Khmer, Haitian Creole, Basque, Fijian and Fiji Hindi in earnest.
  • Colloquial Arabic dialects (esp. Sudanese) to A2
  • Diversify my language practicing materials.
  • Gloss articles in languages I speak and read and put versions of them online for learners making them “learner-friendly”.
  • Continue that same work of throwing away limiting beliefs and practice all of my languages for 3 minutes a day at least one day a week.
  • Come out with a new polyglot video every season (Winter / Spring / Summer / Autumn). They don’t have to showcase ALL of my languages at once, but at least show something.
  • Start a “Coalition Blog” with folks like Kevin Fei Sun, Miguel N. Ariza and Allan Chin and … anyone else I forgot! Guests welcome!

Also, no new languages for 2018. I will make exceptions for picking up new languages for travel, business purposes or relationships that sprout up as a result of various happenings.

Anyhow, with each passing year it occurs to me that what becomes more and more important is not so much learning new words and expressions but rather developing mental strategies.

I could be fluent in a language but if I’m in a negative headspace words will elude me. I’m certain that anyone reading this has also had this happen when speaking their NATIVE LANGUAGE.

Anyhow, here are some difficulties I’ve been noticing:

  • I remember from “Pirkei Avot” (a Jewish text about ethics and life in general that I’ve periodically mentioned on this site) that it is said that “the reward for a good deed is another good deed, and the reward for a bad deed is another bad deed”. Namely, positive feedback ensures that you’re likely to continue to speak and act in your most optimal manner, and negative feedback will drag you down in a similar way.

I’ve noticed this at Mundo Lingo. I speak the Scandinavian Languages “very, very well” (that’s what Richard Simcott told me, so I believe him). So when there’s a Swedish native speaker who shows up, I’m in a good head-space and then I speak languages that I usually am not so good at (French, for example) better than I normally do.

 

On the other hand, sometimes I’ve heard racist comments at Mundo Lingo (yes, it does happen!) or people disparaging me for my choice of languages. As a result, I’m in no good headspace to do anything, because it feels like I’ve been “wounded” and I act accordingly.

 

I think one way to counter this is to usually start the day with some good feedback. One of my New Year’s Resolutions was to post daily in a closed group called “Polyglot Polls” (you can join if you’d like! Just let me know) Given that a lot of open-minded and curious people are in that group, ones who mutually support each other with their missions, it helps put me in a good headspace. It is a good thing to start any day with.

 

  • Imposter syndrome in the polyglot community runs a bit like a fear of turning out like Ziad Fazah, the polyglot who claimed to fluently speak 59 languages and, on live television…well, he was asked what day of the week it was in Russian and said that he couldn’t understand it because it was Croatian.

 

Only this past weekend I was asked to count to ten in Tongan (a language that I am weak at) and, sadly, I couldn’t do it. But I don’t claim to speak Tongan fluently. But still I felt down.

 

I think moments like these are good for recognizing my weak points. Even in our native languages, we have them. It’s not a sign that you’re a fake; it’s a sign that you have something that needs patching. That’s what life is: telling you where you aren’t doing well and bringing you onto the path to recovery.

 

Unlike Ziad, I don’t claim to have any divine gift for languages. I just spend a lot of time struggling with things until I get them. The contemporary schooling models have taught us that learning isn’t supposed to be about struggling. That’s not true in the slightest, certainly not at the advanced levels of anything.

 

  • The last one: sometimes I feel that I’m falling into the trap of thinking that I became a polyglot for the sake of others rather than for my own sake.

Again, to tie in Jewish themes, in studying holy texts and observing ritual we use a phrase “Leshem Shamayim” – literally, “To the name of Heaven”, figuratively, “for heaven’s sake” and more figuratively “doing something for love of the subject-matter rather than for acquiring validation, reputation, praise or any other contemporary form of social currency”.

Every dream chaser has felt poised between doing something “leshem shamayim” and doing something for the sake of personal gain or admiration of others. I have to resist that, now more strongly than ever.


Professor Alexander Arguelles (right) and yours truly, Jared Gimbel (left)

On a side note, I’d like to wish my Slovak and Slovak-speaking friends a happy Independence Day!

May 2018 be full of blessings, everyone!

Think Human Translators Will Be Replaced By Machines? Not So Fast!

In line with the previous piece about corporate narratives discouraging cultural exploration and language learning, there is a corollary that I hear more often and sadly some people whom I respect very deeply still believe it:

Namely, the idea that translation, along with many other jobs, will be replaced entirely by machines (again, a lot of misinformation that I’m going to get into momentarily).

My father went so far as to say that my translation job wouldn’t be around in a few years’ time.

Iso an Jekob

I don’t blame him, he’s just misinformed by op-eds and journalists that seek to further an agenda of continued income inequality rather than actually looking at how machine translation is extremely faulty. After all, fewer people believing that learning languages is lucrative means that fewer people learn languages, right? And money is the sole value of any human being, right?

I am grateful for machine translation, but I see it as a glorified dictionary.

But right now even the most advanced machine translation in the world has hurdles that it not only hasn’t gotten over, but that haven’t even been ADDRESSED.

I will mention this: if machine translation does end up reaching perfection, it will almost certainly be with very politically powerful languages very similar to English first. (The “Duolingo Five” of Spanish, French, Italian, German and Portuguese would be first in line. Other Germanic Languages, with the possible exceptions of Icelandic and Faroese, would be next.)

If the craft “dies” in part, it will be in this sector first (given that it is the “front line”). Even then, I deem it doubtful (although machine translation reaching perfection from English -> Italian is a thousand times more likely than it reaching perfection from English -> Vietnamese). But with most languages in the world, translators have no reason to fear in the slightest that their jobs will be replaced by machines.

Because the less politically powerful a language is and the further it gets from English, the more flaws show up in machine translation.

Let’s hop in:

 

  • Cultural References

 

Take a look at lyricstranslate.com (in which using machine translation is absolutely and completely forbidden). You’ll notice that a significant number of the song texts come with asterisks, usually ones explaining cultural phenomena that would be familiar to a Russian- or a Finnish-speaker but not to a speaker of the target language. Rap music throughout the world relies on many layers of meaning, to such a degree that human translators need to rely on notes. Machine translation doesn’t even DO notes or asterisks.

Also, there’s the case in which names of places or people may be familiar to people who speak one language but not those who speak another. I remember in Stockholm’s Medieval Museum that the English translation rendered the Swedish word “Åbo” (a city known in English and most other languages by its Finnish name “Turku”) as “Turku, a city in southern Finland” (obviously the fluent readers of Scandinavian Languages needed no such clarification).

And then there are the references to religious texts, well-known literature, Internet memes and beyond. In Hebrew and in Modern Greek references to or quotes from ancient texts are common (especially in the political sphere) but machine translation doesn’t pick up on them!

When I put hip-hop song lyrics or a political speech into Google Translate and start to see a significant number of asterisks and footnotes, then I’ll believe that machine translation is on the verge of taking over. Until then, this is a hole that hasn’t been addressed and anyone who works in translation of cultural texts is aware of it.

 

  • Gendered Speech

In Spanish, adjectives referring to yourself are different depending on your gender. In Hebrew and Arabic, you use different present-tense verb forms depending on your gender as well. In languages like Vietnamese, Burmese, and Japanese different forms of “I” and “you” contain gendered information and plenty of other coded information besides.

What happens with machine translation instead is that there are sexist implications (e.g. when translating from languages with a gender-neutral “he/she” pronoun, such as the Turkic or Finno-Ugric Languages, it is more likely to assume that doctors are male and secretaries are female).

Machine Translation doesn’t have a gender-meter at all (e.g. to pick whether “I” am a man, woman or other), so why would I trust it to take jobs away from human translators again?

On that topic, there’s also an issue with…

 

  • Formality (Pronouns)

 

Ah, yes, the pronouns that you use towards kids or the other pronouns you use towards emperors and monks. Welcome to East Asia!

A language like Japanese or Khmer has many pronouns and modes of address depending on where you stand relative to the person or crowd to whom you are speaking.

Use the wrong one and interesting things can happen.

I just went on Google Translate and, as I expected, they boiled down these systems into a pinhead. (Although to their credit, there is a set of “safe” pronouns that can more readily be used, especially as a foreign speaker [students are usually taught one of these to “stick to”, especially if they look non-Asian]).

If I expect a machine to take away a human job, it has to do at least as well. And it seems to have an active knowledge of pronouns in languages like these the way a first-year student would, not like a professional translator with deep knowledge of the language.

A “formality meter” for machine translation would help. And it would also be useful for…

 

  • Formality (Verb Forms)

 

In Finnish the verb “to be” will conjugate differently if you want to speak colloquially (puhekieli). In addition to that, pronouns will also change significantly (and will become shorter). There was this one time I encountered a student who had read Finnish grammar books at length and had a great knowledge of the formal language but NONE of the informal language that’s regularly used in Finnish-Language vlogging and popular music.

Sometimes it goes well beyond the verbs. Samoan and Fijian have different modes of speaking as well (and usually one is used for foreigners and one for insiders). There’s Samoan in Google Translate (and Samoan has an exclusive and inclusive “we” and Google Translate does as well with that as you would expect). I’m not studying Samoan at the moment, nor have I even begun, but let me know if you have any knowledge of Samoan and if it manages to straddle the various forms of the language in a way that would be useful for an outsider. I’ll be waiting…

 

  • Difficult Transliterations

 

One Hebrew word without vowels can be vowelized in many different ways and with different meanings. Burmese transliteration is not user-friendly in the slightest. Persian and Urdu don’t even have it.

If I expect a machine to take my job, I expect it to render one alphabet to another. Without issues.

 

  • Translation Databases Rely on User Input

 

This obviously favors the politically powerful languages, especially those from Europe. Google Translate’s machine learning relies on input from the translator community. I’ve seen even extremely strange phrases approved by the community in a language like Spanish. While I’ve seen approved phrases in languages like Yiddish or Lao, they’re sparse (and even for the most basic words or small essential phrases).

In order for machine translation to be good, you need lots of people putting phrases into the machine. The people who are putting phrases into the machine are those with access to computers, not ones who make $2 a day.

In San Francisco speakers of many languages throughout Asia are in demand as interpreters. A lot of these languages come from poor regions that can’t send a bunch of people to Silicon Valley to submit phrases into Google Translate.

What’s more, there’s the issue of government support (e.g. Wales put its governmental bilingual documents into Google Translate, resulting in Welsh being better off with machine translation than Irish. The Nordic Countries want to preserve their languages and have been investing in every technology available to keep them safe. Authoritarian regimes might not have the time or the energy to promote their languages on a global scale. Then again, you also get authoritarian regimes like Vietnam with huge communities of expatriates that make tech support of the language readily available in a way that would make thousands of languages throughout the world jealous).

 

  • Developing World Languages Are Not as Developed in Machine Translation

 

Solomon Islands Pijin would probably be easier to manage in machine translation than Spanish, but it hasn’t even been touched (as far as I know). A lot of languages are behind, and these are languages spoken in poor rural areas in which translators and interpreters are necessary (my parents worked in refugee camps in Sudan, you have NO IDEA how much interpreters of Tigre were sought after! To such a degree that charlatans became “improvisational interpreters”, you can guess how long that lasted.)

Yes, English may be the official language of a lot of countries in Africa and in the Pacific (not to mention India) but huge swathes of people living there have weak command of English or, sometimes, no command.

The Peace Corps in particular has tons of resources for learning languages that it equips its volunteers with. Missionaries have similar programs as well. Suffice it to say that these organizations are doing work with languages (spanning all continents) on a very deep level where machine translation hasn’t even VENTURED!

 

  • A Good Deal of Languages Haven’t Been Touched with Machine Translation At All

 

And some of this may also be in part due to the fact that some of them have no written format, or no standardized written format (e.g. Jamaican Patois).

 

  • Text-To-Speech Underdeveloped in Most Languages

 

I’m fairly impressed by Thai’s Text-to-Speech functionality in Google Translate, not to mention those of the various European Languages that have them (did you know that if you put an English text into Dutch Google Translate and have it read out loud, it will read you English with a Dutch accent? No, really!)

 

And then you have Irish, which has three different modes of pronunciation in addition to a hodge-podge “standard” that is mostly taught in schools and in apps. There is text-to-speech Irish out there, developed in Trinity College Dublin. It comes in multiple “flavors” depending on whether you want Connacht, Ulster or Munster Irish. While that technology exists, it hasn’t been integrated into Google Translate, in part because I think customization options are scary for ordinary users (although more of them may come in the future; can’t say I know because I’m not on the development team).

 

For Lao, Persian, and a lot of Indian regional languages (among many others), text-to-speech hasn’t even been tried. In order to fully replace interpreters, machine translation NEEDS that and needs it PERFECTLY. (And here I am stuck with a Google Translate that routinely struggles with Hebrew vowelization…)

 

  • Parts of Speech Commonly Omitted in Comparison to Other Languages

 

Some languages, like Burmese or Japanese, often form sentences without any variety of pronoun in the most natural way of speech. Instead of saying “I understand” in Burmese, you would literally say “ear go-around present-tense-marker” (no “I”, although you could add a version of “I” and it would still make sense). In context, I could use that EXACT same phrase about the ear going around to indicate “you understand”, “we understand” or “the person behind the counter understands”.

In English, except in very informal registers (“got it!”), we usually need to include a pronoun. But if machine translation is to be good enough to use in sworn interviews and in legal proceedings, it should be able to manage when to use pronouns and when not to. Even in a language like Spanish, adding “yo” (I) versus omitting it is another delicate game to play, as is the case with most languages in which person information is coded into the verb (“yo soy” means “I am”, but “soy” on its own also means “I am”).

Now take a language like Rapa Nui (“Easter Island Language”). Conjunctions usually aren’t used (their “but” comes from Spanish as a loan word! [pero]). Now let’s say a machine has to translate from Rapa Nui into English: how will the “and”s and “but”s be rendered in a way that is natural to an English speaker?

 

Maybe the future will prove me wrong and machine translation will be used in courts instead of human beings. But I’ll come closer to believing it when these ten points are done away with SQUARELY. Until then, I’ll be very skeptical and assure the translators of the world that they are safe in their profession.

 

 


“Is Tajikistan a Real Country?” – Introducing the Tajiki Language

Happy Persian New Year!

забони тоҷикӣ.jpg

 

The most money I’ve ever spent on a language learning book. Came with a CD. Can’t imagine there are too many books that can say that about themselves in 2017.

 

In late 2016 and early 2017 I thought it would be becoming of me to try to learn a language of a Muslim-majority country for the first time. Yes, I did get the Turkish trophy in Duolingo, but I don’t count that because the number of Turkish phrases I can say as of the time of writing can be counted on my fingers.

The same way that the Catholic world is very varied (you have Brazilians and Hungarians and Mexicans and too many nations in Sub-Saharan Africa to list), the Muslim world is just as varied, with numerous flavors and internal conflicts that Hollywood and American pop culture not only don’t show very often but actively try to hide (or so I feel).

While I am not fluent (nor do I even count myself as proficient) in Tajik, I am grateful for the fact that I can experience tidbits of this culture while being very far away from it, and it seems oddly familiar to me for reasons I can’t quite explain.

What’s more, Tajik is one of three Persian languages (the others being Farsi in Iran and Dari in Afghanistan), and so I can converse with speakers of all three with what little I have. I remember being shocked about how close Swedish, Norwegian and Danish were to each other (to those unaware: even closer than Spanish, Catalan and Italian), and I was even more shocked at how close the three Persian languages are: so close that there are those (both on the Internet and in my friend group) who consider them dialects of a single language (yes, I’ve had the same discussion with the Melanesian Creole languages!)

As a Jewish person myself (and an Ashkenazi Jew at that; for those unaware, that means that my Jewish roots are traced to Central-Eastern Europe), I was intrigued by Tajik in particular as the language of the Bukharan Jewish community.

(Note: Bukhara is in contemporary Uzbekistan, and if you see where Uzbekistan, Kyrgyzstan and Tajikistan meet on a map and you have a hunch that imperialist meddling may have been responsible for those borders, then you’re absolutely right!)

What’s more, my father visited Iran and Afghanistan earlier in his life but when he was there the USSR was “still a thing”.

I also had a fascination with Central Asia as a teenager ever since I heard the words “Kazakhstan”, “Uzbekistan”, “Tajikistan”, etc. (despite the fact that I literally knew NOTHING about these places aside from their names, locations on a map, and capitals), and so between Persian languages I knew which one I would try first.

It has been hard, though! With Tajik I’ve noticed that there is a gap in online resources—a lot of stuff for beginners and for native speakers (e.g. online movies) and virtually NOTHING in between (save for the Transparent Language course that I’m working on).

Thankfully knowing that I have surmounted similar obstacles with other languages (e.g. with Solomon Islands Pijin) fills me with determination.


 

I’m sorry. No more “Undertale” jokes for a while.

 

Anyhow, what makes Tajik unique?

 

  1. Tajik is Sovietized

 

The obvious difference between the other Persian languages and Tajik is the fact that Tajik is written in the Cyrillic alphabet, and much like Hebrew or Finnish, is pronounced the way that it is written with almost mathematical precision (despite some difficult-to-intuit shenanigans with syllable stress).

 

Not using the Arabic alphabet obviously makes Tajik a lot easier for people who may not be familiar with that script.

 

Yes, in a lot of the countries in Central Asia (especially in Turkmenistan and Uzbekistan) there are some issues with what alphabet is used (and if you think that this has to do with dictators forcing or adopting certain systems, you’d be right!). Tajik, I’ve noted, is very consistent in its usage of the Cyrillic alphabet, although the other two Persian languages obviously show up (e.g. on comment boards) almost whenever Tajik does.

 

But what exactly does “sovietization” entail? Well, there are a lot of words in Tajik that come from Russian, and ones that were probably adopted for administrative purposes. The words for an accident (“avariya”) and toilet (“unitaz”), for example, are Russian loan words.

 

But unlike the Arabic / Turkic words in Tajik, a lot of these loan words refer specifically to objects and things related to administration (the concept of the “Familia” [=family name], for example).

 

And this brings us to…

 

 

  2. There are a lot of Arabic loan words in Tajik.

This is something that is common to many languages spoken by Muslims.  As I noted in my interview with Tomedes, it occurred to me that the usage of Arabic words in a language like Tajik very eerily paralleled the usage of Hebrew words in Yiddish. Yiddish uses a Hebrew greeting frequently (Shulem-Aleikhem! / Aleikhem Shulem!), and Tajik uses its Arabic equivalent (Salom! / Assalomu alejkum! / va alajkum assalom!).

In case you are curious as to why the “o” is used in Tajik in the Arabic-loan phrase above, this has to do with the way that these words mutated when they entered Tajik, the same way that (wait for it!) Hebrew words changed their pronunciation a bit when they entered Yiddish! (Yaakov [Jacob] becoming “Yankev”, for example)

These Arabic loan words found their way not only into the other Persian languages but also throughout Central Asia and into the Indo-Aryan languages (spoken in Northern India)!

 

  3. Tajik uses pronouns to indicate possessives

 

Should probably clarify this with an example:

Nomi man Jared (my name is Jared)

Kitobi shumo (Your book)

Zaboni Tojiki (Tajik Language)

 

Man = I

Shumo = you (polite form)

 

This means that forming possessives becomes easy once you grasp the concept of Izofat.

Cue the Tajiki Language book in the picture above (on page 135, to be precise)

 

“Izofat is used to connect a noun to any word that modifies it except numbers, demonstratives, the superlative form of adjectives and a few other words. It consists of “i” following the noun and is always written joined to the noun. It is never stressed, the stress remains on the last syllable of the noun.

 

Kitobi nav – a new book

Mardi khub – a good man

Zani zebo – a beautiful woman

Donishjui khasta – a tired student”

 

(And this is the point when it occurs to you that “Tajiki”, the name some use for the language, ends in an “i” as well, although strictly speaking that one is an adjective-forming suffix, written “ӣ” in Cyrillic, rather than Izofat itself. Tajik = person, Tajiki = language or general adjective, although enough people don’t make the distinction, to the degree that even Google Translate refers to the language as “Tajik”)

Thanks to Izofat, a lot of the words are not extraordinarily long (much like in English), sparing you the pains of a language like German or Finnish (much less something like Greenlandic) in which a word may require you to dissect it.

 

  4. Hearing Tajik can be an Enchanting Experience for Those Who Know Iranian Persian or Dari

 

Ever heard someone from a very different generation use a word you can recognize but don’t use yourself? (For me in my 20s, this means someone using the word “billfold” to refer to a wallet, “marks” for grades, etc.)

 

In using my Tajik with speakers of the other two Persian languages, I’ve often heard “that makes sense to me, and it’s correct, but it has fallen out of usage in my country”, a bit like you might be able to understand idioms of Irish English or English as spoken in many Caribbean island nations, although you might not be able to use them yourself…including some you actually legitimately don’t know!

 

Unlike with, let’s say, speaking Danish to a Swedish person (did that only ONCE!) and not being understood, I haven’t had problems being understood in Tajik, although I usually have to explain why I speak Tajik and not Farsi (answer: curiosity + my father didn’t get to visit there, but maybe I will! + Central Asia and the -stan countries are KEWL).

 

I would write more about how to learn it and how to use it, but the truth is that I’m sorta still a novice at Tajik, so maybe now’s not the best time.

But hey! Tajikistan’s independence day is in September, so if I progress enough by then you’ll get treated to something!

Soli nav muborak! A Happy New Year!