Sunday, October 18, 2015

Word order and syntax. They matter.

So, a translator friend alerted me to this list of the five best language translation tools and noted that both Bing and Google Translate are on it. (Two of the nominated apps seem to be mainly for signs and menus and the like.) Well, if you read my blog you know how I feel about both of those tools. And here's another short example.

Mikheil Saakashvili, the ex-president of Georgia who's now governor of the Odesa Oblast in Ukraine, has a habit of posting to his Facebook page in both Ukrainian and Russian. Here's something he wrote yesterday (pics at the bottom of the post). Each version is 7 sentences, 64 words in all:
Сьогодні мав честь вручити нагороду «Народний Герой України» приголомшливим людям. Серед них Мустафа Джамільов - великий татарин і патріот України. Мене вразив Вадим Довгорук. Йому в Дебальцевому гранатою відірвало руку, а потім лікарі були змушені ампутувати обидві ноги через обмороження. Він залишився служити в ВС і пішов вчитися на військового психолога. Дивовижно сміливий, життєрадісний хлопець. Народ, у якого є такі герої, ніколи не перемогти.

Сегодня имел честь вручить награду «Народный Герой Украины» потрясающим людям. Среди них Мустафа Джамилев – великий татарин и большой патриот Украины. Меня поразил Вадим Довгорук. Ему в Дебальцево гранатой оторвало руку, а потом врачи были вынуждены ампутировать обе ноги из-за обморожения. Он остался служить в ВС и пошел учиться на военного психолога. Удивительно смелый, жизнерадостный парень. Народ, у которого есть такие герои, никогда не победить.
Both programs had a terrible time with this little text. Depending on how you feel about translating proper nouns, there are either 12 or 15 major errors, and several are serious enough to make the translation simply unusable.

Bing labeled its efforts "Partly translated by Bing" as several words were beyond it:
Today had the honor to award "national hero of Ukraine» terrific people. Among them is Mustafa Džamìl′ov-the great Tatar and patriot of Ukraine. Impressed Vadim Dovgoruk. In Debal′cevomu a grenade shot the arm, and then the doctors were forced to amputate both legs due to frostbite. He left to serve in the US and went to study at the military psychologist. Amazingly daring, cheerful guy. People who are heroes, never win.

Today had the honour to give the award of the "people's hero of Ukraine" amazing people. Among them, Mustafa Джамилев-great tartar and a great patriot of Ukraine. I was struck by the vadim довгорук. Him in debaltseve grenade lost his arm, and then the doctors were forced to amputate both legs because of frostbite. He stayed to serve in the sun and went to learn from the military psychologist. Surprisingly brave, cheerful guy. The people, which has such heroes, never going to win.
And here's how Google handles those paragraphs:
Today had the honor to present the award "National Hero of Ukraine" terrific people. Mustafa Dzhamilov Among them - a great patriot Tartar and Ukraine. I was struck by Vadim Dolgoruky. Debaltseve grenade into his hand blown off, and doctors had to amputate both legs due to frostbite. He left to serve in the Armed Forces and went to study at a military psychologist. Wonderfully bold, cheerful guy. People, who are these heroes never win.

Today had the honor of presenting the award "People's Hero of Ukraine" terrific people. Among them, Mustafa Jamil - the great Tartar and a great patriot of Ukraine. I was struck by Vadim Dovgoruk. Him Debalcevo grenade severed hand, and then the doctors were forced to amputate both legs due to frostbite. He was to serve in the Armed Forces and went to study at a military psychologist. Surprisingly bold, cheerful guy. People that has such heroes, never win.
Here we go:
  1. Сьогодні мав честь / Сегодня имел честь : This is perhaps the most minor of the errors. They have not restored the personal pronoun, which is easily and routinely dropped in East Slavic languages ("today I had the honor").

  2. вручити нагороду «Народний Герой України» приголомшливим людям / вручить награду «Народный Герой Украины» потрясающим людям : They handled "present the award" in various ways, all acceptable, but then they both ignored the case ending on людям. It's in the dative, so it should be "to (amazing/terrific) people".

  3. Мустафа Джамільов / Мустафа Джамилев: his name is generally transliterated as Mustafa Dzhemilev. Between them they offer us Džamìl′ov, Dzhamilov, Jamil, and Джамилев.

  4. татарин is a Tatar. "Tartar" is an old-fashioned variant.

  5. великий татарин і патріот України / великий татарин и большой патриот Украины. Much worse is how Google handles the Ukrainian phrase that "Tatar" is in: "a great patriot Tartar and Ukraine". First it moves "patriot" where it doesn't belong, and then it ignores the case ending, turning "of Ukraine" into a bare "Ukraine". A note: in Russian, Saakashvili uses two different words for "great" - the first one might best be translated as "eminent", though "great" will certainly do. Ukrainian uses the same word for both senses ("great, eminent" and "great, huge"). At any rate, Dzhemilev is being described as two things: a great/eminent Tatar and a huge patriot of Ukraine (or "Ukrainian patriot").

  6. Мене вразив Вадим Довгорук. / Меня поразил Вадим Довгорук. Both languages use the same syntax here - OVS. It's a common way to stress the subject; English will use the passive: "I was impressed by Vadym Dovhoruk." Google gets that right; Bing does not. From Ukrainian it simply drops the "me" and gives us "Impressed Vadim Dovgoruk", which turns him into the one being impressed; from the Russian, while it understands the syntax, it doesn't seem to know it's a name, plopping in an article, ignoring the capitalization, and not even transliterating it ("I was struck by the vadim довгорук.").

  7. Вадим Довгорук itself becomes Vadim Dovgoruk / the vadim довгорук / Vadim Dolgoruky. Normally, from Ukrainian this name is Vadym Dovhoruk, while Russians render it as Vadim Dovgoruk - "dovh-" is the same root as "dolg-", and Dolgoruky is certainly a name that exists, but it's not this man's name.

  8. Йому в Дебальцевому гранатою відірвало руку / Ему в Дебальцево гранатой оторвало руку. Again, unsurprisingly, the syntax is the same in both sentences (Russian and Ukrainian are both East Slavic languages), and it defeats both programs. Bing offers us "In Debal′cevomu a grenade shot the arm" / "Him in debaltseve grenade lost his arm" and Google "Debaltseve grenade into his hand blown off" / "Him Debalcevo grenade severed hand". This time it's Google that I'm not sure knows Debaltsevo is a placename, despite the preposition "in" - it looks more like a brand of grenade - while Bing doesn't drop the case ending (-omu) from the Ukrainian. The dative pronoun used instead of a possessive ("to him the arm" instead of "his arm"), which is utterly standard, left both applications floundering in confusion. And both of them fail to render the grenade as the instrument it clearly is (instrumental case!).

  9. Дебальцевому / Дебальцево itself is transliterated as Debal′cevomu / debaltseve / Debaltseve / Debalcevo

  10. Він залишився служити / Он остался служить, which is pretty simple ("he remained serving", or "in service"), comes out as "He left to serve" from the Ukrainian in both programs, and "He was to serve" from the Russian in Google. Bing got that part right from the Russian, but went wildly off the rails with the next clause.

  11. в ВС. This abbreviation, which is very common, floored Bing in both languages. Google got it right ("in the Armed Forces"), but Bing offers "in the US" from Ukrainian (I think because "U" and "V" are frequently interchanged, but "US" isn't "US" in Ukrainian, it's SShA (США)). Where Bing got "in the sun" from the Russian I have no idea.

  12. і пішов вчитися на військового психолога / и пошел учиться на военного психолога caused no end of grief, too. Bing looks at the preposition but ignores the case in Ukrainian and does the reverse in Russian ("at the military psychologist" and "from the military psychologist"); Google looks at the preposition alone in both ("at a military psychologist"). This particular preposition can take either the locative or the accusative, and here it's the latter. That means it's not "at". What we have is a common idiom meaning "to study to be, to train as".

  13. Дивовижно сміливий, життєрадісний хлопець. / Удивительно смелый, жизнерадостный парень. This is a sentence fragment. Leaving it as such (Amazingly daring, cheerful guy. / Surprisingly brave, cheerful guy. / Wonderfully bold, cheerful guy. / Surprisingly bold, cheerful guy.) is a minor error, if an error at all, but I think "a(n)" is called for even if you're not going to put "He's".

  14. Народ, у якого є такі герої / Народ, у которого есть такие герои. For some reason, they both mess up the relative clause (including their comma use) in the Ukrainian: "People who are heroes" from Bing, which ignores the такі, and "People, who are these heroes" from Google, which doesn't ignore it but gets it wrong. It's "such". Also, they both think народ is "people", meaning the plural of "person", instead of "people" meaning "a nation". In Russian they recognize that the у construction is a possessive ("The people, which has such heroes" / "People that has such heroes") and that it's "the people". However, they've both butchered the main clause:

  15. Народ, у якого є такі герої, ніколи не перемогти. / Народ, у которого есть такие герои, никогда не победить. This final sentence is butchered the same way by both of them in the crucial main clause. Народ is an inanimate noun, meaning its nominative and accusative cases look the same. Both programs treat it as the subject of the main verb, but it's not; it's the object. The main verb is an infinitive. Used this way, infinitives have a quasi-modal sense and don't have subjects: the best way to translate them into English is either a passive or a "one can(not)". Bing offers us "People who are heroes, never win" / "The people, which has such heroes, never going to win", and Google serves up "People, who are these heroes never win" / "People that has such heroes, never win". All of those are completely wrong, entirely reversing Saakashvili's statement: "A people that has such heroes can never be beaten".

So, in short, both Bing and Google Translate mangle six of these seven sentences, none of which are particularly difficult, and manage to make Saakashvili assert the direct opposite of what he actually said. If these are two of the five best, machine translation has a long way to go. Here's how it ought to go:
Today I had the honor to present the award "National Hero of Ukraine" to some amazing people. Among them is Mustafa Dzhemilev, the great Tatar and Ukrainian patriot. I was impressed by Vadym Dovhoruk. At Debaltsevo his hand was torn off by a grenade, and then the doctors were forced to amputate both of his feet due to frostbite. He continued serving in the armed forces, studying to become a military psychologist. He's a remarkably brave and cheerful guy. A nation that has such heroes will never be defeated.


2 Comments:

At 2:11 PM, October 18, 2015 Anonymous Kathie had this to say...

Sheesh, the sample isn't even complex or philosophical prose, nor poetry -- just narrative. I doubt computers will catch up to human brain-power any time soon, because there's too much decision-making required in the process of translating expository writing (many programs) and interpreting speech (now Skype, supposedly). Of course, this is beneficial to our line of work :-)

Apps used mainly for translating signs and menus are merely cyber equivalents of old-fashioned pocket-sized phrase books, only faster (but not inherently more accurate than a translating dictionary).

It's so annoying when media proclaim the arrival of reliable translating software -- which they've been doing repeatedly for at least the past decade -- as I suspect they've never seriously tested these programs, nor had experts do so. Sigh...

 
At 3:35 PM, October 18, 2015 Anonymous Mark P had this to say...

I would be amazed (or maybe some day will be amazed) if a computer could translate particularly well. A friend who can speak Spanish sent me an email with a message from his Cuban friend. It was only about two sentences long. I have extremely rudimentary Spanish, and I was not able to make any sense out of it with an English-Spanish dictionary.

 
