tlhIngan-Hol Archive: Tue Mar 13 07:53:28 2007





RE: Klingon Voice Translator for CES 2008

Robert Palmquist ([email protected])



Thanks for your input. There are two prominent approaches to developing
machine translators: rule-based and statistical-based. You are
describing some of the issues in using a rule-based implementation.
There's a debate as to which is better. My opinion is that in general,
if you have the time and resources to commit to the effort (which can be
substantial), a rule-based approach will perform better in terms of
accurate retention of meaning. The advantage of the statistical approach
is that if you have a large database of source-target data, then you can
quickly generate a workable translation engine. In this particular
instance, our plan is to use a statistical-based approach where existing
source (e.g., English) and target (e.g., Klingon) data is used to
generate statistical probabilities that a set of word groupings in the
source is matched to a set of word groupings in the target. A one-gram
grouping would simply be a word-for-word substitution which is basically
useless ("I changed my mind" could become "I modified my brain"). A
five-gram approach, which is about the largest I've heard of anyone
attempting, requires massive computational power to process (Google has
done some crunching in this arena). Depending on the
amount of source-target data we can get, we'd probably use a two-gram
approach. This would produce some meaningful output, but certainly not
perfect. For example, "This food is safe to eat" is statistically close
to "This food is not safe to eat." A rule-based system will ensure that
the meaning of the sentence is retained, whereas a statistical output
may invert meanings. 
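
To make the n-gram point concrete, here is a minimal Python sketch. It is
an illustration only, not the actual translation engine: the function
names and the tiny WORD_TABLE are hypothetical. It shows a one-gram
(word-for-word) substitution producing the "brain" result, and it counts
how many two-word groupings the two "safe to eat" sentences share.

```python
def ngrams(sentence, n):
    """Return the list of n-word groupings in a sentence."""
    words = sentence.lower().rstrip(".").split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def ngram_overlap(a, b, n):
    """Share of distinct n-grams common to both sentences (Jaccard index)."""
    grams_a, grams_b = set(ngrams(a, n)), set(ngrams(b, n))
    if not grams_a and not grams_b:
        return 1.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# Hypothetical one-gram table: each source word maps to a single target word.
WORD_TABLE = {"i": "I", "changed": "modified", "my": "my", "mind": "brain"}


def word_for_word(sentence):
    """One-gram 'translation': substitute each word independently."""
    words = sentence.lower().rstrip(".").split()
    return " ".join(WORD_TABLE.get(w, w) for w in words)


print(word_for_word("I changed my mind"))  # I modified my brain

safe = "This food is safe to eat"
unsafe = "This food is not safe to eat."
print(ngram_overlap(safe, unsafe, 2))  # 4 of 7 distinct two-grams are shared
```

The two sentences share most of their two-grams even though their
meanings are opposite, which is why a purely statistical match can
invert the meaning.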

All of the above is very much a simplification of a heated debate
ongoing in the translation community. There's a battle going on between
the folks that have spent thousands of person-hours putting together
rule-based systems versus the much more recent approach of using
statistics. In reality, the two approaches are usually combined to
generate the output.

So, back to the start. If someone is interested in translating our basic
500 sentences at a rate of 7 cents per word, please let me know and I'll
forward the phrases your direction. That will enable us to put together
a very simplistic system. Likewise, if you can forward us any links or
such to source-target data, we'll run it through our statistical
processor and see what happens. It certainly will not be perfect, but
perhaps it'll produce some decent results.

Thanks,
-- Robert
(www.speechgear.com)


-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of Doq
Sent: Monday, March 12, 2007 3:14 PM
To: [email protected]
Subject: Re: Klingon Voice Translator for CES 2008

One thing you might consider when writing this translator: Klingon
uses suffixes to modify the meanings of words, so the 2,800 or so
words in the language can actually be extended into a far larger
effective vocabulary.

The vocabulary includes some of them, like {ghoj} means "learn" and  
{ghojmoH} means "teach", but in fact, {ghojmoH} is actually {ghoj}
("learn") plus {-moH} (causation), so it literally means "cause to
learn".

This implies also that Klingon sometimes splits words in places  
English doesn't, like {vIH}, which means "move, be in motion". In  
English, "move" is both transitive and intransitive. In Klingon,  
{vIH} is intransitive. To make it transitive, you use {-moH}. So
"I move" ({jIvIH}) is quite different from "I move the chair", which
is {quS vIvIHmoH}.
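
As a rough sketch of what this means for a program, the Python below
splits a verb into prefix, stem and the {-moH} suffix by checking
candidate stems against a lexicon. It is a toy under stated
assumptions: LEXICON holds only the two stems mentioned above, PREFIXES
holds only the two prefixes that appear in these examples, and the
analyze function is a hypothetical name. Real Klingon has many more
prefixes and suffixes than this handles.

```python
# Toy lexicon: only the stems discussed above.
LEXICON = {"ghoj": "learn", "vIH": "move"}

# Only the prefixes seen in the examples above ("" means no prefix).
PREFIXES = ["", "jI", "vI"]

CAUSATIVE = "moH"


def analyze(word):
    """Split a verb into prefix, stem and optional -moH, or return None."""
    for prefix in PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in ("", CAUSATIVE):
            if suffix and not rest.endswith(suffix):
                continue
            stem = rest[:len(rest) - len(suffix)]
            # Checking the lexicon keeps {vIH} from being misread as vI- + "H".
            if stem in LEXICON:
                gloss = LEXICON[stem]
                if suffix:
                    gloss = "cause to " + gloss
                return {"prefix": prefix, "stem": stem,
                        "causative": bool(suffix), "gloss": gloss}
    return None


print(analyze("ghojmoH"))   # stem ghoj, causative, gloss "cause to learn"
print(analyze("jIvIH"))     # prefix jI, stem vIH, not causative
print(analyze("vIvIHmoH"))  # prefix vI, stem vIH, causative
```

Note that {vIH} itself begins with the same letters as the prefix
{vI-}, so even this tiny case needs the lexicon check to avoid a wrong
split.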

As a human translating from English to Klingon, the process I go  
through is to freeze the English and study the meaning behind it and  
then break it out into the grammatical parts and then try to find,  
among the tools of Klingon grammar, the best one to express the  
meaning of the English. Sometimes, that means abandoning mechanical  
formulae altogether.

As an example, Klingon forms questions by beginning with what would  
be the answer and replacing the noun or adverb that is unknown with a  
question word. English does much the same with questions like "Who  
are you?" or "You will arrive when?" Meanwhile, English has the word  
"which" as an adjectival question word. "Which chair do you want?"  
There is no equivalent Klingon question word.

If you want to translate "Which chair do you want?" you can only do  
something like {quSlIj yIwIv.} That means "Choose your chair." You  
can similarly use verbs like {wuq} "decide upon" or {ngu'}  
"identify". You must change the question into a command.

Several linguists with programming experience have attempted to write
translation programs for Klingon. None have succeeded. It is very
tempting, but no one has had the luxury of enough time to fully  
explore the possibility. I think I could, given enough time,  
programmatically translate from Klingon to English, but not the other  
way around. English is too free form with too large a vocabulary and  
too many subtle differences in meaning associated with the very  
flexible word order.

1. Only I hit the baby in the head.
2. I only hit the baby in the head.
3. I hit only the baby in the head.
4. I hit the only baby in the head.
5. I hit the baby only in the head.
6. I hit the baby in only the head.
7. I hit the baby in the only head.
8. I hit the baby in the head only.

5, 6 and 8 have the same meaning. Each of the others is unique in
its meaning. Klingon has no parallels to this. Most word order is
fixed and where it is variable, it does not change the meaning to
vary the word order. Word order can be used to clarify some
ambiguities, but only to resolve cases where a word might otherwise
attach to either the phrase before it or the phrase after it, unless
it is moved away from that boundary.
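
A small Python sketch of why this is hard for a program (the function
name insert_everywhere is just for illustration): the eight sentences
above are what you get by dropping "only" into every possible position,
and all eight contain exactly the same words, so anything that looks
only at which words are present cannot tell them apart.

```python
def insert_everywhere(sentence, word):
    """Return every sentence formed by inserting word at each position."""
    words = sentence.split()
    return [" ".join(words[:i] + [word] + words[i:])
            for i in range(len(words) + 1)]


variants = insert_everywhere("I hit the baby in the head", "only")
for number, variant in enumerate(variants, start=1):
    print(number, variant)

# Every variant has the same multiset of words; only the order differs.
bags = {tuple(sorted(v.lower().split())) for v in variants}
print(len(variants), len(bags))  # 8 1
```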

But I digress...

Doq

On Mar 12, 2007, at 3:05 PM, Robert Palmquist wrote:

> Not sure if I'm stomping on any etiquette rules here -- I apologize  
> if I am. I just wanted to throw this out to the group. My company  
> has developed a suite of products for instant language translation.  
> As part of the upcoming Consumer Electronics Show (CES - January  
> 2008 in Las Vegas), we'd like to include Klingon as one of our  
> supported languages. We'll be doing this in partnership with the  
> Hilton Hotel's "Star Trek - The Experience" (see
> http://www.startrekexp.com/ for more info). We'll have a Press Event at
> the Hilton, plus have the "Experience" personnel demo products in  
> our booth during CES. In terms of copyrights and such, this is  
> meant to be a publicity event versus a product that we'll sell on  
> the market. If we can get copyright permission, we'll give away the  
> software to anyone that's interested, if not, then it'll just be  
> for demonstrations and such. If you're interested in helping out on  
> the project, there are two items we'd need:
>
> (1) We have a database of about 500 phrases that we need  
> translated, plus a voice recording of someone speaking the phrase.  
> All told, it's around 4000 words and we pay 7 cents per word. One  
> person does not have to translate all the phrases, we can split  
> them up. These are general phrases that a tourist would state.
> (2) We'd like to have any source-target data that currently exists:  
> source being English words or phrases, and the target being the  
> Klingon translation. This can be from the Dilbert strips, the  
> Bible, or any other source. We'll use it to create a
> statistical-based engine -- the more data the better the engine will be.
>
> With that, we could come up with a basic English <> Klingon system.  
> If you're interested, please let me know. Again, we're talking  
> about making this happen by January of next year, so there's quite  
> a bit of time to put it all together.
>
>
> Thanks,
> -- Robert
>
> Robert Palmquist
> SpeechGear, Inc.
> 516 West Fifth Street
> Northfield, MN  55057
>
> t.  507-664-9123 ext. 210
> c. 612-232-6666
> f.  775-703-6730
> e. [email protected]
> w.  www.speechgear.com
>
>
>
>
>








