tlhIngan-Hol Archive: Sun Nov 22 09:01:47 2009

Re: The topic marker -'e'

Tracy Canfield ([email protected])

From: Tracy Canfield <[email protected]>
Subject: Re: The topic marker -'e'
Date: Sun, 22 Nov 2009 12:00:04 -0500

2009/11/22 Steven Lytle <[email protected]>:
> It seems that your (or any) MT program should at least attempt to translate
> even ungrammatical utterances.

I actually do take a pass at them after marking them as ungrammatical.

It's still important to distinguish the two - first, because you can
be much more confident about the intended overall meaning of the
grammatical ones, and second, because the grammatical ones are a lot
less ambiguous - you don't have to consider the possibility that a
noun ending in -vaD or -Daq could be the subject.

On the current build, if you take a sentence like

mapum Sor

which I think we can all agree is awful, you get

* fall tree

The * marks it as ungrammatical, but the program makes a try at the
individual words without trying to establish any relationship between
them.

In contrast

ngemDaq pum Sor

returns

The tree falls in the forest

with re-ordering, insertion of appropriate articles and prepositions,
etc.  (Plus a gentle reminder on a different line that there are other
legitimate parses because "ngem" and "Sor" could be plural.)

While it might well be worth doing more re-ordering of the
ungrammatical sentences, it's a lower priority than trying to ensure
that if a sentence *is* grammatical, the program can handle it.
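
To make the two behaviours concrete, here is a minimal Python sketch of
the idea. It is not the actual program: the names (NOUNS, VERBS,
analyze, parse, translate) are hypothetical, the lexicon covers only
the words in the two examples, and the only sentence pattern it
recognizes is [noun-Daq] verb noun. It also assumes that "mapum Sor"
fails because ma- ("we") clashes with an explicit noun subject, and it
does not model the reminder about plural readings.

```python
# Toy lexicon covering only the words in the two examples (an assumption).
NOUNS = {"Sor": "tree", "ngem": "forest"}
VERBS = {"pum": "fall"}
VERB_PREFIXES = ("ma",)          # ma-: "we" as subject, no object
NOUN_SUFFIXES = {"Daq": "in"}    # -Daq: locative


def analyze(word):
    """Split a word into (prefix, stem, suffix) by naive affix stripping."""
    for prefix in VERB_PREFIXES:
        if word.startswith(prefix) and word[len(prefix):] in VERBS:
            return prefix, word[len(prefix):], None
    for suffix in NOUN_SUFFIXES:
        if word.endswith(suffix) and word[:-len(suffix)] in NOUNS:
            return None, word[:-len(suffix)], suffix
    return None, word, None


def gloss(stem):
    """Look up an English gloss; unknown stems pass through unchanged."""
    return NOUNS.get(stem) or VERBS.get(stem) or stem


def parse(words):
    """Return (locative, verb, subject) stems if the sentence fits the one
    pattern this sketch knows, [noun-Daq] verb noun; otherwise None."""
    words = list(words)
    locative = None
    if words and words[0][2] == "Daq":
        locative = words.pop(0)[1]
    if len(words) != 2:
        return None
    (v_prefix, verb, _), (_, subject, s_suffix) = words
    if verb not in VERBS or v_prefix is not None:
        return None  # ma- ("we") clashes with an explicit noun subject
    if subject not in NOUNS or s_suffix is not None:
        return None  # a noun in -Daq is never considered as the subject
    return locative, verb, subject


def translate(sentence):
    words = [analyze(w) for w in sentence.split()]
    parsed = parse(words)
    if parsed is None:
        # Ungrammatical: flag with * and gloss the stems in source order,
        # without trying to relate the words to each other.
        return "* " + " ".join(gloss(stem) for _, stem, _ in words)
    locative, verb, subject = parsed
    # Grammatical: re-order to subject-verb, add articles and a preposition.
    out = f"The {gloss(subject)} {gloss(verb)}s"
    if locative:
        out += f" {NOUN_SUFFIXES['Daq']} the {gloss(locative)}"
    return out


print(translate("mapum Sor"))
print(translate("ngemDaq pum Sor"))
```

The point of the split is that the fallback path never guesses at
structure, so everything it produces stays behind the * marker, while
the grammatical path can safely rule out a -Daq noun as the subject.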
