tlhIngan-Hol Archive: Sun Nov 22 09:12:37 2009
Re: The topic marker -'e'
Steven Lytle (firstname.lastname@example.org)
On Sun, Nov 22, 2009 at 12:00 PM, Tracy Canfield <email@example.com> wrote:
> 2009/11/22 Steven Lytle <firstname.lastname@example.org>:
> > It seems that your (or any) MT program should at least attempt to
> > translate even ungrammatical utterances.
> I actually do take a pass at them after marking them as ungrammatical.
> It's still important to distinguish the two - first, because you can
> be much more confident about the intended overall meaning of the
> grammatical ones, and second, because the grammatical ones are a lot
> less ambiguous - you don't have to consider the possibility that a
> noun ending in -vaD or -Daq could be the subject.
> On the current build, if you take a sentence like
> mapum Sor
> which I think we can all agree is awful, you get
> * fall tree
> The * marks it as ungrammatical, but the program makes a try at the
> individual words without trying to establish any relationship between
> them.
> In contrast
> ngemDaq pum Sor
> The tree falls in the forest
> with re-ordering, insertion of appropriate articles and prepositions,
> etc. (Plus a gentle reminder on a different line that there are other
> legitimate parses because "ngem" and "Sor" could be plural.)
> While it might well be worth doing more re-ordering of the
> ungrammatical sentences, it's a lower priority than trying to ensure
> that if a sentence *is* grammatical, the program can handle it.
"mapum" doesn't mean 'fall'. It means "we fall" (or "we accuse"; "pum" is
two different verbs). There is no point in losing information that is given
in the original just because the translation is odd.
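To make the point concrete, here is a rough sketch of a word-by-word fallback that keeps the prefix information instead of dropping it. It is only an illustration and not a description of how Tracy's program works: the tiny lexicon, the names gloss_word and fallback_gloss, and the "fall/accuse" output format are all my own assumptions.

```python
# Hypothetical mini-lexicon for illustration only; a real program
# would consult a full dictionary and the complete prefix table.
VERB_PREFIXES = {"ma": "we"}          # ma-: first-person plural subject, no object
VERBS = {"pum": ["fall", "accuse"]}   # "pum" is two different verbs
NOUNS = {"Sor": "tree", "ngem": "forest"}


def gloss_word(word):
    """Gloss a single word, keeping any verb-prefix information."""
    if word in NOUNS:
        return NOUNS[word]
    if word in VERBS:
        return "/".join(VERBS[word])
    for prefix, subject in VERB_PREFIXES.items():
        stem = word[len(prefix):]
        if word.startswith(prefix) and stem in VERBS:
            # Keep the subject supplied by the prefix and show every
            # reading of an ambiguous stem.
            return f"{subject} {'/'.join(VERBS[stem])}"
    return f"<{word}?>"  # unknown word: pass it through, flagged


def fallback_gloss(sentence):
    """Word-by-word gloss for a sentence already judged ungrammatical."""
    return "* " + " ".join(gloss_word(w) for w in sentence.split())


print(fallback_gloss("mapum Sor"))
```

With this lexicon the output for "mapum Sor" still carries the * for ungrammatical, but the "we" from the prefix and both readings of "pum" survive in the gloss.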
In fact, "mapum Sor" could be interpreted as "We trees fall", although this
use of a noun as subject with a non-third-person prefix is controversial at