tlhIngan-Hol Archive: Sun Nov 22 09:01:47 2009


Re: The topic marker -'e'

Tracy Canfield (toastrix@gmail.com)



2009/11/22 Steven Lytle <lytlesw@gmail.com>:
> It seems that your (or any) MT program should at least attempt to translate
> even ungrammatical utterances.

I actually do take a pass at them after marking them as ungrammatical.

It's still important to distinguish the two - first, because you can
be much more confident about the intended overall meaning of the
grammatical ones, and second, because the grammatical ones are a lot
less ambiguous - you don't have to consider the possibility that a
noun ending in -vaD or -Daq could be the subject.
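
To make the second point concrete, here is a minimal sketch of how the
set of possible subjects shrinks once a sentence is known to be
grammatical. The function name and the list-of-strings input are
assumptions made for illustration; this is not the program's actual
code.

```python
# Hypothetical helper, for illustration only.
# -vaD (beneficiary) and -Daq (locative) mark nouns that cannot be the
# subject of a grammatical sentence.
NON_SUBJECT_SUFFIXES = ("vaD", "Daq")

def subject_candidates(nouns, grammatical):
    """Return the nouns that still have to be considered as the subject."""
    if not grammatical:
        # In an ungrammatical sentence nothing can be ruled out.
        return list(nouns)
    return [n for n in nouns if not n.endswith(NON_SUBJECT_SUFFIXES)]

print(subject_candidates(["ngemDaq", "Sor"], grammatical=True))
print(subject_candidates(["ngemDaq", "Sor"], grammatical=False))
```

With the grammatical flag set, only "Sor" is left as a candidate;
without it, both nouns stay in play.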

On the current build, if you take a sentence like

mapum Sor

which I think we can all agree is awful, you get

* fall tree

The * marks it as ungrammatical, but the program makes a try at the
individual words without trying to establish any relationship between
them.

In contrast

ngemDaq pum Sor

returns

The tree falls in the forest

with re-ordering, insertion of appropriate articles and prepositions,
etc.  (Plus a gentle reminder on a different line that there are other
legitimate parses because "ngem" and "Sor" could be plural.)
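
The two behaviours above can be sketched roughly as follows. This is a
toy under heavy assumptions, not the real program: the three-word
lexicon, the single sentence pattern ([noun-Daq] verb noun) and every
function name here are made up for illustration, and the English
generation just appends "s" to the verb.

```python
# Toy lexicon and affix lists; all hypothetical, for illustration only.
NOUNS = {"Sor": "tree", "ngem": "forest"}
VERBS = {"pum": "fall"}
VERB_PREFIXES = ("ma",)   # ma- = "we" (no object)
LOCATIVE = "Daq"          # noun suffix: "in/at"

def parse(sentence):
    """Match the one pattern this toy knows: [noun-Daq] verb noun.

    Returns (locative_noun_or_None, verb, subject), or None if the
    sentence does not fit the pattern.
    """
    words = sentence.split()
    locative = None
    if len(words) == 3 and words[0].endswith(LOCATIVE) \
            and words[0][:-len(LOCATIVE)] in NOUNS:
        locative = words[0][:-len(LOCATIVE)]
        words = words[1:]
    if len(words) != 2:
        return None
    verb, subject = words
    if verb not in VERBS or subject not in NOUNS:
        return None
    return locative, verb, subject

def gloss(word):
    """Word-by-word fallback: strip known affixes and look up the stem."""
    stem = word
    if stem.endswith(LOCATIVE):
        stem = stem[:-len(LOCATIVE)]
    for prefix in VERB_PREFIXES:
        if stem.startswith(prefix) and stem[len(prefix):] in VERBS:
            stem = stem[len(prefix):]
    return NOUNS.get(stem) or VERBS.get(stem) or word

def translate(sentence):
    parsed = parse(sentence)
    if parsed is None:
        # Ungrammatical: mark with * and gloss each word in place,
        # without relating the words to each other.
        return "* " + " ".join(gloss(w) for w in sentence.split())
    locative, verb, subject = parsed
    # Grammatical: re-order to subject-verb and insert articles and
    # a preposition.
    out = f"The {NOUNS[subject]} {VERBS[verb]}s"
    if locative:
        out += f" in the {NOUNS[locative]}"
    return out

print(translate("mapum Sor"))        # * fall tree
print(translate("ngemDaq pum Sor"))  # The tree falls in the forest
```

The toy rejects "mapum Sor" only because "mapum" is not a bare verb in
its one pattern; a real parser would need an explicit check that the
verb prefix agrees with the subject. It also ignores the plural
readings of "ngem" and "Sor".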

While it might well be worth doing more re-ordering of the
ungrammatical sentences, it's a lower priority than trying to ensure
that if a sentence *is* grammatical, the program can handle it.





