tlhIngan-Hol Archive: Wed May 29 17:01:48 2002
Re: lexicographical issue...
- From: willm@cstone.net
- Subject: Re: lexicographical issue...
- Date: Wed, 29 May 2002 21:01:34 GMT
> Hi people,
>
>
> thanks for the reactions... Hopefully I get even more...
reH Dajangqanglu'taH.
..
> To investigate what was lacking in the TKD, and what could,
> theoretically, be compiled out of all the (scattered) canon material,
> was indeed the focus of my paper, even though I'm mainly
> focusing on problems concerning lay-out, typological issues,
> whether to use nesting or not, whether to give examples, and
> idiomatic use, and proverbs in an entry, and things of the like. I
> will have to give examples in my paper, though, and I think I will
> drop some questions on the list every once in a while...
I will be excited to see your work progress.
> I think that in order to take the making or revising of a dictionary
> seriously, one should also try to take the rules that have been
> given seriously. There is always room for exceptions in a
> dictionary of course, since it will fit into the dictionary anyway.
> I'm convinced that making a revised dictionary too voluminous is
> not really a problem for Klingon, since (just being realistic)
> tlhIngan Hol will never acquire as many words as the English
> language has.
While this is true, you may still find yourself spending a good bit of time on
this. By my arbitrary criteria, I count 2,442 entries, Klingon-English only.
Removing what I arbitrarily choose to consider derivatives, I count 2,051
entries. Compared to a natural language, this is a small vocabulary, but it
still takes time to handle this volume well. It's not like a Tolkien language...
> As I had said, my problem concerns nesting here. If I would
> choose for nesting in a Revised Klingon Dictionary (or let's call it
> OKD for fun, to make the parallel with OED) (for those who
> don't know; nesting = to use more than one entry in the paragraph
> of a dictionary), then what kind of compound nouns should, or
> could, I put under one entry (that is, if I would want to use
> nesting). And what about derivatives? Of course, due to the
> extensive use, one should give <<SuvwI'>> as a separate entry,
> and not as a derivate entry under <<Suv>>, but should I give
> <<SuyDuj>> "merchant ship" a separate entry, or should I nest
> this compound noun under <<Suy>> "merchant". That is really the
> problem that is bothering me. The thing I'm looking for is a
> consistent system that would be suitable for the average
> dictionary user...
For my own system, I put a separate entry for each part of speech for each
word, so {SuvwI'} gets an entry less because of its common use than because it
is a noun, while {Suv} is a verb.
Part of this is because I have fit my dictionary into a database format. One of
the fields is "Part of Speech", so I can't have more than one value per entry.
Still, if you choose a format that can handle multiple parts of speech per
entry, I would not see a problem with nesting {SuvwI'} within the entry for
{Suv}, particularly since both words begin with the same letters, so someone
looking up {SuvwI'} would have no trouble finding it. Meanwhile, nesting
{puqbe'} under only {puq} or only {be'} would not work as well. Likely,
{puqbe'} should be redundantly nested within both {puq} and {be'}.
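A minimal sketch of the nesting idea above, in Python. The function name
`nest_entries` and the hand-written component lists are assumptions made up for
illustration; this is not how any existing Klingon dictionary is built. Each
derived word or compound is filed under every headword listed for it, so
{puqbe'} shows up under both {puq} and {be'}.

```python
from collections import defaultdict


def nest_entries(components):
    """Map each headword to the sorted list of words nested under it.

    `components` maps a derived word or compound to the headwords it should
    be filed under. The lists are supplied by hand; this sketch does no
    morphological analysis of its own.
    """
    nested = defaultdict(list)
    for word, heads in components.items():
        for head in heads:
            nested[head].append(word)
    return {head: sorted(words) for head, words in nested.items()}


# Illustrative data only, taken from the examples discussed in this thread.
components = {
    "SuvwI'": ["Suv"],
    "SuyDuj": ["Suy"],
    "puqbe'": ["puq", "be'"],  # redundantly nested under both parts
}

nested = nest_entries(components)
```

Looking up `nested["puq"]` or `nested["be'"]` finds {puqbe'} either way, which
is the point of the redundant nesting.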
Your choice is arbitrary, though consistency is more important than which
choice you make. The choice you make will determine the personality of your
dictionary. The consistency you maintain will determine the quality of your
dictionary. The trick is to balance well the priorities of serving each word
well, vs. serving consistency well.
To give you an idea how others do this, my main dictionary is a database having
the following fields:
Klingon word: [Okrand's spelling]
Definition: [All of Okrand's glosses. Multiple sourced glosses are separated by
slashes matching the sequence of sources in the source field below.]
Part of speech: [adverb, conjunction, exclamation, noun, number, pronoun,
proper name, question or verb]
Grammar note: [adjectival, body part, intrans, plural form, trans, trans &
intrans, uses language]
Source: [Addendum, BoP Poster, CK, HolQeD, KCD, KGT, MSN, other, TKD or TKW]
Reviewed: [Date last checked by me.]
canon: [Examples - not a well populated field]
related words: [Klingon synonyms/antonyms]
pun: [yes/no]
needs attention: [yes/no - I maintain other dictionaries, so this tells me a
word is entered in my main dictionary, but not others yet, or otherwise needs
some sort of confirmation or attention]
derivative: [yes/no - I used this with a filter for the count I supplied above]
New Words List [yes/no - Is this word in the KLI's online New Words List, or
does it need to be? Since I maintain that list, it's good to have something
quick to filter on to check the list.]
Comments: [Mostly explanations of puns, what sort of attention is needed, or
other explanation relating to the yes/no fields above. Any misc. comment.]
Modified: [Date last changed]
Created: [Date first entered]
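For anyone who wants to try a similar layout, here is a rough sketch of a few
of the fields above as a Python record. The class name `Entry`, the attribute
names, and the helper `count_entries` are assumptions for illustration, not the
actual database. The sample rows are illustrative as well, and marking {SuvwI'}
and {SuyDuj} as derivatives is just one arbitrary choice. The sketch shows two
things described in this message: one part of speech per entry, and a yes/no
"derivative" field used as a filter when counting.

```python
from dataclasses import dataclass, field
from typing import List

# The part-of-speech values listed above.
PARTS_OF_SPEECH = {
    "adverb", "conjunction", "exclamation", "noun", "number",
    "pronoun", "proper name", "question", "verb",
}


@dataclass
class Entry:
    klingon: str                  # Okrand's spelling
    definition: str               # glosses, separated by slashes
    part_of_speech: str           # exactly one value per entry
    source: List[str] = field(default_factory=list)
    derivative: bool = False      # yes/no field used for filtered counts
    comments: str = ""

    def __post_init__(self):
        # Reject values outside the fixed list, so entries stay consistent.
        if self.part_of_speech not in PARTS_OF_SPEECH:
            raise ValueError(f"unknown part of speech: {self.part_of_speech!r}")


def count_entries(entries, include_derivatives=True):
    """Count entries, optionally leaving out those flagged as derivatives."""
    return sum(1 for e in entries if include_derivatives or not e.derivative)


# Illustrative rows only; sources are left empty rather than guessed.
entries = [
    Entry("Suv", "fight", "verb"),
    Entry("SuvwI'", "warrior", "noun", derivative=True),
    Entry("Suy", "merchant", "noun"),
    Entry("SuyDuj", "merchant ship", "noun", derivative=True),
]
```

With these four rows, `count_entries(entries)` gives 4 and
`count_entries(entries, include_derivatives=False)` gives 2, the same kind of
pair of counts quoted earlier in this message.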
This approach has made this dictionary a good hub for me in working with the
word list. The advantage of a database is that it encourages a level of
consistency that a simple document does not enforce, and the structure can be
modified at any time to suit future uses. Its current form is quite different
from what it was when I first created it.
Originally, it was an MS Word document with both piqaD and romanized entries,
both English-Klingon and Klingon-English. Okrand has one of those early
versions. Later, it became a series of memo files in my Palm Pilot, to
facilitate lookup. I always have my Visor with me, so I can find things there.
This eliminated the need for an English-Klingon side, since lookup uses search
tools.
Later, I moved it into JFile, a flat-file database on the Palm OS platform.
I also created an Access database on my PC, though it is less current because
it is less accessible (ironically enough).
Similarly, the KLI's online New Words List was originally an HTML document,
hand-edited, mostly by Mark Shoulson and myself. I did the content and he did
much of the tagging, which I then mimicked as I updated the entries. More
recently, he set up an online database. The external appearance is nearly
identical to the original, only now instead of editing a document, I edit
fields in a database so you can view the same list either alphabetically or in
reverse-chronological order by entry (for people who have been keeping their own
word lists and need to check to see if anything has been added since they last
maintained their word lists).
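The two views of the same list can be sketched in a few lines of Python. This
is not the KLI's actual code: the function names are made up, the dates are
invented placeholders, and the "alphabetical" view uses plain string order
rather than true Klingon alphabetical order.

```python
from datetime import date

# Hypothetical rows: (word, date added). The dates are invented placeholders.
rows = [
    ("qa'rol", date(2000, 1, 2)),
    ("notqa'", date(2000, 1, 1)),
    ("SuyDuj", date(2000, 1, 3)),
]


def alphabetical(rows):
    """Sort by word using plain string order (not Klingon alphabet order)."""
    return sorted(rows, key=lambda r: r[0])


def newest_first(rows):
    """Sort by date added, most recent entry first."""
    return sorted(rows, key=lambda r: r[1], reverse=True)


def added_since(rows, last_checked):
    """Entries added after `last_checked`, newest first."""
    return [r for r in newest_first(rows) if r[1] > last_checked]
```

Someone who last updated a personal word list on a given date would call
`added_since(rows, that_date)` to see only what is new.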
Others have used Excel as a flat-file database or have used dedicated
dictionary programs (like ghunchu'wI''s data entry work on a blazingly fast
polylingual dictionary for Palm OS) or they've written their own, like Holtej's
pojwI'. I still prefer his original version to the later one for quick lookup...
Still, whenever I really need to check on a word, I go back to my hub database.
Most importantly, it cites the source for each entry. Sometimes, I have to make
up definitions because, in some sources, Okrand only loosely describes a word
in the middle of a paragraph of English text, so we lack a definition
that neatly fits Okrand's "gloss" style of definition. Citing the source allows
me to go back and check to see if I misinterpreted something.
This was recently useful when someone pointed out that a {qa'rol} is probably
yellow and not black. I had assumed it was black, since it was described as
being a bird larger than a {notqa'}, which was described as a large, black
bird. Looking back at the source, I could tell that I had mistakenly assumed
that both birds were black. I then went back to change the definition, since I
had to make up the gloss, myself, either way.
As many here have explained before, no other single exercise will bring you as
much understanding of the Klingon language as making your own dictionary. So
long as you just look things up in lists others have compiled, you won't know
what words exist as your mind sifts through synonyms to find the word you want.
You would not necessarily know, for example, that there is no word for "ball",
though there is one for "sphere". That makes a big difference, especially with
a vocabulary that has such arbitrary omissions and details as to
include "writer's cramp" and "paper clip", but omit "end" or "tip". There's a
verb for "be drunk", but nothing for "slur" or "hiccup" or "burp" or "cough".
We work around these omissions, but it always takes effort.
> qeyS.
Will