tlhIngan-Hol Archive: Wed May 29 17:01:48 2002

Re: lexicographical issue...

> Hi people,
> thanks for the reactions... Hopefully I get even more... 

reH Dajangqanglu'taH.

> To investigate what was lacking in the TKD, and what could, 
> theoretically, be compiled out of all the (scattered) canon material, 
> was indeed the focus of my paper, even though I'm mainly 
> focusing on problems concerning lay-out, typological issues, 
> whether to use nesting or not, whether to give examples, and 
> idiomatic use, and proverbs in an entry, and things of the like. I 
> will have to give examples in my paper, though, and I think I will 
> drop some questions on the list every once in a while...

I will be excited to see your work progress.

> I think that in order to take the making or revising of a dictionary 
> seriously, one should also try to take the rules that have been 
> given seriously. There is always room for exceptions in a 
> dictionary of course, since it will fit into the dictionary anyway. 
> I'm convinced that making a revised dictionary too voluminous is 
> not really a problem for Klingon, since (just being realistic) 
> tlhIngan Hol will never acquire as many words as the English 
> language has.

While this is true, you may still find yourself spending a good bit of time on 
this. By my arbitrary criteria, I count 2,442 entries, Klingon-English only. 
Removing what I arbitrarily choose to consider derivatives, I count 2,051 
entries. Compared to a natural language, this is a small vocabulary, but it 
still takes time to handle this volume well. It's not like a Tolkien language...

> As I had said, my problem concerns nesting here. If I would 
> choose for nesting in a Revised Klingon Dictionary (or let's call it 
> OKD for fun, to make the parallel with OED) (for those who 
> don't know; nesting = to use more than one entry in the paragraph 
> of a dictionary), then what kind of compound nouns should, or 
> could, I put under one entry (that is, if I would want to use 
> nesting). And what about derivatives? Of course, due to the 
> extensive use, one should give <<SuvwI'>> as a separate entry, 
> and not as a derivate entry under <<Suv>>, but should I give 
> <<SuyDuj>> "merchant ship" a separate entry, or should I nest 
> this compound noun under <<Suy>> "merchant". That is really the 
> problem that is bothering me. The thing I'm looking for is a 
> consistent system that would be suitable for the average 
> dictionary user...

For my own system, I put a separate entry for each part of speech for each 
word, so {SuvwI'} gets an entry less because of its common use than because it 
is a noun, while {Suv} is a verb.

Part of this is because I have fit my dictionary into a database format. One of 
the fields is "Part of Speech", so I can't have more than one value per entry.

Still, if you choose a format that can handle multiple parts of speech per 
entry, I would not see a problem with nesting {SuvwI'} within the entry for 
{Suv}, particularly since both words begin with the same letters, so someone 
looking up {SuvwI'} would have no trouble finding it. Meanwhile, nesting 
{puqbe'} under only {puq} or only {be'} would not work as well. Likely, 
{puqbe'} should be redundantly nested within both {puq} and {be'}.
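
To make the redundant nesting concrete, here is a minimal sketch in Python. It is not how any existing dictionary does it; the function name and the component lists are my own assumptions for illustration. Each derived or compound word is listed under every headword it is built from, so {puqbe'} shows up under both {puq} and {be'}.

```python
from collections import defaultdict

def build_nested_index(components):
    """Map each headword to the words that would be nested under it.

    `components` maps a derived or compound word to the headwords it is
    built from. A word with two components is nested under both.
    """
    index = defaultdict(list)
    for word, parts in components.items():
        for part in parts:
            if word not in index[part]:
                index[part].append(word)
    return {head: sorted(words) for head, words in index.items()}

# Hypothetical component analysis, for illustration only.
COMPONENTS = {
    "SuvwI'": ["Suv"],
    "puqbe'": ["puq", "be'"],
}

NESTED = build_nested_index(COMPONENTS)
for head, words in NESTED.items():
    print(head, "->", ", ".join(words))
```

Someone looking under either {puq} or {be'} would then find {puqbe'}, at the cost of storing the nesting twice.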

Your choice is arbitrary, though consistency is more important than which 
choice you make. The choice you make will determine the personality of your 
dictionary. The consistency you maintain will determine the quality of your 
dictionary. The trick is to balance well the priorities of serving each word 
well, vs. serving consistency well.

To give you an idea how others do this, my main dictionary is a database having 
the following fields:

Klingon word: [Okrand's spelling]
Definition: [All of Okrand's glosses. Multiple sourced glosses are separated by 
slashes matching the sequence of sources in the source field below.]
Part of speech: [adverb, conjunction, exclamation, noun, number, pronoun, 
proper name, question or verb]
Grammar note: [adjectival, body part, intrans, plural form, trans, trans & 
intrans, uses language]
Source: [Addendum, BoP Poster, CK, HolQeD, KCD, KGT, MSN, other, TKD or TKW]
Reviewed: [Date last checked by me.]
canon: [Examples - not a well populated field]
related words: [Klingon synonyms/antonyms]
pun: [yes/no]
needs attention: [yes/no - I maintain other dictionaries, so this tells me a 
word is entered in my main dictionary, but not others yet, or otherwise needs 
some sort of confirmation or attention]
derivative: [yes/no - I used this with a filter for the count I supplied above]
New Words List: [yes/no - Is this word in the KLI's online New Words List, or 
does it need to be? Since I maintain that list, it's good to have something 
quick to filter on to check the list.]
Comments: [Mostly explanations of puns, what sort of attention is needed, or 
other explanation relating to the yes/no fields above. Any misc. comment.]
Modified: [Date last changed]
Created: [Date first entered]
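
For anyone who would rather see that as a record type, here is a rough sketch in Python covering a few of the fields above. The class name, field names, sample glosses, source values and derivative flags are all assumptions made for illustration; they are not copied from my actual database. The point is only that one entry holds exactly one part of speech, and that a yes/no derivative field is enough to produce the two kinds of count I gave earlier.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Entry:
    klingon: str                       # Okrand's spelling
    definition: str                    # glosses, slash-separated per source
    part_of_speech: str                # exactly one value per entry
    sources: List[str]                 # e.g. "TKD", "KGT", "HolQeD"
    grammar_note: Optional[str] = None
    derivative: bool = False           # yes/no field used as a filter
    needs_attention: bool = False
    comments: str = ""

def count_entries(entries, include_derivatives=True):
    """Count entries, optionally leaving out those flagged as derivatives."""
    return sum(1 for e in entries if include_derivatives or not e.derivative)

# Placeholder rows; the sources and derivative flags are illustrative guesses.
SAMPLE = [
    Entry("Suv", "fight", "verb", ["TKD"]),
    Entry("SuvwI'", "warrior", "noun", ["TKD"], derivative=True),
    Entry("puq", "child", "noun", ["TKD"]),
]

print(count_entries(SAMPLE))                             # all entries
print(count_entries(SAMPLE, include_derivatives=False))  # derivatives removed
```

Because {Suv} and {SuvwI'} are separate records here, each carries its own single part of speech, which is the constraint I described above.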

This approach has made this dictionary a good hub for me in working with the 
word list. The advantage of a database is that it encourages a level of 
consistency that a simple document does not enforce, and the structure can be 
modified at any time to suit future uses. Its current form is quite different 
from what it was when I first created it.

Originally, it was an MS Word document with both pIqaD and romanized entries, 
both English-Klingon and Klingon-English. Okrand has one of those early 
versions. Later, it became a series of memo files in my Palm Pilot, to 
facilitate lookup. I always have my Visor with me, so I can find things there. 
This eliminated the need for an English-Klingon side, since lookup uses search.

Later, I moved it into JFile, a flat-file database on the Palm OS platform.

I also created an Access database on my PC, though it is less current because 
it is less accessible (ironically enough).

Similarly, the KLI's online New Words List was originally an HTML document, 
hand-edited, mostly by Mark Shoulson and myself. I did the content and he did 
much of the tagging, which I then mimicked as I updated the entries. More 
recently, he set up an online database. The external appearance is nearly 
identical to the original, only now instead of editing a document, I edit 
fields in a database, so you can view the same list either alphabetically or 
in reverse-chronological order by entry (for people who have been keeping their 
own word lists and need to check whether anything has been added since they 
last updated them).
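
As a small sketch of why the database form helps, the two views are just two sorts over the same rows. This is Python, not whatever the online list actually runs on, and the dates below are made up for illustration. The alphabetical view here uses plain string order, which is a simplification; a real list would want Klingon alphabetical order.

```python
from datetime import date

# (word, date added) pairs; the dates are hypothetical placeholders.
NEW_WORDS = [
    ("qa'rol", date(2001, 3, 1)),
    ("notqa'", date(2001, 1, 15)),
    ("SuyDuj", date(2002, 2, 10)),
]

def alphabetical(rows):
    """Sort by word using plain string order (not Klingon collation)."""
    return sorted(rows, key=lambda row: row[0])

def newest_first(rows):
    """Sort in reverse-chronological order by date added."""
    return sorted(rows, key=lambda row: row[1], reverse=True)

def added_since(rows, last_checked):
    """Return the words added after `last_checked`, newest first."""
    return [word for word, added in newest_first(rows) if added > last_checked]

print([word for word, _ in alphabetical(NEW_WORDS)])
print(added_since(NEW_WORDS, date(2001, 2, 1)))
```

The last function is the case described above: someone who last updated their own list on a given date only needs the rows added after it.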

Others have used Excel as a flat-file database or have used dedicated 
dictionary programs (like ghunchu'wI''s data entry work on a blazingly fast 
polylingual dictionary for Palm OS) or they've written their own, like Holtej's 
pojwI'. I still prefer his original version to the later one for quick lookup...

Still, whenever I really need to check on a word, I go back to my hub database. 
Most importantly, it cites the source for each entry. Sometimes, I have to make 
up definitions because some words come from sources where Okrand only loosely 
describes them in the middle of a paragraph of English text, so we lack a 
definition that neatly fits Okrand's "gloss" style. Citing the source allows 
me to go back and check to see if I misinterpreted something.

This was recently useful when someone pointed out that a {qa'rol} is probably 
yellow and not black. I had assumed it was black, since it was described as 
being a bird larger than a {notqa'}, which was described as a large, black 
bird. Looking back at the source, I could tell that I had mistakenly assumed 
that both birds were black. I then went back to change the definition, since I 
had to make up the gloss, myself, either way.

As many here have explained before, no other single exercise will bring you as 
much understanding of the Klingon language as making your own dictionary. So 
long as you just look things up in lists others have compiled, you won't know 
what words exist as your mind sifts through synonyms to find the word you want. 
You would not necessarily know, for example, that there is no word for "ball", 
though there is one for "sphere". That makes a big difference, especially with 
a vocabulary that has such arbitrary omissions and details as to 
include "writer's cramp" and "paper clip", but omit "end" or "tip". There's a 
verb for "be drunk", but nothing for "slur" or "hiccup" or "burp" or "cough".

We work around these omissions, but it always takes effort.

> qeyS.  