tlhIngan-Hol Archive: Tue Aug 25 03:12:15 1998


Re: scrabble



On Sun, 23 Aug 1998 16:33:37 -0700 (PDT) Burt Clawson 
<[email protected]> wrote:

> ja' Qermaq :
> 
> > tuv'el, pach puqloD ghItlh:
> >
> > >Being a database programmer by trade, I did a little analysis:
> > >Ranking of the most common letters in piqad:
> > <pe'>
> >
> > Coincidentally enough, a recent search through the archives turned that up
> > from 4 years ago. Visit http://kli.org/tlhIngan-Hol/1994/Aug94/0007.html and
> > the subsequent postings for at least 2 rankings of letter frequency.
> > Interesting comparison. tuv'el, how large was your sample? Did you repeat
> > words? Or is this what was found in a text?
> 
> The word count in my dictionary is 2690.  

As tempting as it is to use the computer to count these things, 
I've decided that it is completely arbitrary. I'm even trying to 
reduce the number of entries by notating slightly updated 
meanings under the same header, avoiding an opportunity to list 
a repeat as if it were a new word. Example:

ram - (v) "be trivial, trifling, unimportant / insignificant"
[TKD / KGT]

Did you count the ram = "insignificant" entry as a separate 
entry? It becomes arbitrary when a word becomes different enough 
from its homonym to be listed as a separate word while the words 
in question are the same part of speech. I'm also creating a new 
field I'm calling "derivative" in order to get a better sense of 
how many root words there are. By this, I mean {ghojmoH} is not 
counted as a new word. It is a derivative of {ghoj}.
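
A rough sketch of what such a "derivative" field could look like, in Python. The record layout and field names here (Entry, derivative_of, sources) are made up for illustration and are not the actual database on the Pilot; the glosses for {ghoj} and {ghojmoH} are filled in only as sample data.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Entry:
    word: str
    pos: str                              # part of speech, e.g. "v"
    gloss: str                            # all meanings kept under one header
    sources: str                          # e.g. "TKD / KGT"
    derivative_of: Optional[str] = None   # hypothetical "derivative" field

# Sample data only; the layout is an assumption, not the real database.
entries = [
    Entry("ram", "v", "be trivial, trifling, unimportant / insignificant", "TKD / KGT"),
    Entry("ghoj", "v", "learn", "TKD"),
    Entry("ghojmoH", "v", "teach", "TKD", derivative_of="ghoj"),
]

def count_roots(entries):
    """Count only entries that are not marked as derived from another word."""
    return sum(1 for e in entries if e.derivative_of is None)

print(len(entries), "entries,", count_roots(entries), "roots")
```

With updated meanings merged under one header and derivatives flagged, the root count comes out lower than the entry count, which is the point of the field.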

Still, while trying to decrease the count, I'm trying to collect 
the entire vocabulary as it has been presented to us.

The task of figuring frequencies for Scrabble is quite complex 
and just as arbitrary. You should, for example, count each 
prefix once per verb root, since any prefix will be valid to add 
to the beginning of a root... except for those obviously 
intransitive verbs and transitive prefixes...
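
One possible reading of "count each prefix once per verb root", sketched in Python. This is only an illustration under stated assumptions: the letter inventory in the regular expression is my own transcription of the usual romanized alphabet (with {tlh}, {ch}, {gh} and {ng} treated as single letters), the function names are invented, and the sketch ignores the intransitive-verb exception entirely.

```python
import re
from collections import Counter

# Assumed letter inventory; multi-character letters must be tried first.
# "ng(?!h)" keeps n + gh from being misread as ng + h.
LETTER = re.compile(r"tlh|ch|gh|ng(?!h)|[abDeHIjlmnopqQrStuvwy']")

def letters(word):
    """Split a romanized word into letters (unrecognized characters are skipped)."""
    return LETTER.findall(word)

def letter_counts(verb_roots, prefixes):
    """Count each root once, plus every prefix once per verb root."""
    counts = Counter()
    for root in verb_roots:
        counts.update(letters(root))
    prefix_letters = Counter()
    for prefix in prefixes:
        prefix_letters.update(letters(prefix))
    for letter, n in prefix_letters.items():
        counts[letter] += n * len(verb_roots)
    return counts

counts = letter_counts(["ghoj", "legh"], ["jI", "qa"])
print(counts.most_common())
```

Even with only two roots and two prefixes, the prefix letters start to outweigh the root letters, so the choice of weighting changes the ranking a great deal.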

And then, most verb suffixes can go on any verb (except that 
some of them are senseless combinations, like {-chuq} and 
{-'egh} on intransitive verbs, except when combined with {-moH}) 
and you can combine the different suffixes in interesting, but 
not formulaic ways. In other words, any time you try to work out 
a formula, you come up with a lot of combinations that are 
gibberish. The only way to make it truly accurate is to go 
through and test every combination.
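
To show why any formula overgenerates, here is a small Python sketch that enumerates candidate forms and applies just the one rule mentioned above ({-chuq} and {-'egh} on intransitive verbs only when {-moH} is also present). The two suffix slots are a tiny hand-picked subset, the ordering of the slots is an assumption, and the transitivity flag has to be supplied by hand, so treat this as a toy and not a grammar.

```python
from itertools import product

# Tiny illustrative subset of suffix slots; "" means "no suffix in this slot".
SLOTS = [("", "'egh", "chuq"), ("", "moH")]

def plausible(transitive, suffixes):
    """Reject -'egh / -chuq on an intransitive verb unless -moH is present."""
    reflexive = any(s in ("'egh", "chuq") for s in suffixes)
    return transitive or not reflexive or "moH" in suffixes

def candidates(root, transitive):
    """Yield every slot combination that survives the one filter above."""
    for combo in product(*SLOTS):
        chosen = [s for s in combo if s]
        if plausible(transitive, chosen):
            yield root + "".join(chosen)

print(list(candidates("Qong", transitive=False)))
print(list(candidates("legh", transitive=True)))
```

Each extra slot multiplies the number of candidates, and each one that passes the filter still has to be judged by a person for sense.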

I don't think any of us are up for THAT little project. Anyway, 
whatever is done here is arbitrary. The thing is to make 
arbitrary choices and see if the resulting game is interesting.

> That includes all canonical sources I
> can find (i.e. those on the KLI site), all the affixes, and a very few
> extrapolations, like /vebHa'/ "previous."  There are repeats where there are
> synonyms or where the same word is both a noun and a verb.  That's 1181 nouns,
> 969 verbs, 200 proper nouns, 46 idioms, 42 numbers, 40 verb prefixes, 39
> similes, 38 verb suffixes, 37 adverbs, 36 exclamations, 26 noun suffixes, 10
> pronouns, 9 numeric suffixes, 9 question words, and 8 conjunctions.

I'll keep this to compare with my new database on my Pilot. I 
only have a little over 1700 words stuffed into it (up to 
{rI'Se'} so far).
 
> I would think that for scrabble, a list of possible single words is preferable
> for analysis over text, because you are trying to form single words, not
> sentences.  Then again in Klingon, you would have to include all legal
> combinations of affixes.  Hmmm, maybe a selection of text would be better after
> all.  >:-)

Again, all of it is arbitrary. Use selections of text and you 
never get to any of the Qov words (the ones nobody else thinks 
of).
 
> Just for kicks, I came up with the 10 most common syllables from the same
> database:
> 1- wI', 2- Ha', 3- moH, 4- Hom, 5- 'a', 6- be', 7- cha', 8- DI, 9- Duj, 10-
> ngan.
 
Meanwhile, in a scrabble game, you can add {-mo'} to almost 
anything. All nouns (including proper names) and all verbs can 
take it, even after a pile of suffixes. You won't find that in 
any word list or text, but you'll find it in any scrabble game 
if a person has those three characters in their letters.
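
The {-mo'} case is easy to sketch in Python. The rack representation (a list of letter tiles), the lexicon set, and the function names are all assumptions for illustration; nothing here checks whether the word on the board already ends in a suffix that would block {-mo'}.

```python
from collections import Counter

MO_TILES = ["m", "o", "'"]   # the three tiles needed for -mo'

def can_add_mo(rack):
    """True if the rack holds at least one each of m, o and '."""
    have = Counter(rack)
    return all(have[tile] >= 1 for tile in MO_TILES)

def extend_with_mo(board_word, rack, lexicon):
    """Return the extended word if the play looks possible, else None."""
    if board_word in lexicon and can_add_mo(rack):
        return board_word + "mo'"
    return None

lexicon = {"Duj"}   # hypothetical one-word lexicon
print(extend_with_mo("Duj", ["m", "o", "'", "a"], lexicon))
print(extend_with_mo("Duj", ["m", "o", "a"], lexicon))
```

A letter count built from a word list or a text never sees these plays, which is why the demand for those three tiles would be undercounted.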
 
> - tuv'el

charghwI'



