tlhIngan-Hol Archive: Fri Jan 04 21:33:41 2013


Re: [Tlhingan-hol] Fwd: RE: Klingon Scrabble

Robyn Stewart ([email protected]) [KLI Member] [Hol po'wI']



I followed Felix's advice: I deleted all occurrences of the most common character names, then deleted all X, Z and F, and then redid the search. It's still a little skewed towards the letters that are the same in xifan as in tlhIngan--I didn't strip out the pIqaD titles, so for example m is M in xifan but y is y, meaning that yay will be counted where it occurs in an 'ay' per but may won't be.

qaghwI' is still the second most frequent letter, so a good Scrabble set should have as many qaghwI'mey as 'atmey, or at least as many as 'etmey.

Freq.   Letter
72543   (space)
39022   a
38237   '
21465   e
20702   o
19952   I
19760   H
18060   u
15987   j
13557   D
12305   m
11565   l
9979    v
9892    q
9647    ch - this was the frequency for c, with h at 9633, the discrepancy explained by a few non-Klingon words with c or h
9530    b
9359    S
9271    Z --> gh
9145    p
8695    t
8105    n
7648    y
7422    X --> tlh
6611    w
5793    Q
4978    r
3744    F --> ng
-------
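
One way to turn the counts above into a tile distribution is to hand out tiles in proportion to frequency. The sketch below is only an illustration: the 100-tile total (borrowed from the English set, ignoring blanks), the largest-remainder rounding and the name allocate_tiles are all assumptions, not anything settled on this list.

```python
from math import floor

# Letter counts copied from the table above (space and punctuation left out).
COUNTS = {
    "a": 39022, "'": 38237, "e": 21465, "o": 20702, "I": 19952, "H": 19760,
    "u": 18060, "j": 15987, "D": 13557, "m": 12305, "l": 11565, "v": 9979,
    "q": 9892, "ch": 9647, "b": 9530, "S": 9359, "gh": 9271, "p": 9145,
    "t": 8695, "n": 8105, "y": 7648, "tlh": 7422, "w": 6611, "Q": 5793,
    "r": 4978, "ng": 3744,
}

def allocate_tiles(counts, total_tiles=100):
    """Share out total_tiles in proportion to counts (largest-remainder rounding)."""
    grand_total = sum(counts.values())
    quotas = {k: v * total_tiles / grand_total for k, v in counts.items()}
    tiles = {k: floor(q) for k, q in quotas.items()}
    # Rounding down leaves a few tiles over; give them to the largest remainders.
    leftover = total_tiles - sum(tiles.values())
    by_remainder = sorted(quotas, key=lambda k: quotas[k] - tiles[k], reverse=True)
    for k in by_remainder[:leftover]:
        tiles[k] += 1
    return tiles

for letter, n in sorted(allocate_tiles(COUNTS).items(), key=lambda kv: -kv[1]):
    print(f"{n:3d}  {letter}")
```

Point values are a separate question; this only covers how many of each tile to make.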
Here are the other symbols and non-Klingon letters. (I moved the mevwI' and yevwI' down to here). They are uninteresting, but just give you an idea of what other junk is in the file. I honestly think most of it is from xifan (the way I have to write in order to use my pIqaD font). There is a more interesting list following this one.

14365   .
5067    ,
1122    ?
691     !
171     -
127     O
126     @
121     A
94      U
80      E
69      &
66      M
63      L
62      T
62      :
59      G
57      #
57      $
57      %
55      ^
54      N
51      (
51      )
47      s
47      *
42      J
41      P
40      C
39      K
39      B
38      g
35      k
30      Y
30      R
29      W
27      i
26      V
18      d
6       "
6       ;
5       1
5       2
4       5
3       9
3       0
3       3
3       /
3       6
2       8
2       7
1       4
1       [
1       ]
1
1       ©
3152    «
4       ­
3129    »

And now I have a theory why j is the third most common consonant after ' and H. There's a lot of dialogue in the story, so a lot of jatlh, but that didn't do it, because tlh is one of the least common letters. It's all the first person statements: jIlegh and jIjaH and jIyaj. Here are the top words used:

Unique words:22058  Total words:71912
Freq.   Word
2152    jatlh
1751    .
1736    'ej
1265    'e'
1118    ,
1089    'ach
513     je
429     HoD.  - three of my major characters are captains.
410     jatlh
394     tlhIngan - interesting. I really wasn't aware I used the word that much. I guess the aliens do.
341     HoD
295     'a
277     je.
273     Duj
265     ?
263     ghIq - pretty funny considering the word didn't exist a few years ago.
258     qImyal - a character name I didn't take out
254     DaH
210     chaq
204     yaS
201     wa'
185     latlh
185     Hoch
183     law'
180     vay'
177     neH
169     cha'
158     qaStaHvIS
156     ghaH
150     jatlh.
147     HoD,
147     loQ
143     wej
139     Hol
137     pa'
135     SIbI'
135     lojmIt
133     pagh
133     ghaytan
132     De'
127     neH.
126     Hung
122     vaD - from vajarvaD and 'eSSImvaD, but I deleted the character names stranding the suffix.
115     vaj
110     meH
105     qama'
103     QumwI'
101     net - interesting!  Never thought I used that that much.
100     be'
95      'avwI'
94      Sa'
94      nov
93      pay'
93      wo'
93      quv - although the character counting is case sensitive, the word-counting isn't, so this is Quv and quv
93      tIr - ha ha, if you've read the story you know why this one, love ya QeS.
93      ghop
91      mIch
90      qatlh
89      Qel/qel
87      not
85      'ungya - oops, another character name
85      qImyal.
83      ravDaq - okay that's just funny. A lot of things take place on the floor in this novel.
81      Qa'bar - another character, didn't think he was that important
80      nom
80      loD
80      bIQ - that does make sense
79      beq
78      HoS
78      chu'
78      'oH
78      ghu'
77      -
77      wa'DIch
77      mu'mey
75      tlhoS
75      nuq/nuQ
75      ghaH.
74      reH
74      naDev
72      Qa'bar.
70      Qu'/qu'
70      nuH
69      'oy' - ha! Did I really hurt people that much?
69      Sa'.
68      Dugh - the name of one of the ships.
68      HIq - pretty hard drinking Klingons, considering that the aliens who take up half the story hardly touch the stuff
67      vay'
67      loS
67      muD
67      bom
67      qab/Qab - yep I shoehorned Qab in somewhere
67      nISwI' - lots of those
66      retlhDaq - wow, describing locations = more common than I thought
66      DeghwI'
66      DIr
66      ...
64      jang
64      jagh
63      Hagh
63      taj
62      QeDpIn.
62      qImyal,
62      DeS
62      lo'
61      SaH
61      «
60      meHDaq
60      vagh - these numbers might be from chapter headings
60      tugh
59      veS - another ship name
58      chay'
58      'op
58      puS
57      Qel.
57      'Iw  - heh, it turned up later than Hagh
56      motlh
56      DIvI'
55      «DaH
54      ghogh
54      nach
54      may'
54      DaHjaj
54      pe'vIl
54      Sov
53      be'tlhar - another name
53      ben
53      Soj
52      batlh
52      pIj
52      Daq/DaQ
52      potlh
52      jonwI'
52      'oH.
52      Hegh  - this is such a Klingon list
51      nID
51      nuv
51      QeDpIn
51      ghaHvaD
50      naQ
50      HungpIn
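
Several entries in the list above are the same word with punctuation attached (HoD, HoD. and HoD, are counted separately), and the case-insensitive count merges pairs like Qel/qel. A rough sketch of a word counter that avoids both follows; the punctuation set and the name word_counts are assumptions for illustration, and this is not the tool that produced the list.

```python
from collections import Counter

# Punctuation seen in the lists above. The apostrophe is deliberately absent,
# because qaghwI' is a letter and must stay attached to the word.
PUNCTUATION = ".,?!:;\"«»()[]-"

def word_counts(text):
    """Case-sensitive word frequencies with surrounding punctuation stripped."""
    words = (token.strip(PUNCTUATION) for token in text.split())
    # Tokens that were nothing but punctuation (".", "...", "-") become empty.
    return Counter(word for word in words if word)

sample = "«DaH HoD, jatlh HoD. Qel qel» ..."
for word, n in word_counts(sample).most_common():
    print(n, word)
```

Keeping case matters here because q and Q are different letters, so Qel and qel are different words.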

Most of the words are adverbs or conjunctions because they are the only words that don't take affixes. Nouns and verbs so often have affixes on them that they don't have high frequency in any one configuration. Here's a chunk of lower frequency, all the same verb, and of course not including the versions starting with ma-, vI-, mu-, etcetera.

5       FAZ
1       FAZBE'.
1       FAZBE'PU'
1       FAZCHUZ
1       FAZDI'
1       FAZLAH
2       FAZMEH
1       FAZPU'WI'PU'
1       FAZPU'?
1       FAZQA'LI'.
1       FAZQAF.
1       FAZQAFBE'WI',
1       FAZQO'.
1       FAZRUP
1       FAZRUPDI'
1       FAZRUPQU'CHOH
1       FAZRUPWI'PU'.
1       FAZTA'BOZ
1       FAZTA'MO'
1       FAZTAH
1       FAZTAHVIS
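
To see how often a verb turns up across all its suffixed forms, the rows can be summed by stem. This is a minimal sketch using a few of the rows above plus one unrelated word; total_for_stem is a made-up helper, and simple prefix matching will also catch any unrelated word that happens to start with the same letters, while missing forms that carry a verb prefix.

```python
# A few rows from the list above, plus one unrelated word for contrast.
WORD_COUNTS = {
    "FAZ": 5,
    "FAZBE'.": 1,
    "FAZMEH": 2,
    "FAZTAHVIS": 1,
    "HOD": 341,
}

def total_for_stem(word_counts, stem):
    """Sum the counts of every listed word that begins with stem."""
    return sum(n for word, n in word_counts.items() if word.startswith(stem))

print(total_for_stem(WORD_COUNTS, "FAZ"))
```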

If someone wants more data I can work more on this.

- Qov

At 17:36 1/4/2013, you wrote:
Kind of a rookie solution, but what you could do to check the frequencies of ng/gh/tlh is to make a copy of your document and then do a Find & Replace thusly:

tlh -> X
gh -> Z
ng -> F

...or some other letters that aren't much used already (do a search first so you can subtract pre-existing ones from the total). It's important to do gh before ng, because otherwise ngh will become Fh, rather than nZ.

If you'd like, you can also do something similar with alien words/names like Mahoun that you don't want altering the results.
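
The Find & Replace steps above can be sketched in a few lines. The placeholder letters X, Z and F are the ones from this message; the function names and sample strings are only illustrative assumptions.

```python
# Order matters: gh has to be replaced before ng, so that "ngh" becomes "nZ".
REPLACEMENTS = [("tlh", "X"), ("gh", "Z"), ("ng", "F")]

def encode_letters(text, replacements=REPLACEMENTS):
    """Replace each multi-character Klingon letter with one placeholder letter."""
    for old, new in replacements:
        text = text.replace(old, new)
    return text

def preexisting(text, placeholders="XZF"):
    """Count placeholders already present, to subtract from the totals later."""
    return {p: text.count(p) for p in placeholders}

print(preexisting("Xavier tlhIngan"))   # run this on the untouched copy first
print(encode_letters("tlhIngan"))       # XIFan
print(encode_letters("nenghep"))        # nenZep
```

Swapping the last two entries of REPLACEMENTS turns the second example into neFhep, which is the mistake the ordering is meant to avoid.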
________________________________________
From: Robyn Stewart [[email protected]]
Sent: Saturday, January 05, 2013 02:18
To: [email protected]
Subject: Re: [Tlhingan-hol] Fwd: RE: Klingon Scrabble

I can, but I don't have a platform in which I can write a clever
script, so this counts each character for itself not as part of its
Klingon letter. Here's what I get, with my comments.

72543       - That's the space character, what you'd expect for a 75k
word novel.
44671   a  - Our existing distribution gets that right. I wonder if
this is biased by character names. The main character is named vajar.
I'll do a version stripped of character and ship names once I have a
better system.
40793   '    - I told you there weren't enough qaghwI'mey in the
game. It beats out all but one vowel!
28488   h   - This combines the letter's presence in tlh and gh, but
excludes H.
23699   o
22652   e
21140   I
20469   H - I expected this to be more common in text than in the
dictionary, because it's in -taH and -Ha' and -laH and -moH and -meH ...
20213   u  - last of the vowels
19024   l   - biased because this includes l and tlh
17380   t   - biased by t + tlh
17291   j - interesting. One of the ship names has a j and so does
the main character's name. That might be a factor. But the main
character also has a v and an r, so I don't think so.
14365   .  - Heh. Short sentences, eh?
13634   g - A combination of gh + ng
13627   m
13557   D
13455   n - includes n and ng
11737   S
11226   v
9892    q
9647    c
9530    b
9145    p
7685    y
7653    r
6611    w
5793    Q

So it looks like yay ray way and Qay should be the high-scoring
letters.  Whoda thunk there were over three times as many Haymey as Qaymey.
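
The per-character counts above split tlh, gh, ng and ch into their component characters. A script along the following lines would count whole Klingon letters instead. It is a sketch under assumptions, not what was used for these numbers: it expects plain romanized text with a straight apostrophe, and the name letter_counts is made up.

```python
import re
from collections import Counter

# Multi-character letters are tried first. ng is skipped when an h follows,
# so that "ngh" is read as n + gh rather than ng + h.
LETTER = re.compile(r"tlh|ch|gh|ng(?!h)|[abDeHIjlmnopqQrStuvwy']")

def letter_counts(text):
    """Count Klingon letters, treating tlh, ch, gh and ng as single letters."""
    return Counter(LETTER.findall(text))

for letter, n in letter_counts("tlhIngan Hol vIjatlh").most_common():
    print(n, letter)
```

Anything that is not a Klingon letter (spaces, punctuation, a capital M) is simply skipped, but the Klingon-looking letters inside alien names would still be counted, so names still need stripping beforehand.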

As an indication of the cleanliness of the data, here's the rest.

5067    ,
1702    M - two alien characters, one of whom is a main character,
have names starting in M. The names of alien ships and persons are
also the explanation for most of the non-Klingon alphabetic characters below.
1122    ?
691     !
374     s
172     T
171     -
141     i
132     O
126     A
126     @ - The pIqaD 'ay' titles are typed in xifan hol, which
renders the numbers as cartoon swear words.
120     x
94      U
81      E
76      F
70      R
69      &
63      L
62      :
59      G
57      #
57      $
57      %
55      ^
54      N
51      )
51      (
47      *
43      J
41      P
40      C
39      K
39      B
35      k
30      Y
29      W
27      V
24      X
18      d
16      f

At 22:52 1/2/2013, you wrote:
>Robyn,
>     Could you analyze your own writings? I bet that would give a good
>letter frequency representation.
>
>Tim Stoffel
>
>--
>
>On Tue, 2013-01-01 at 14:09 -0800, Robyn Stewart wrote:
> > That's an interesting question. Is the letter frequency distribution
> > of a large piece of text different than the frequency distribution in
> > a complete wordlist of that language?  I think a list compiled just
> > from TKD affix and vocabulary lists might comparatively
> > under-represent qaghwI', as it's in so many affixes.
> >
> > I found a shortage of qaghwI'mey during game play, but the
> > artificiality of the arbitrarily high scores for tlh and ng didn't
> > bother me much. It was just a luck thing.
> >
> > - Qov
> >
> > At 13:21 1/1/2013, Felix Malmenbeck wrote:
> > > At the risk of showcasing my ignorance with regards to Scrabble:
> > >
> > > Does one actually need a corpus to decide character values for
> > > Scrabble? I imagine that a lexicon along with the rules for
> > > appending affixes would suffice, as the deciding factor is what
> > > words can be formed, rather than what words are most commonly used
> > > (or do rare/difficult words weigh more heavily in that
> > > calculation?).
> > >
> > >
> > > ____________________________________________________________________
> > > From: David Holt [[email protected]]
> > > Sent: Tuesday, January 01, 2013 22:14
> > > To: tlhIngan Hol mailing list
> > > Subject: Re: [Tlhingan-hol] Fwd: RE: Klingon Scrabble
> > >
> > > > On Mon, Apr 14, 2008 at 11:57 PM, Alan Anderson
> > > <[email protected]> wrote:
> > > > > I got it from DloraH, who got it from janSIy, who I believe
> > > originated it.
> > >
> > > I didn't originate it, but I may have been the first one to bring a
> > > converted set to the qep'a'.  I got the frequencies and values off
> > > this very list and I no longer remember who did the calculations or
> > > came up with the values.  It was probably 15 years ago.  The game is
> > > fun, but the scores are somewhat artificial since the point values
> > > were based on rarity of English letters and so it's weird to have
> > > common letters like <tlh> be worth so many points.  I think any new
> > > calculations should be based on Qov's <nuq bop bom>, since that is a
> > > large piece of original tlhIngan Hol writing.
> > >
> > > janSIy
> > > _______________________________________________
> > > Tlhingan-hol mailing list
> > > [email protected]
> > > http://stodi.digitalkingdom.org/mailman/listinfo/tlhingan-hol