tlhIngan-Hol Archive: Thu Oct 03 14:56:45 1996
Re: An interesting Scrabble idea
- From: [email protected]
- Subject: Re: An interesting Scrabble idea
- Date: Thu, 3 Oct 1996 17:56:13 -0400
[Re letter frequency]
Qob, I doubt that the letter distribution which we used at the qep'a' was
very good. Actually, I had mentioned that the subject had come up before on
the mailing list. I had saved two posts which counted letter distributions,
and they agree on most numbers. I'll repost them here. (Jeremy, note that
your Scrabble letters are NOT in these distributions!)
--------------------
from Matt Whiteacre:
--------------------
I have gone back to the FTP site and collected all the works in the
AESOP, KBTP, and KSRP directories, and run a letter count on them.
I have duplicated the results below. The left side is sorted vaguely
by alpha, while the right is by frequency.
a   27487    a   27487
b    6152    '   25942
ch   5661    o   15185
D    8125    e   15129
e   15129    H   14497
ng   1776    I   13579
gh   7678    j   12043
H   14497    u   11487
I   13579    m    8986
j   12043    v    8826
q    6770    D    8125
l    7389    gh   7678
m    8986    l    7389
n    5424    S    7115
o   15185    q    6770
p    5045    t    6258
Q    3439    b    6152
r    3752    ch   5661
S    7115    n    5424
t    6258    p    5045
u   11487    w    4357
v    8826    y    4202
w    4357    r    3752
tlh  3268    Q    3439
y    4202    tlh  3268
'   25942    ng   1776
There were 239572 characters considered, out of a total file size of
360K. This includes Hamlet and Much Ado About Nothing. If you
compare this distribution to the one I previously used, you find that
5 pairs of letters switched: b/t, l/S, v/m, H/I, and e/o. The
English frequency was based on the Boggle dice themselves, which
matches the distribution in English well. Thus my proposal for a
conversion still stands.
P.S. for those interested there were 330 occurrences of the
combination "rgh" which are included in the table above under both
"r" and "gh".
--------------------
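For anyone who wants to repeat a count like the one above on their own texts, here is a minimal sketch in Python. It is not Matt's program (his method isn't described); the pattern and the function name count_letters are my own assumptions. It treats ch, gh, ng and tlh as single letters, reads the sequence "ngh" as n followed by gh, recognizes only the plain ASCII apostrophe, and assumes the input is Klingon text with no English mixed in.

```python
import re
from collections import Counter

# Multi-character letters are tried before single ones.
# "n(?=gh)" makes the sequence "ngh" read as n + gh rather than ng + h.
# Matching is case-sensitive, so q and Q (and the other capitals) stay distinct.
LETTER = re.compile(r"tlh|ch|gh|n(?=gh)|ng|[abDeHIjlmnopqQrStuvwy']")

def count_letters(text):
    """Count Klingon letters in text; ch, gh, ng and tlh count as one letter each."""
    return Counter(LETTER.findall(text))

counts = count_letters("tlhIngan Hol Dajatlh'a'")
for letter, n in counts.most_common():
    print(letter, n)
```

Characters that are not Klingon letters (spaces, punctuation, stray lowercase h and the like) are simply skipped, so a real corpus would need its English passages removed first.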
--------------------
From Daniel Noll (voqHa'wI'):
--------------------
I just searched through Hamlet myself, and got this distribution from that.
I scaled the most frequent, {a}, to 100 to allow a fair comparison.
I would not at all be surprised if they check out with the other
distributions.
a 100, ' 97, e 57, o 57, H 55, I 51, j 45, u 43, m 37, v 31, D 30, l 29,
q 26, gh 26, S 25, t 23, b 23, ch 20, n 20, p 19, w 17, y 16, r 15, Q 14,
tlh 11, ng 7.
--------------------
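To put the two posts on the same footing, the raw counts in Matt's table can be scaled the same way, with the most frequent letter set to 100. The following is only a sketch: the function name scale_to_100 is my own, and rounding to the nearest whole number is an assumption, since Daniel doesn't say how he rounded.

```python
# Raw counts copied from Matt Whiteacre's table earlier in this post.
counts = {
    "a": 27487, "b": 6152, "ch": 5661, "D": 8125, "e": 15129, "ng": 1776,
    "gh": 7678, "H": 14497, "I": 13579, "j": 12043, "q": 6770, "l": 7389,
    "m": 8986, "n": 5424, "o": 15185, "p": 5045, "Q": 3439, "r": 3752,
    "S": 7115, "t": 6258, "u": 11487, "v": 8826, "w": 4357, "tlh": 3268,
    "y": 4202, "'": 25942,
}

def scale_to_100(counts):
    """Scale counts so that the most frequent letter becomes 100."""
    top = max(counts.values())
    # Rounding to the nearest integer is an assumption, not Daniel's stated method.
    return {letter: round(100 * n / top) for letter, n in counts.items()}

scaled = scale_to_100(counts)
ranked = sorted(scaled.items(), key=lambda item: -item[1])
print(", ".join(f"{letter} {value}" for letter, value in ranked))
```

The counts in the table add up to the stated total of 239572. On this scale the whole-corpus figures give ' 94 and ng 6, against 97 and 7 for Hamlet alone.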
SuStel
Stardate 96758.1