tlhIngan-Hol Archive: Fri May 31 06:13:45 1996

Re: Boggle



> What frequency count was used here? For instance, the first six in the
> English list should be "e, t, a, o, n, i, r, s, h, ...". Similarly, we need
> some "average" tlhIngan ghItlh to work from for this. Maybe counting the
> frequencies in Hamlet would produce a good distribution.

I have gone back to the FTP site and collected all the works in the 
AESOP, KBTP, and KSRP directories, and run a letter count on them.  
The results are below.  The left column is sorted roughly 
alphabetically, while the right column is sorted by frequency.  

a	27487		a	27487
b	6152		'	25942
ch	5661		o	15185
D	8125		e	15129
e	15129		H	14497
ng	1776		I	13579
gh	7678		j	12043
H	14497		u	11487
I	13579		m	8986
j	12043		v	8826
q	6770		D	8125
l	7389		gh	7678
m	8986		l	7389
n	5424		S	7115
o	15185		q	6770
p	5045		t	6258
Q	3439		b	6152
r	3752		ch	5661
S	7115		n	5424
t	6258		p	5045
u	11487		w	4357
v	8826		y	4202
w	4357		r	3752
tlh	3268		Q	3439
y	4202		tlh	3268
'	25942		ng	1776
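
For anyone who wants to repeat a count like this, here is a minimal 
sketch in Python.  It is not the program used for the table above; 
the names LETTERS and count_letters are made up for illustration, and 
it assumes the text uses the standard romanization.  It matches 
"tlh", "ch", "gh", and "ng" before the single letters, so it will 
misread an "n" followed by "gh" as "ng".

```python
from collections import Counter

# Klingon letters in the standard romanization. Multi-character letters
# come first so they are matched before any single-character letter.
LETTERS = ["tlh", "ch", "gh", "ng",
           "a", "b", "D", "e", "H", "I", "j", "l", "m", "n", "o",
           "p", "q", "Q", "r", "S", "t", "u", "v", "w", "y", "'"]

def count_letters(text):
    """Count Klingon letters with a greedy, longest-first match.

    Assumption: "n" + "gh" is not handled specially, so such a
    sequence is counted as "ng" and the trailing "h" is skipped.
    """
    counts = Counter()
    i = 0
    while i < len(text):
        for letter in LETTERS:
            if text.startswith(letter, i):
                counts[letter] += 1
                i += len(letter)
                break
        else:
            i += 1  # not a Klingon letter (space, punctuation, etc.): skip
    return counts
```

Calling count_letters("rgh") counts one "r" and one "gh", and 
count_letters(text).most_common() lists the letters by frequency.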

There were 239572 letters counted, out of a total file size of 
360K.  This includes Hamlet and Much Ado About Nothing.  If you 
compare this distribution to the one I previously used, you find that 
five pairs of letters switched places: b/t, l/S, v/m, H/I, and e/o.  
The English frequency was based on the Boggle dice themselves, which 
match the distribution in English well.  Thus my proposal for a 
conversion still stands.
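
As a cross-check, the total and the frequency ordering can be 
recomputed from the table.  This is only a sketch: the counts are 
copied from the table, and the variable names are arbitrary.

```python
# Letter counts copied from the table in this message.
counts = {
    "a": 27487, "b": 6152, "ch": 5661, "D": 8125, "e": 15129,
    "ng": 1776, "gh": 7678, "H": 14497, "I": 13579, "j": 12043,
    "q": 6770, "l": 7389, "m": 8986, "n": 5424, "o": 15185,
    "p": 5045, "Q": 3439, "r": 3752, "S": 7115, "t": 6258,
    "u": 11487, "v": 8826, "w": 4357, "tlh": 3268, "y": 4202,
    "'": 25942,
}

total = sum(counts.values())  # 239572, the total given in the text

# Letters from most to least frequent (the right-hand column).
by_freq = sorted(counts, key=counts.get, reverse=True)

for letter in by_freq:
    share = 100 * counts[letter] / total
    print(f"{letter}\t{counts[letter]}\t{share:.2f}%")
```

The percentage column is an addition that makes it easier to compare 
this distribution with another one of a different size.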

P.S. For those interested, there were 330 occurrences of the 
combination "rgh", which are included in the table above under both 
"r" and "gh".
Matt Whiteacre
[email protected]

