tlhIngan-Hol Archive: Tue Jun 23 16:09:26 2009

Re: Klingon orthography (was: Okrand at qep'a')

Michael Everson (everson@evertype.com)



On 23 Jun 2009, at 21:51, ghunchu'wI' 'utlh wrote:

>> Can't we do something to improve it? I wonder if this could be  
>> raised with Marc Okrand. With all respect to him, his orthography  
>> has several rather serious shortcomings.
>
> I believe you're pulling together things from very different  
> categories, each with different underlying technical concepts, in  
> order to count "several".

I was not being disingenuous.

>> The first shortcoming is very serious indeed, in terms of data  
>> integrity. Since "q" and "Q" are used as separate letters of the  
>> alphabet, words cannot be distinguished in, for example, google  
>> searches.
>
> That's a shortcoming of Google, not of the underlying data.

I don't believe that this is correct. Case-pairing is a normative  
element of the Unicode Standard, and the Unicode Standard is the basis  
for character encoding on all platforms now and for the future.

> Case-sensitive searching isn't a mythical technology, nor is it even  
> rare.

And I can accomplish it in a number of applications. But not Google.

> If it's an important enough issue, we can ask Google to make it  
> easily available to its users.

The community of Klingonists will not succeed in getting this. Forgive  
me, but you haven't the clout. Long s (ſ) and s used to be  
distinguishable in Google. No longer. (Search "congreſs".)

> But would any of your proposed reforms be any more searchable using  
> Google? I can't type half the characters you're using -- heck, I  
> can't *see* some of them.

You should know perfectly well that typing them is an input issue, and  
seeing them is a font issue. From a CHARACTER point of view, using ANY  
other character to distinguish Klingon Q from q would solve the issue  
of data storage and searching.

> The answer to this complaint: Don't rely on Google for this, because  
> Google doesn't do the right thing.

Google follows the rules, which hold, reasonably for Google's purposes,  
that case-matching (and other FORMAL EQUIVALENCES like long-s) is more  
important for the gigaquads of data they access than making the  
distinction.
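
The kind of folding described here can be sketched in Python. This is only an illustration under stated assumptions: `search_key` is a hypothetical helper, and nothing in it is known to be Google's actual algorithm. It simply applies Unicode compatibility normalization (NFKC) and then case folding, which is enough to show both equivalences at work.

```python
import unicodedata


def search_key(text: str) -> str:
    """Hypothetical index key: fold compatibility variants, then fold case."""
    return unicodedata.normalize("NFKC", text).casefold()


# Long s (U+017F) is a compatibility variant of s, so the two spellings merge.
print(search_key("congre\u017fs") == search_key("congress"))  # True

# q and Q are a case pair, so the two Klingon letters merge as well.
print(search_key("Q") == search_key("q"))  # True

# Any letter outside that case pair, such as x, stays distinct from q.
print(search_key("X") == search_key("q"))  # False
```

Under this sketch, no query can tell the two Klingon letters apart once the index has been built from folded keys.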

> By the way, for someone who apparently cares about capitalization,  
> your lowercase rendition of the name "google" is a bit jarring.

Hey, I'm nobody's enemy for wanting to talk about this. Actually I  
tend to use "google" as a verb. There are other typos in my first note  
of this morning, like "diagraph" for "digraph". Honestly I didn't  
proof it as much as I proofed the Klingon examples. Forgive me.

>> If a casing operation is accidentally applied to a run of Klingon  
>> text (say, upper-casing or lowercasing), the original text cannot  
>> be reconstructed. Okrand had other considerations when he designed  
>> Klingon orthography all those years ago, but now that we manage  
>> Klingon as data, a reform should be considered.
>
> If any nonreversible operation is accidentally applied to any data,  
> the original is lost. This is a consequence of the very nature of a
> nonreversible operation, not a shortcoming of the data itself.

Casing operations ARE reversible, if case-pairing equivalences are  
respected. If I take

nuqDaq 'oH Qe' QaQ'e'

and uppercase it, I get

NUQDAQ 'OH QE' QAQ'E'

and if I lowercase it I get

nuqdaq 'oh qe' qaq'e'

In neither case can I restore the distinction between q and Q. If on  
the other hand I take

nuqDaq 'oH Xe' XaX'e'

and uppercase it, I get

NUQDAQ 'OH XE' XAX'E'

and if I lowercase it I get

nuqdaq 'oh xe' xax'e'

and in both the uppercased and lowercased versions the essential  
letter identity is preserved.
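
The same comparison can be run mechanically. The following is a minimal Python sketch; `trace` is a hypothetical helper written for this illustration. It lists, in order, which of two letters occurs in a string when compared case-insensitively, and prints `?` wherever the two letters can no longer be told apart.

```python
ORIGINAL = "nuqDaq 'oH Qe' QaQ'e'"   # standard orthography: q and Q are separate letters
REFORMED = "nuqDaq 'oH Xe' XaX'e'"   # same sentence with Q written as X


def trace(text: str, first: str, second: str) -> str:
    """Mark each occurrence of two letters as 1 or 2, ignoring case.

    An occurrence that matches both letters is marked '?', meaning its
    identity cannot be recovered from the cased text.
    """
    f, s = first.casefold(), second.casefold()
    marks = []
    for ch in text.casefold():
        if ch == f and ch == s:
            marks.append("?")
        elif ch == f:
            marks.append("1")
        elif ch == s:
            marks.append("2")
    return "".join(marks)


print(ORIGINAL.upper())                      # NUQDAQ 'OH QE' QAQ'E'
print(trace(ORIGINAL.upper(), "q", "Q"))     # ?????
print(trace(REFORMED.upper(), "q", "X"))     # 11222
print(trace(REFORMED.lower(), "q", "X"))     # 11222
```

In the standard spelling every one of the five uvular letters becomes ambiguous after casing. With X in place of Q, the sequence of letters is the same whether the text was uppercased or lowercased.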

>> The second shortcoming is practical. In many fonts, the letters "I"  
>> and "l" are nearly identical. This can impede reading.
>
> This is a shortcoming of the font used, not of the underlying data.  
> The same can be said about the difference between i and j in some  
> fonts, or g and q, or c and e, or even a and o. My own handwriting  
> makes "n" and "u" look similar, often to the point where each looks  
> more like the other than it does itself.

It is still a matter of legibility.

> The answer to this complaint: don't use a font that uses  
> indistinguishable symbols for different letters.

Why should Klingon text be restricted to fewer fonts than other  
languages? Take the word for "yes". Is there really anything wrong  
with writing "Hislah" or "hislah"? Is "HISlaH" really better? OK, I  
can see the difference between I/l fairly easily in the e-mail I am  
writing, because I'm using Everson Mono which distinguishes them  
nicely. But on page 170 of the second edition of the Klingon  
Dictionary the letters are nearly identical, the l being an eensy bit  
taller.

> Or just get used to it, because in practice, it isn't really a  
> problem in actual Klingon text. The CV(C) structure of syllables  
> cuts out any potential ambiguity. It only impedes reading if you  
> can't read Klingon in the first place -- or if you use a sans-serif  
> font to view words that have too many adjacent I's and l's in  
> them. :)

It adds an unnecessary barrier for learners, and everyone is a learner.

> {tlhIllIj lI' lIlI'lI' jIllI' 'Il} "Your sincere neighbor is  
> transmitting all of you your useful mineral."

I really believe that

Tlhillij li' lili'li' jilli' 'il.

or

Åillij li' lili'li' jilli' 'il.

is superior.

>> The third shortcoming is aesthetic. Because it eschews casing in  
>> general, Klingon text cannot take advantage of ordinary typographic
>> conventions, which, in fairness, make any text easier to read.
>
> Feh. I can't agree that having variant forms for the same letter  
> does anything to ease the job of the reader.

Tosh. If you had learnt Klingon with a casing orthography you'd never  
question it, any more than you would question any other language using  
the Latin alphabet. (Lojban's screwed up in this regard as well.)

> I would argue the opposite: having the same shape for a word  
> wherever it appears makes the text easier to read.

that's why you and i write english without using capital letters,  
isn't it?

> If there is no difference between A and a, why require both of them  
> in the same typeface?

This question makes NO SENSE at all. There is a difference between A  
and a. It's part and parcel of a centuries-old

>
>
>> The Klingon alphabet is:
>>
>> a b ch D e gh H I j l m n ng o p q Q r S t tlh u v w y '
>
> Yes, that's the standard Romanized transcription of the sounds of  
> Klingon.

No need to be pedantic. I do know this. In fact I would not say that  
"the standard Romanized transcription" is quite accurate. I would say  
that "the standard Latin orthography" is more apt.

>> ===
>> In IPA this is
>>
>> [a b tʃ ɖ ɛ ɣ x ɪ dʒ l m n ŋ o pʰ q qχ r ʃ tʰ tɬ u v w j ʔ]
>
> No, it's not. I can't type the symbols, but the second character  
> you've chosen in the {ch} sound should definitely not be the same as  
> the {S} sound.

English "chip" is [tʃɪp]. English "ship" is [ʃɪp]. Okrand does  
specify a retroflex [ʂ]; my mistake.

> But if we render the sounds of Klingon using IPA, wouldn't that be  
> enough to address your needs?

IPA would be unambiguous, but it's not the most practical or  
attractive of orthographies.

QaÊtaxvÉÊ xotÊ ÉÉÊ, loÊloÉ XolqÏÉÉ tÊÉnmox TÉÉÅan Xol  
JÉjxaÉ; ÉÉÊmaj qÉÅwÉÊ potÉquÊ Êox. ÆÉtÉmÉj lÉÊ,  
ÉÉtÉmÉj motÉ jÉ ÅaÊ xotÊ jabbÉÊÉÉ, ÊÉj TÉÉÅan  
xolqÏÉÉ, TÉÉÅan xol, TÉÉÅan nuÉ jÉ qÉl. QÉtÊmÉjÊÉÊ  
ÅaÊboÉ nuÅboÉ jabbÉÊÉÉ nuÉmÉx ÊÉj ÉoxmÉx narÉ jÉ  
laÉwÉÊpuÊ jabbÉÊÉÉxommÉj; mavuvtÊuqmÉx ÊÉj  
majaÊtÊuqtÊuÊmÉx narÉ. XolqÏÉÉ nÉv lawÊ, qÏonoÊ motÉ nÉv  
puÊ: xaÉtÊuÊmÉx qÏonoÊ Êox xolqÏÉÉÊÉÊ. ÊOxÉaq narÉpaÊ  
ÉÉtÉ, Êox nuÉtÊuÊ latÉ, ÊÉj ÉÉtÉ ÉÉtÉwÉÊ ÊovbÉÊ.  
XolqÏÉÉ jÉx ÂJÉjquv PaqÉomÂ, ÊÉj Êox boÊ jÉ ÂÆax Xol  
JÉjxaÉÂ.

>> Replacing H q Q with x k q is a handy idea, if diacritics are to be  
>> shunned, though this will change wordforms quite a lot for anyone  
>> used to reading Klingon already.
>
> Wouldn't *any* spelling reform change the wordforms?

Some tread more lightly than others. Please look at the different  
examples I sent. You will surely find some to be "easier" or "more  
attractive" than others.

> If you're still worried about Google searches, you might as well  
> stick with an already-existing convention: keep the H, use X for the  
> {tlh} sound, G for {gh}, C for {ch}, and F for {ng}. The other  
> problem character is the apostrophe, which gets replaced with the  
> last unused letter of the English alphabet: Z.
>
> KASTAHVIS HOC DIS, LOSLOG HOLQED CENMOH XIFAN HOL YEJHAD; DEZMAJ  
> KEFWIZ POXKUZ ZOH...

That's a bit beastly. I take it that's a pre-PUA glyph assignment for  
the Piqad [sic]?
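
The all-capitals convention quoted above can be sketched as a small transliterator in Python. Several things here are assumptions rather than anything stated in the thread: the function name `to_ascii_caps` is hypothetical, the mapping of q to K is inferred from the quoted sample (KASTAHVIS, KEFWIZ) and is not spelled out in the description, and the standard-orthography input is a back-transliteration of that sample.

```python
import re

# Multi-letter units are matched before single characters. "ng" is not
# matched when an "h" follows, so that n + gh is read as two letters.
TOKEN = re.compile(r"tlh|ch|gh|ng(?!h)|.", re.DOTALL)

# X, G, C, F and Z come from the quoted description; q -> K is inferred
# from the quoted sample. Every other letter is simply uppercased.
MAPPING = {"tlh": "X", "gh": "G", "ch": "C", "ng": "F", "'": "Z", "q": "K"}


def to_ascii_caps(text: str) -> str:
    """Rewrite standard-orthography Klingon in the all-capitals convention."""
    return "".join(MAPPING.get(tok, tok.upper()) for tok in TOKEN.findall(text))


sample = ("qaStaHvIS Hoch DIS, loSlogh HolQeD chenmoH tlhIngan Hol yejHaD; "
          "De'maj qengwI' potlhqu' 'oH")
print(to_ascii_caps(sample))
```

Because q becomes K while Q stays Q, the two letters remain distinct in the output even if it is later lowercased.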

> This has the advantage of being expressible using ASCII, Morse,  
> semaphore, Braille, TTY/TDD, typewritten notes, handwritten notes,  
> and basically any communication medium devised that can handle  
> English.

Is the rôle my belovèd façade plays so naïve?

Unicode handles all of the world's writing systems. Standard Latin  
orthography for Klingon is broken vis à vis Q/q, and in my view is  
also disadvantaged in terms of ordinary typography because of its  
unusual use of uppercase letters throughout.

> I want to know what your real goal is, though.

An improved orthography for Klingon.

> Are you concerned with the representation of the characters in data,

Yes. The equivalence of Q/q is a permanent problem for Klingon data.

> with the ability to manipulate the data,

Sure.

> or with the visible appearance of the glyphs?

With the visible and aesthetic appearance of Klingon text, rather.  
Klingon is 24 years old. It is not too late for it to grow up and take  
advantage of modern typography. If we wait another 24 years it will  
certainly be too late, or at least more painful in terms of existing  
data.

> I know you understand that they are completely separate issues, so  
> mixing them in your enumeration of shortcomings seems confusing.

I'm not confused, though. Fixing the Q [qχ] ~ q [q] problem implies  
the ability to case Q/q [q] normally and therefore allows the other  
letters to case normally as well. That leads to a few other questions.

Hopefully, ÇunÄu'wi' or Guncu'wi' or Ghunchu'wi', this would work in  
Klingon's favour. I do think that solving the Q/q problem would  
ultimately help the case for encoding the Piqad.

Michael Everson * http://www.evertype.com/






