Phonetics or morphology: sound or meaning?
The ideal which is frequently held up for spelling systems is one sound, one symbol. However, there are some problems with that.
The same word can be pronounced different ways in different contexts. Do we really want our writing system to reflect a variation that most of us are not even aware of?
Take, for example, the plural. Not anything complicated or irregular - just the normal "add s" rule. Say out loud "dogs and cats". Do they both end in the same sound? You should have heard <dogs> ending in [z], and <cats> ending in [s]. If you're not sure, lengthen the hiss, and touch your throat lightly as you say it - you should feel your vocal folds buzz for <dogs> but not for <cats>.
Now, whether the plural sounds like [s] or [z] is entirely predictable. (If the sound before it is voiceless - like [t] or [p] or [f], with no buzz - it will come out as [s]. Otherwise it will come out as [z].)
So the English plural morpheme, the part-that-means plural in English, is /z/. You form plurals by adding /z/, and then devoice it to [s] if context demands it.
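The rule is mechanical enough to write down as a few lines of Python. This is only a rough sketch of the rule as stated above: the function names, the ASCII stand-ins for sounds, and the short list of voiceless consonants are all my own assumptions for illustration, and it deliberately ignores words like <buses>, where an extra vowel appears before the plural.

```python
# Hypothetical, simplified set of voiceless consonants (ASCII stand-ins, not full IPA).
VOICELESS = {"p", "t", "k", "f"}

def plural_sound(final_sound):
    """Surface form of the plural /z/: devoiced to [s] after a voiceless sound."""
    return "s" if final_sound in VOICELESS else "z"

def pluralise(sounds):
    """Append the plural to a word given as a list of sounds."""
    return sounds + [plural_sound(sounds[-1])]

print(pluralise(["d", "o", "g"]))  # ['d', 'o', 'g', 'z']
print(pluralise(["k", "a", "t"]))  # ['k', 'a', 't', 's']
```

Note that the speaker (or the function) never has to store [s] and [z] separately: one underlying form plus the context is enough.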
You probably didn't know that. I didn't before I studied linguistics - I'd never thought about it. And that's the point - a lot of our linguistic knowledge is entirely subconscious. Our writing system can reflect 'bit' or 'pit', because that makes a difference to the meaning. But you are simply never allowed to say [dogs] or [catz] in English: it has to be [dogz] and [cats]. So there's no point trying to write down the difference: as far as most people know, we form all plurals by adding /z/. Well, by adding <s>...
Which dialect?
Let's assume we've settled the question of phonology vs phonetics, and everyone is agreed. All children from now on can learn to read and write a 'phonetic' system which reflects how they speak. No more confusing silent letters, no more single letters standing for multiple sounds, no more sounds with multiple spellings. Because everyone pronounces English the same. Wait, they don't? Oh, well, in that case, we'll just use... Hmm. Whose dialect? Yours?
Let's give a couple of examples to illustrate this problem.
What about <r>? Rhotic or non-rhotic
In words like <cart> or <port> or <hear>, certain British dialects - including Standard Southern British English - have lost their <r>. These are called 'non-rhotic' dialects (rho being the name of the Greek letter r). Most other dialects of English - Scottish, Irish, American, Canadian... - have kept the <r> that is there in the spelling.
This means that <caught> and <court> are homophones for SSBE speakers: they sound the same. So clearly, any spelling reform should spell them both the same: perhaps as <co:t>. (The colon : stands in for the IPA length mark ː, which marks a long vowel.)
But that means that any other speaker of English has 2 words that sound different and are spelt the same. Perhaps we should mark the difference, and spell them <co:t> and <co:rt>. But now SSBE speakers still have 2 words that sound the same and for some reason have different spellings.
How many vowels?
Try saying the following words aloud: TRAP, PALM, LOT, THOUGHT. How many different vowels was that?
Dialect | TRAP | PALM | LOT | THOUGHT |
---|---|---|---|---|
Southern British | a | ɑ: | ɔ | o: |
Leeds | a | a: | ɔ | ɔ: |
Scottish | a | a | ɔ | ɔ |
General American (west) | æ | ɑ | ɑ | ɑ |
General American (east) | æ | ɑ | ɑ | ɔ |
For speakers of Southern British English, there are 4, differing in both quality and length. PALM and THOUGHT have long vowels, TRAP and LOT have short.
For speakers of Leeds English, there are also 4, though these differ only in length: PALM is a long version of TRAP, and THOUGHT a long version of LOT.
For speakers of Scottish English, there are only two: the same vowel that Leeds and Southern British English have in TRAP is also used in PALM, and the one from LOT is used in THOUGHT.
General American 'West' speakers also have only 2 vowels for these 4 words, but don't put the split in the same place as speakers of Scottish English. They pronounce PALM, LOT and THOUGHT all with the same vowel, the short equivalent of the Southern English PALM.
General American 'East' speakers have 3 vowels. The same vowel for TRAP as 'West' speakers, and the same vowel for PALM and LOT, but THOUGHT is pronounced with [ɔ], the LOT vowel of Southern British and Leeds speech.
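If you would rather count than listen, the comparison above can be checked with a short Python sketch. The dictionary simply transcribes the vowels described in the preceding paragraphs; the names `VOWELS`, `vowel_count` and `merged_words` are made up for this illustration.

```python
# Vowel used in each keyword, per dialect, as described in the text above.
VOWELS = {
    "Southern British": {"TRAP": "a", "PALM": "ɑ:", "LOT": "ɔ", "THOUGHT": "o:"},
    "Leeds": {"TRAP": "a", "PALM": "a:", "LOT": "ɔ", "THOUGHT": "ɔ:"},
    "Scottish": {"TRAP": "a", "PALM": "a", "LOT": "ɔ", "THOUGHT": "ɔ"},
    "General American (west)": {"TRAP": "æ", "PALM": "ɑ", "LOT": "ɑ", "THOUGHT": "ɑ"},
    "General American (east)": {"TRAP": "æ", "PALM": "ɑ", "LOT": "ɑ", "THOUGHT": "ɔ"},
}

def vowel_count(dialect):
    """Number of distinct vowels the dialect uses across the four keywords."""
    return len(set(VOWELS[dialect].values()))

def merged_words(dialect):
    """Groups of keywords that share a vowel in this dialect."""
    groups = {}
    for word, vowel in VOWELS[dialect].items():
        groups.setdefault(vowel, []).append(word)
    return [words for words in groups.values() if len(words) > 1]

for dialect in VOWELS:
    print(dialect, vowel_count(dialect), merged_words(dialect))
```

Each group that `merged_words` returns is a set of words that a purely phonetic spelling would have to write with the same vowel symbol for that dialect, and the groups are different for nearly every dialect.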
So if we decide to stop using <a>, <al>, <o> and <ough> for these vowels - which we can all agree is a ridiculous system - what do we use instead?
Now if we write <trap>, <paam>, <lot> and <thoot>, the Leeds speakers will be happy. Southern British speakers will have to learn that "you pronounce aa like [ɑ:], and oo like [o:]", but that's not too complicated a rule to learn. Southern British and Leeds speakers have the same system of vowels; they just have a little phonetic variation.
But what about American speakers? Why do they now have to write PALM with <aa> but LOT with <o>? After all, they pronounce those both the same. So the American system will perhaps have <trap>, <pom>, <lot>, and <thot> - though that's a bit confusing for the speakers who currently pronounce LOT and THOUGHT differently. Why should words like COT and CAUGHT be spelt the same when they are pronounced differently? Why is their writing system now ambiguous? So maybe there should be another vowel for THOUGHT, perhaps <oo> like the British.
But then how will 'West' speakers know when to write <o> and when to write <oo>? After all, COT and CAUGHT are pronounced the same in their dialect.
Just as in the current system, they would have to memorise it for each word.
A minor digression into orthography design
The Latin alphabet, which we use for English, only had 5 vowel symbols, because Latin only had 5 vowels.
English has around 20 vowels, depending on the dialect, so we'll probably need more for a proper reform: we'll not just need to fix the spelling, but introduce new symbols. New keyboards, new text inputting software, new fonts, new alphabet charts, a complete re-education of the existing literate populace... this could be an expensive project.
An easier solution would be to combine the existing 5 vowels, since pairs of vowels give us around 25 symbols. Problematic in words like <hiatus>, where we have two vowels next to one another, but solvable. Maybe all vowels are a pair of symbols, and we never use the original 5 on their own. Unfortunately, we'll have to leave our one sound, one symbol ideal.
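To make the arithmetic concrete, here is a small Python sketch that builds every ordered pair of the five vowel letters and then looks for places where an existing spelling already contains one of those pairs. The names `DIGRAPHS` and `clashes` are hypothetical, and treating every adjacent vowel pair in current spelling as a clash is a simplifying assumption.

```python
from itertools import product

VOWEL_LETTERS = "aeiou"

# Every ordered pair of vowel letters: "aa", "ae", ..., "uu".
DIGRAPHS = ["".join(pair) for pair in product(VOWEL_LETTERS, repeat=2)]
print(len(DIGRAPHS))  # 25

def clashes(word):
    """Adjacent letter pairs in `word` that would be read as one digraph."""
    pairs = set(DIGRAPHS)
    return [word[i:i + 2] for i in range(len(word) - 1) if word[i:i + 2] in pairs]

print(clashes("hiatus"))  # ['ia']
print(clashes("cat"))     # []
```

So <hiatus> would need some extra convention to show that its <i> and <a> are two separate vowels rather than one digraph.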
Loss of information
Even within a single dialect, at a single point in time, with a modified writing system which is purely phonological, our alphabet doesn't capture all of the information in speech.
Time vs thyme
More common words are pronounced with shorter duration. But Gahl (2008) found that this effect only applies to the word itself - not to its homophones. So <time> is shortened, because it's very common, but <thyme> is not. So the shortening is not just because you are more familiar with the motions needed to pronounce the word.
Other missing cues
Duration is just one of many cues that we have in speech but lack in writing. Others include stress (compare "a permit" to "to permit"), and intonation (try saying "It's over there" as if it were news, then as if you had said it 3 times already). Leaving aside purely auditory cues, there are things like facial expressions and context which make a big difference.
An example dialogue:
"Pass me the bay leaves."
"Heh, for a second I thought you said Bailey's."
"Why would you mishear bay leaves as baileys? How would I put a castle in this? Oh, Bailey's."
Both participants could correct their misunderstandings from context alone, but also viewing confusion on your listener's face allows you to rephrase. Writing lacks this ability to correct for your audience.
We lose a lot of disambiguating information when we are faced with only the written form. Spelling homophones differently lets us disambiguate in other ways, just like we don't need to pronounce smilies, but still find them useful in text.
What's the point?
So it turns out that we do need to answer the question: what is the aim of the spelling reform?
Phonetic system...
If you want everyone to be able to faithfully represent the way they speak, then we have a system for that: it's called the IPA. It's not perfect, but it's certainly good enough for non-technical use. Why not teach everyone to be able to use it?
...or standard of communication?
Do you want everyone to be able to communicate with each other, without having to spend lots of effort deciphering dialectal variation? In which case, you will have to pick a single dialect to represent. Of course, it will be Standard Southern British English (at least in the UK), because it has been for centuries. There are myriad social problems with enforcing this hierarchy onto not just vocabulary and grammar choice (as we already do), but also the minutiae of how words are pronounced. If having a more phonetic writing system truly helps with children's education that much, how fair is it to give Southerners an advantage?
If we'd had this spelling reform 20 years ago, we'd now have 2 spellings to reflect a difference that (mostly) only older speakers still make, and then only in a few dialects.
After all, how did we end up with words like <knight> in the first place? Because once upon a time, we had words that started with /kn/, and ended with /xt/. (/x/ is pronounced like <ch> in <Bach>.) The pronunciation drifted, but the spelling didn't, because older writers wanted to communicate with younger writers, and younger writers wanted to be able to read the documents of the past.
This is still applicable today - would we really update every piece of text currently written in English to a new system? Doubtful. Is the gain of a more logical writing system worth the loss of our historical understanding?
The people who don't benefit
There do exist people who care about people spelling 'correctly', not because they think the English spelling is already perfect, but because consistent spelling and punctuation genuinely make reading easier for them.
Consistent, not in the sense of agreeing with the pronunciation, but in the sense of agreeing with their memorised representation of that meaning. <There> and <their> are different words with different meanings, and if the wrong one is used, these individuals will understand the wrong meaning, and have to backtrack when it becomes obvious that something doesn't make sense.
Not everyone pronounces words as they read (out loud or in their heads); some people go straight from written form to meaning. For them, using a new spelling (especially if it is already in use for a different word) will make reading more difficult.
At this point, who is it that we should aim to please? That is a question regarding numbers in each group, cost of reform, size of reading fluency changes for each group, and many other factors that I don't have data for.
Conclusion
The answer may well end up that spelling reform is worth it; that it advantages more people than it disadvantages; that we can answer the questions of phonology, morphology, and sociolinguistic status. But it seems to me that we've got a lot more research to do before we can be sure we won't just make things worse through tinkering.
As xkcd points out:
I leave you with the fact that <salmon> used to be spelled as it is pronounced, without an <l>, until someone decided that it would be better if it looked more like the Latin, because Latin Is Better.
Are we sure we know what we're doing?
References
Gahl, Susanne. "Time and thyme are not homophones: The effect of lemma frequency on word durations in spontaneous speech." Language 84, no. 3 (2008): 474-496.
Coulmas, Florian. Writing systems: An introduction to their linguistic analysis. Cambridge University Press, 2003.
(A good read for anyone interested in, well, the linguistic analysis of writing systems. The source of the salmon fact.)