Well. Without the hyperbole, he and his colleagues have discovered some factors which highly correlate with whether sound pairs have merged together or split apart in the history of a few current languages. And very interesting they are too, so let's have a look.
Background: sound change and inventories
As anyone who has tried to learn a foreign language will know, not all languages make use of the same sounds.
The number of distinct sounds a language can have varies from 11 to perhaps 140+ (there is quite a lot of debate about how best to describe !Kung). For comparison, English has 24 consonants and around 20 vowels (it varies massively between dialects).
These sounds do not stay constant over time - you only need to listen to BBC presenters from the 1950s to hear how "standard" British English has changed. But it is not just minor details of how the sounds are pronounced which vary - it can also be which sounds appear in the language at all. These abstract ideas of sounds - independent of exactly how they are pronounced - are called phonemes.
Let's take a look at some examples of English gaining and losing phonemes.
A tour through Old English
Take some irregular plurals of English: leaf and leaves, wife and wives, hoof and hooves. Why does the 'f' change to a 'v'?
Well, because in Old English 'f' and 'v' were considered the same sound. In Old English, you simply pronounced 'f' as 'v' when it showed up between two vowels, as in leaf-es and wif-es and hoof-es.
(In a similar way, in Modern English we pronounce the 'plural s' as 'z' when it shows up after a vowel: "bees are" is pronounced bee-zar, not bee-sar. Most native speakers are completely oblivious to this.)
So what changed? We stole the 'v' sound from the French. When the Normans invaded, we suddenly acquired words like 'village' and 'veal'. Those 'v' sounds aren't between two vowels. What to do? How can we predict when 'f' will be pronounced as 'f', and when it will be turned into a 'v'? We can't - so we've got a new phoneme for the language, which we have to learn like every other phoneme.
An alternative tour through Modern English
Speakers of most dialects of English pronounce the words cot and caught very differently. Cot rhymes with not, rot, hot, lot. Caught rhymes with bought, taught, thought - and if you have a Standard British accent, also with sort, port, wart.
However, speakers of many dialects - notably in the western United States - pronounce both of these words the same. Where speakers used to have two different phonemes, in these dialects they now only have one. (Usually sounding something like a long version of cot.)
So when does sound change happen?
When do you gain extra phonemes? Just when the Normans invade? And when do you lose existing phonemes?
The answers to those questions are nowhere near settled. (Obviously it's not only the Normans who cause sound changes, but how important is interacting with speakers of other languages?) But we do now have a good predictor of whether two similar sounds - like 'f' and 'v' - are likely to get closer together and merge into the same sound, or get farther apart and split into phonemes in their own right, thanks to a study by Wedel, Kaplan and Jackson.
The study
Wedel et al. created a corpus from 9 languages: English (Received Pronunciation and Standard American), Korean, French, German, Dutch, Slovak, Spanish, and Hong Kong Cantonese. For each of these languages there is a phonemically transcribed word list available (see this post for my list of links to such word lists). In other words, the contrasting sounds of each word are represented, instead of its spelling.
For each phoneme pair in a language, they recorded whether or not it had merged in another dialect of the language - like cot/caught has. In total, their dataset contained 56 phoneme pairs that have merged somewhere, and 578 that have not. Then they looked at factors which predicted where these splits and mergers were most likely to occur.
The main factor was functional load.
Functional load
Functional load is a measure of how much work a phoneme contrast is doing in a language. Wedel et al. (under review) looked at several ways of measuring this, but in the end found that the simplest was also the best: how many pairs of words depend on two phonemes being different?
A pair of words which only differ by a single sound is called a minimal pair. There are lots of minimal pairs in English which depend on the difference between p and t: pin/tin, pill/till, pen/ten, pun/ton, pan/tan, pool/tool, pop/top, pear/tear, pipe/type, peel/teal, par/tar, pact/tact,...
But the difference between th and f is much less important. Pulling out the one-syllable words from a little dictionary:
three/free, thresh/fresh, thrill/frill, thorn/faun, thread/Fred, thaw/for, thin/fin
but
thank/-, thatch/-, thief/-, theme/-, thick/-, thigh/-, thing/-, think/-, third/-, threat/-, throat/-, throne/-, through/-, throw/-, thud/-, thug/-, thumb/-, thump/-
I would guess there are only a couple of dozen pairs of words in my whole vocabulary which depend on the difference between f and th, whereas I would guess there are hundreds which depend on the p/t distinction. So p/t has a higher functional load: it is bearing more of the weight of word distinctions than f/th.
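For readers who like to see the counting spelled out, here is a minimal sketch in Python of how minimal pairs for a phoneme contrast could be counted. It is not Wedel et al.'s code: the toy lexicon, its transcriptions and the function name minimal_pairs are all made up for illustration, and a real count would run over one of the full phonemically transcribed word lists.

```python
from itertools import combinations

# Hypothetical toy lexicon: spelling -> phonemic transcription
# (one symbol per phoneme). A real study would use a full word list.
LEXICON = {
    "pin": ("p", "ɪ", "n"),
    "tin": ("t", "ɪ", "n"),
    "pan": ("p", "æ", "n"),
    "tan": ("t", "æ", "n"),
    "pill": ("p", "ɪ", "l"),
    "till": ("t", "ɪ", "l"),
    "thin": ("θ", "ɪ", "n"),
    "fin": ("f", "ɪ", "n"),
    "thick": ("θ", "ɪ", "k"),
    "thumb": ("θ", "ʌ", "m"),
}

def minimal_pairs(lexicon, a, b):
    """Return the word pairs that differ in exactly one position,
    where one word has phoneme `a` and the other has phoneme `b`."""
    pairs = []
    for (w1, p1), (w2, p2) in combinations(sorted(lexicon.items()), 2):
        if len(p1) != len(p2):
            continue  # words of different length cannot be a minimal pair here
        diffs = [(x, y) for x, y in zip(p1, p2) if x != y]
        if len(diffs) == 1 and set(diffs[0]) == {a, b}:
            pairs.append((w1, w2))
    return pairs

print(minimal_pairs(LEXICON, "p", "t"))  # three pairs in this toy lexicon
print(minimal_pairs(LEXICON, "θ", "f"))  # one pair in this toy lexicon
```

In this toy lexicon p/t distinguishes three pairs (pan/tan, pill/till, pin/tin) and th/f only one (fin/thin), which mirrors the lopsidedness described above on a very small scale.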
Functional load and merger
Wedel et al. found that the more minimal pairs depended on a phoneme contrast, the less likely the phonemes were to merge together. And if a phoneme pair only distinguished a few minimal pairs, then it was much more likely to merge into a single sound.
For this reason, you may have heard - or be yourself - a speaker of English who pronounces all th sounds as f: there are very few words for which this will cause confusion. However, I have never heard anyone pronounce all p sounds as t!
Other factors
Wedel et al. found a few other factors which also played a role - we will just look quickly at syntactic category.
Syntactic category
The syntactic category of a word is whether it is a noun, a verb, an adjective, and so on.
If two words are in the same category, it is more important to hang on to the distinction between them. If you pronounce the noun "pin" as the noun "tin" - "Pass me the tin", "I want a red tin" - then you will quickly cause mass confusion.
However, if from now on you pronounce the adjective "thin" as the noun "fin" - "I need a fin piece of wood", "Pass me the fin one" - no-one is going to think that you are talking about dolphins.
Wedel et al. found that functional load within a syntactic category was more important than functional load across different categories.
Conclusion
So just as common sense would predict, speakers are more likely to keep the distinction between two sounds when:
- lots of words are distinguished by the difference
- ...particularly words where you can't figure it out from the grammar
But it is always fantastic when common sense is backed up by data. This is a really cool result.
Further reading
For readers who are interested in the more technical details, here is a quote from Wedel, Jackson and Kaplan, discussing what other factors play a role:
"We present evidence that within our dataset,
(i) the number of minimal pairs counted by lemma rather than by surface-form (which may include affixes) is more predictive of merger probability;
(ii) the number of minimal lemma pairs that share syntactic category is a stronger predictor of merger than the number of those with divergent syntactic categories, and
(iii) that the number of minimal lemma pairs with members of similar frequency is a stronger predictor of merger than the number of those with more divergent frequencies."
And links to Andrew Wedel's papers can be found here.