Frequently Asked Questions

1. How come segmental descriptions work so well to represent speech, if your anti-segment position is correct?
    I have several responses. First, segments do not do that well. They represent almost nothing about temporal properties of speech, for example. And some segments represent articulatory target states ([a], [d], etc.) while others represent motions ([w], [j], etc.). And because they are letter-like units, they are far more easily permutable than actual speech sounds. Constraining the perfect permutability of letters is the primary task of formal phonology. It seems misguided to postulate completely permutable units when only a minute fraction of the permutations are plausible words. (See my chapter ``Graphical basis of phones and phonemes'' for a fuller argument about limits on permutation.)
    Second, they seem to work very well for us, I agree, but only after we have received years of literacy training. Don't forget all those years starting from age 2-4 into adulthood when you were trained for many hours every week to read and write skillfully using an alphabet. That is the ONLY reason they are easy and natural for us to use.
    Third, letters were engineered as a scheme for representing language for people who already speak the language. So, if retrieving the appropriate allophonic variant is essentially automatic (after a few years of training), then phonemes seem like a good idea. But the efficiency constraints related to writing language on paper are very different from the efficiency constraints on a human brain.

2. Perhaps phonetic representations need more detail, but is there any evidence against the proposal that speakers use a hybrid representation consisting of BOTH a continuous-time, rich and detailed representation and ALSO a segmental phonetic or phonological representation much like that traditionally proposed?
    I completely agree that many speakers employ both kinds of representation. Certainly all of us alphabet-literate people have both kinds of mental representation for language. But consider this: What is the evidence for abstract, segmental representations? It is ENTIRELY (I claim) intuitive. What is this segmental representation used for psychologically? -- that is, for what functions is it required in our processing of spoken language? Any language can be spoken perfectly with no knowledge of writing. I have been able to find no functions aside from those of reading and writing (and, of course, for all our conscious analytical thoughts about language). If it is true that alphabetical representations are only relevant for literacy purposes and for the conscious linguistic ruminations of literate people (like us linguists), then clearly these representations play only a very peripheral and supplementary role in the human use of language. As far as I can tell, they are irrelevant to basic human skills in speech communication. It is as though all the real linguistic processing (eg, recognition and understanding) is done using rich representations and the segmental description is simply tacked on the end, so our conscious representations are all about segments, but those segments play no processing role.

3. Isn't this just another anti-phonology position by a phonetician? Aren't you simply denying that phonology and other mental representations exist?
    I am not denying phonology. But I do insist that (A) phonology requires so many degrees of freedom that it is effectively impossible for the patterns to be summarized by any linguist with paper and pencil using some short list of features and rule formalisms, and (B) there are so many variants to be found for any potential linguistic unit that completion of a rule list is out of the question. (Most generative phonologists seem to wear blinders to keep from looking at all these variants. They refuse to use tape recordings for data and depend on somebody's supposedly `canonical' version of each lexical entry.) So good quality research on phonology is certainly possible, but it will require looking much closer at the data than phonologists are accustomed to doing. The phonology is created by a community of speakers over time. There are phoneme-like units in most languages, of course, but they are properties of the social system. Their shape is accomplished by generations of speakers, and their unit-like components do not have unitary status for the real-time processing of speech.

4. What about contrast? You seem to have no way to distinguish sounds that contrast from those that do not.
    That's exactly what I am claiming with respect to realtime processing of language. The idea of contrast is that it provides a criterion for helping the analyst of a language keep the inventory of phonemes (that is, `mental letters') as small as possible without collapsing auditory differences that distinguish words. But since speech memory is very rich, the human memory system does not care what is contrastive and what is not. You will probably remember any details that your perceptual system is able to distinguish. The idea many linguists have that humans need to store language using `minimal representations' is completely misguided. The only situation where we need minimal representations is when we have to learn a graphical symbol for each variant. In this case, the fewer symbols we need to learn, the better.
    Contrast is relevant to phonology when we look at phonological generalizations at the global time scale of generations. English, for example, has approximately the same front vowel distinctions between [m] and [n] (mean, Min, men, mane, man) as it does between [b] and [t] (beat, bit, bet, bate, bat). These distinctions illustrate sound contrasts, of course, and the contrasts are similar between [m__n] and [b__t]. But several points need to be made. First, these similar `sound types' must not be interpreted as tokens that are used to `spell' words in memory. The words are mentally `spelled' very differently than this (using detailed auditory parameters). And, secondly, we must keep in mind that, although we may find approximately the same vowel contrasts in [m__n] and [b__t], we are not guaranteed to find them everywhere, either due to ``accidental gaps'' (eg, try [g__t] as in *, *, get, gate, *) or due to further regularities. For instance, most English speakers have fewer vowel contrasts before [r]. In my case, for the same front vowels before /r/, I have only beer vs. bear -- and mere vs. mare, seer vs. Sare, pier/peer vs. pair/pare, etc. This is to point out that it is not true that `once a contrast, always a contrast' (as implied by the phoneme concept because it is true of letters). So the notion of contrast is not relevant for language processing or linguistic memory, but it is relevant for orthography design and probably also as a longterm, slow bias in language change that reduces distinctiveness burdens on speakers and hearers.
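    The point about gaps can be made concrete with a small tabulation. What follows is only an illustrative sketch in Python: the word lists are the examples given above, while the frame labels and function names are invented for this sketch and are not part of any standard tool. It counts how many front-vowel words are attested in each consonant frame and checks whether every frame shows the same number of contrasts.

```python
# Front-vowel words attested in each consonant frame, copied from the
# examples in the text. The frame labels ("m_n", etc.) are ad hoc names
# invented for this sketch.
FRAMES = {
    "m_n": ["mean", "Min", "men", "mane", "man"],
    "b_t": ["beat", "bit", "bet", "bate", "bat"],
    "g_t": ["get", "gate"],   # the other three slots are listed as gaps
    "b_r": ["beer", "bear"],  # fewer contrasts before /r/ for this speaker
}


def contrast_counts(frames):
    """Return the number of distinct attested words per frame."""
    return {frame: len(set(words)) for frame, words in frames.items()}


def contrasts_are_uniform(frames):
    """True only if every frame shows the same number of contrasts."""
    return len(set(contrast_counts(frames).values())) == 1


if __name__ == "__main__":
    for frame, count in contrast_counts(FRAMES).items():
        print(frame, count)
    print("uniform across frames:", contrasts_are_uniform(FRAMES))
```

    If `once a contrast, always a contrast' were true, every frame would show the same count. With the word lists above, it does not.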

5. This theory is about phonology. It doesn't have any relevance for syntax and other areas of linguistic theory, does it?
    Sure it does. In a concrete way, Rich Phonology claims that phonetics and phonology do not provide any discreteness. So how can morphology or syntax be guaranteed to be discrete when the phonetic spellings of morphemes and phrases are not? It is difficult to see how this could be. Of course, the fact that the mental representations of words are not discrete does not, by itself, mean there could not be discreteness in the lexicon or grammar. But it does mean that the casual assumption of universal discreteness is untenable. Any discreteness will have to be explained and justified. Syntacticians generally take the discreteness of lexical and grammatical morphology for granted. This is no longer justifiable.
    At the more abstract level of approaches to cognitive science, Rich Phonology is an attempt to describe language in non-symbolic terms. Symbols of whatever sort are, it seems to me, nearly always metaphorical projections from orthographic notions -- letters, words or sentences. So just as the phoneme must yield to a high-dimensional description and turns out to be a mere `category' -- that is, a set of different things that we agree to treat as the same or call by the same name -- so it seems likely that higher level syntactic units (eg, Sentence, Subject, Verb, Preposition, etc.) will also turn out to be mere categories, not actual objects or tokens that are formally manipulable. Furthermore, just as realtime Rich Phonology depends on statistical regularities in a high-dimensional memory, it seems very likely that syntax too will turn out to depend on much richer memory than we thought. In fact, the theory of Construction Grammar (eg, Adele Goldberg) appears to offer just such an approach.

6. If you don't believe in phonetic transcription, how are linguists supposed to describe their data?
    Linguists should describe their data using whatever tools seem appropriate and useful. Sometimes orthographic transcription may be sufficient, but more often phonetic transcription is required. In many cases, sound spectrograms or other instrumental representations will be necessary. There is nothing wrong with using phonetic transcription. We are all alphabet-skilled, so such representations are easy for us to interpret and produce. But, of course, for certain issues (eg, timing patterns and many other things) a transcription does not provide suitable representations. We just need to keep in mind what the limitations of any representational method are. Transcriptions do NOT represent the psychological form of words and phrases in memory.

7. What advice do you have for a graduate student in linguistics interested in phonology?
    I have three kinds of advice. First, don't do a ``theoretical'' thesis. Unfortunately, Optimality Theory is built on a foundation of sand (like all the other formal theories of linguistics). There is no evidence for any such formal system in human cognition (except for those formal systems that we create consciously), so very little of this work will amount to anything in the long run. Instead, a student should work on a descriptive problem using whatever notational system seems useful (eg, the Chomsky-Halle feature system or even notions from OT) as descriptive vocabulary. This way you have the greatest chance to make a long-term contribution despite any limitations due to the descriptive tools you employ.
    Secondly, make sure your research deals with real data - using only reliable segmental transcriptions (if you can find them, although they are few and far between), or much better, begin with tape recordings of real speakers. You can do your own transcriptions or else rely on behavioral research in speech production or speech perception. Too much phonology is dependent on very low quality, sketchy transcriptions that gloss over many data problems (such as massive variation in pronunciations).
    Third, if you are really interested in phonology, take more phonetics courses! That also implies that you take more math -- calculus, dynamical systems, linear algebra, statistics, etc. Various kinds of mathematics will continue to provide the tools we need to stretch our imaginations and lead to productive insights.

If you have any further questions, please contact me. I will do my best to answer.
       Bob Port
         January 11, 2008