John Kingston, Shigeto Kawahara, Della Chambless, Michael Key, Daniel Mash, and Sarah Watsky
Three experiments are reported that collectively show that listeners perceive speech sounds as contrasting auditorily with neighboring sounds. Experiment 1 replicates the well-established finding that listeners categorize more of a [d–g] continuum as [g] after [l] than after [r]. Experiments 2 and 3 show that listeners discriminate stimuli in which the energy concentrations differ in frequency between the spectra of neighboring sounds better than those in which they do not differ. In Experiment 2, [alga–arda] pairs, in which the energy concentrations in the liquid-stop sequences are H(igh) L(ow)–LH, were more discriminable than [alda–arga] pairs, in which they are HH–LL. In Experiment 3, [da] and [ga] syllables were more easily discriminated when they were preceded by lower and higher pure tones, respectively (that is, tones that differed from the stops’ higher and lower F3 onset frequencies) than when they were preceded by H and L pure tones with similar frequencies. These discrimination results show that contrast with the target’s context exaggerates its perceived value when energy concentrations differ in frequency between the target’s spectrum and its context’s spectrum. Because contrast with its context does more than merely shift the criterion for categorizing the target, it cannot be produced by neural adaptation. The finding that nonspeech contexts exaggerate the perceived values of speech targets also rules out compensation for coarticulation by showing that their values depend on the proximal auditory qualities evoked by the stimuli’s acoustic properties, rather than the distal articulatory gestures.