Recent studies show that by 11 months, language experience constrains infants' detection of audio-visual (AV) correspondences in non-native speech (Pons et al., 2009; Best et al., 2010, 2011). But what is it that they have learned about native AV relations, and how does it bias their responses to non-native speech? An array of theoretical views offers differing hypotheses. The Learned Association view posits that AV associations require experience to develop, i.e., they should emerge in older infants only for audio and visual speech patterns that co-occur in native speech (e.g., Kuhl & Meltzoff, 1982; Massaro, 1998). The Intersensory Narrowing account assumes young infants are innately sensitive to intersensory relations in speech, but experience narrows intersensory perception such that older infants remain sensitive only to native AV correspondences (Pons et al., 2009). The premise of the Amodal Articulatory Perception account is that infants perceive information about the actions of the speech articulators. They can distinguish speech segments produced by different articulators (between-organ) even in non-native speech, but to distinguish contrasts made by a single articulator (within-organ) they must experience them in native speech (Best & McRoberts, 2003; Goldstein & Fowler, 2003; Tyler et al., 2008). The Perceptual Assimilation Model (PAM: Best, 1995) posits instead that developmental changes in infants' perception of non-native contrasts depend on whether they perceive them as speech or as nonspeech sounds. We therefore tested English-learning infants' perception of cross-modal articulator congruency in one native and two non-native between-organ contrasts, one of which English adults hear as speech (ejective stops), the other as nonspeech (clicks). By the Learned Association view, older infants should detect congruency more reliably than younger infants for native speech contrasts, and neither age group should do so for non-native contrasts.
But if Intersensory Narrowing is correct, younger infants should detect congruency in both native and non-native contrasts, while older infants should do so only for the native contrast. The Amodal Articulatory account predicts sensitivity to articulator congruency in all between-organ contrasts at both ages, whereas the PAM account predicts different developmental trajectories for non-native contrasts perceived as speech versus those perceived as nonspeech. Results for the non-native ejective contrast and the native contrast support the Amodal Articulatory account, whereas results for the clicks, and for the non-native within-organ contrast used in the prior report (Pons et al., 2009), support the Intersensory Narrowing account. The difference in findings for the two non-native contrasts is compatible with PAM. The possibility of an integrated account combining those three views will be discussed.