The language(s) that we know shape the way we process and represent the speech that we hear. Since real-world speech recognition almost always takes place against some sort of background noise, we can ask whether the influence of linguistic knowledge and experience on speech processing extends to the particular challenges posed by speech-in-noise recognition, specifically the perceptual separation of speech from noise (Experiment Series 1) and the cognitive representation of speech and concurrent noise (Experiment Series 2). In Experiment Series 1, listeners were asked to recognize English sentences embedded in a background of competing speech that was either English (matched-language, English-in-English recognition) or another language (mismatched-language, e.g., English-in-Mandarin recognition). They were either native or non-native listeners of the target language (usually English) and were either familiar or unfamiliar with the background language (English, Mandarin, Dutch, or Croatian). This series of experiments demonstrated that matched-language speech-in-speech recognition is substantially harder than mismatched-language speech-in-speech recognition. Moreover, the magnitude of the mismatched-language benefit was modulated by long-term linguistic experience (specifically, listener familiarity with the background language), as well as by short-term adaptation to a consistent background language within a test session. Thus, we conclude that speech recognition in conditions that involve competing background speech engages higher-level, experience-dependent, language-specific knowledge in addition to general lower-level, signal-dependent processes of auditory stream segregation.
Experiment Series 2 then investigated the perceptual classification and memory encoding of spoken words and concurrently presented background noise. Converging evidence from eye-tracking-based time-course, speeded classification, and recognition memory paradigms strongly suggests parallel (rather than strictly sequential) processes of stream segregation and word identification, as well as integrated (rather than segregated) cognitive representations of speech presented in background noise. Taken together, this research is consistent with models of speech processing and representation that allow interactions between long-term, experience-dependent linguistic knowledge and instance-specific, environment-dependent sources of speech signal variability at multiple levels, ranging from relatively early/low levels of selective attention to relatively late/high levels of lexical encoding and retrieval.