Human infants spontaneously and effortlessly learn the language(s) spoken in their environment, yet we still have a very poor understanding of the mechanisms underlying this feat. Here, I will present an approach based on computational modeling of the problem that infants face. This problem takes the form of the simultaneous, unsupervised, and mutually constraining discovery of linguistic structures at many levels. I will illustrate the notion of learning synergies through several studies on the discovery of phonetic and lexical units from speech, using Natural Language Processing and Speech Technology techniques. I will discuss the consequences and challenges of this approach, in particular the need to study the learning problem across typologically distinct languages.