Roger Levy · [CSC] Life at the edge of the lexicon: Productive knowledge and direct experience in language processing and acquisition
The infinite generative potential of human language derives from our ability to analyze complex linguistic input into simpler units, store those units in memory, and productively recombine them into new expressions. This is the cycle of comprehension, acquisition, and production through which human languages persist and change over the history of a speech community. But what are these units of comprehension, acquisition, and production? The tension between combinatorial and holistic representation of complex linguistic expressions plays a central role in debates on language processing and acquisition. Here I describe work combining probabilistic models and new large datasets to investigate this tension and uncover the respective contributions of productive knowledge and direct experience. In processing, we focus on binomial expressions (“salt and pepper” vs. “pepper and salt”), finding a frequency-driven tradeoff between the two knowledge sources and a frequency-dependent level of idiosyncrasy in ordering preference across the binomials of the language. The former is explained by a rational model of learning from limited experience; the latter we account for with an evolutionary model of the transmission of ordering preferences over time. In acquisition, we focus on determiner-noun combinations (“the ball”, “a cold”) and develop a novel Bayesian model to infer the strength of the contribution of productive knowledge evident in child speech. We find evidence of low initial levels of productivity and higher levels later in development, consistent with the hypothesis that the earliest months of multi-word speech are not generated using rich grammatical knowledge, but that grammatical productivity emerges rapidly thereafter.