Modeling statistical insensitivity: Sources of suboptimal behavior

Annie Gagliardi, Naomi Feldman, Jeffrey Lidz

Children acquiring languages with noun classes (grammatical gender) have access to ample statistical information about how nouns are distributed across these classes, but their use of this information to classify novel nouns differs from the predictions of an optimal Bayesian classifier. We use rational analysis to investigate the hypothesis that children classify nouns optimally with respect to a distribution that does not match the surface distribution of statistical features in their input. We propose three ways in which children's apparent statistical insensitivity might arise, and find that each of the three can account for the difference between children's behavior and that of the optimal classifier. A fourth model, which combines two of these proposals, indicates that children's insensitivity is best modeled as a bias to ignore certain features during classification, rather than as an inability to encode those features during learning. These results provide insight into children's developing knowledge of noun classes and highlight the complex ways in which statistical information from the input interacts with children's learning processes.
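
The contrast between an optimal classifier and a classifier that ignores a feature can be made concrete with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the models or data of this work: the naive Bayes (conditional independence) form, the Laplace smoothing parameter `alpha`, the function names `train` and `posterior`, the `ignore` argument, and the toy nouns with their `semantic` and `initial` features are all hypothetical.

```python
from collections import Counter, defaultdict


def train(nouns, alpha=1.0):
    """Count classes and per-class feature values.

    nouns: list of (class_label, {feature_name: value}) pairs.
    alpha: Laplace smoothing constant (an assumption of this sketch).
    """
    class_counts = Counter()
    feat_counts = defaultdict(Counter)  # (class, feature) -> value counts
    values = defaultdict(set)           # feature -> set of observed values
    for label, feats in nouns:
        class_counts[label] += 1
        for name, value in feats.items():
            feat_counts[(label, name)][value] += 1
            values[name].add(value)
    return class_counts, feat_counts, values, alpha


def posterior(model, feats, ignore=()):
    """Return p(class | features), skipping any feature named in `ignore`."""
    class_counts, feat_counts, values, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        p = n / total  # prior p(class)
        for name, value in feats.items():
            # Skip ignored features and features never seen in training.
            if name in ignore or name not in values:
                continue
            counts = feat_counts[(label, name)]
            denom = sum(counts.values()) + alpha * len(values[name])
            p *= (counts[value] + alpha) / denom  # smoothed p(value | class)
        scores[label] = p
    z = sum(scores.values())
    return {label: s / z for label, s in scores.items()}


# Hypothetical toy input: two noun classes, two features per noun.
toy = [
    ("A", {"semantic": "animate", "initial": "b"}),
    ("A", {"semantic": "animate", "initial": "b"}),
    ("A", {"semantic": "animate", "initial": "d"}),
    ("B", {"semantic": "inanimate", "initial": "d"}),
    ("B", {"semantic": "inanimate", "initial": "d"}),
    ("B", {"semantic": "animate", "initial": "d"}),
]
model = train(toy)

# A novel noun whose two cues point to different classes.
novel = {"semantic": "animate", "initial": "d"}
full = posterior(model, novel)                        # uses both features
biased = posterior(model, novel, ignore={"semantic"})  # ignores one feature
print(full)
print(biased)
```

In this sketch both classifiers are trained on identical counts, so the feature is fully encoded during learning. The two posteriors differ only because the second call leaves the `semantic` feature out of the product at classification time, which is one simple way to express a bias to ignore a feature as opposed to a failure to encode it.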