The notion that children use statistical distributions present in the input to acquire various aspects of linguistic knowledge has received considerable recent attention. However, the role of the learner's initial state has been largely ignored in those studies. What remains unclear is the nature of the learner's contribution. At least two possibilities exist. One is that learners do nothing more than collect and compile accurately predictive statistics from the data, and that they have no antecedently specified set of possible structures (Elman et al. 1996; Tomasello 2000). On this view, the outcome of learning is based solely on the observed input distributions. A second possibility is that learners use statistics to identify particular abstract syntactic representations (Miller & Chomsky 1963; Pinker 1984; Yang 2006). On this view, children have predetermined linguistic knowledge of possible structures, and the acquired representations have deductive consequences beyond what can be derived from the observed statistical distributions alone. This dissertation examines how the environment interacts with the structure of the learner, and proposes a link between the distributional and nativist approaches to language acquisition. To investigate this more general question, we focus on how infants, adults, and neural networks acquire the phrase structure of their target language. This dissertation presents seven experiments, which show that adults and infants can project their generalizations to novel structures, while the Simple Recurrent Network fails to do so. Moreover, the experiments show that learners' generalizations go beyond the stimuli, but that those generalizations are constrained in the same ways that natural languages are constrained. This is compatible with the view that statistical learning interacts with an inherent representational system, but incompatible with the view that statistical learning is the sole mechanism by which the existence of phrase structure is discovered.
These findings provide novel evidence that statistical learning interacts with innate constraints on possible representations, and that learners have a deductive power that goes beyond the input data. They suggest that statistical learning serves merely as a method for mapping the surface string to an abstract representation, while innate knowledge specifies the range of possible grammars and structures.