Exemplar Retrieval Without Overhypothesis Induction: Limits of Distributional Sequence Learning in Early Word Learning
arXiv:2604.05243v1
Abstract
Background: Children do not simply learn that balls are round and blocks are square. They learn that shape is the kind of feature that tends to define object categories — a second-order generalisation known as an overhypothesis. This research investigates the learning mechanisms that enable such inductive leaps in early word learning.
Methods
We trained autoregressive transformer language models with 3.4 million to 25.6 million parameters on synthetic corpora in which shape is the stable feature dimension across object categories. Eight experimental conditions controlled for alternative explanations.
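To make the corpus design concrete, the following is a minimal sketch of a generator in which shape is stable within each noun category while another feature varies freely. The sentence frame, the feature vocabularies, the novel nouns, and the function name make_corpus are all assumptions for illustration; the abstract does not describe the actual corpus format.

```python
import random

# Hypothetical feature vocabularies (not taken from the paper).
SHAPES = ["round", "square", "star", "ring"]
COLOURS = ["red", "blue", "green", "yellow"]


def make_corpus(nouns, n_sentences, seed=0):
    """Build a toy corpus where each noun keeps one shape in every sentence.

    Colour is resampled per sentence, so shape is the only feature
    dimension that is stable within a category.
    """
    rng = random.Random(seed)
    # Assign each noun a single fixed shape (the first-order regularity).
    noun_shape = {noun: rng.choice(SHAPES) for noun in nouns}
    sentences = []
    for _ in range(n_sentences):
        noun = rng.choice(nouns)
        colour = rng.choice(COLOURS)  # unstable dimension
        sentences.append(
            f"this is a {noun} . the {noun} is {noun_shape[noun]} and {colour} ."
        )
    return noun_shape, sentences


noun_shape, corpus = make_corpus(["dax", "blick", "toma"], n_sentences=200)
print(corpus[0])
```

A learner that only memorises noun_shape has acquired the first-order facts; the overhypothesis is the further inference that a new noun will also have some single, stable shape.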
Results
Across 120 pre-registered runs, each model was assessed on a 1,040-item wug test battery. All models achieved 100% first-order exemplar retrieval. Second-order generalisation to novel nouns, however, remained at chance (50% to 52%). Equivalence testing confirmed that this performance was statistically equivalent to chance.
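An equivalence test asks whether accuracy lies within a small margin of chance, rather than merely failing to differ from it. The sketch below uses two one-sided z-tests (TOST) with a normal approximation. The equivalence margin of 0.05, the count of 530 correct items, and the function name tost_proportion are assumptions chosen for illustration; the abstract does not state which equivalence procedure or bounds were used.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tost_proportion(successes, n, chance=0.5, margin=0.05):
    """Two one-sided z-tests for equivalence of a proportion to chance.

    Returns the observed accuracy and the TOST p-value (the larger of the
    two one-sided p-values). A small p-value supports the claim that the
    true accuracy lies inside [chance - margin, chance + margin].
    """
    p_hat = successes / n
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    if se == 0.0:
        raise ValueError("normal approximation undefined at 0% or 100% accuracy")
    # H0 (lower): true accuracy <= chance - margin
    p_lower = 1.0 - normal_cdf((p_hat - (chance - margin)) / se)
    # H0 (upper): true accuracy >= chance + margin
    p_upper = normal_cdf((p_hat - (chance + margin)) / se)
    return p_hat, max(p_lower, p_upper)


# Illustrative only: 530 correct out of 1,040 items is a made-up count
# that falls inside the reported 50% to 52% range.
accuracy, p_tost = tost_proportion(530, 1040)
print(f"accuracy={accuracy:.3f}, TOST p={p_tost:.4f}")
```

With these assumed inputs both one-sided nulls are rejected at the 0.05 level, which is the pattern an "equivalent to chance" conclusion requires.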
Feature-Swap Diagnostic
A feature-swap diagnostic was conducted to evaluate the mechanisms employed by the models. The results indicated that the models predominantly relied on frame-to-feature template matching rather than developing a structured noun-to-domain-to-feature abstraction. This reliance suggests a fundamental limitation in the models’ capacity to generalise beyond first-order relationships.
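The logic of separating the two strategies can be illustrated with a toy probe. This is not the paper's procedure, which the abstract does not describe; the domains, frames, nouns, and function names below are all hypothetical. The idea is to pair a noun with a frame that normally accompanies a different feature dimension and see whether the prediction follows the frame or the noun.

```python
# Hypothetical toy lexicon: noun -> domain, domain -> diagnostic dimension.
NOUN_DOMAIN = {"dax": "solid", "blick": "substance"}
DOMAIN_DIMENSION = {"solid": "shape", "substance": "material"}

# Hypothetical frames and the dimension each co-occurred with in training.
FRAME_DIMENSION = {
    "the {noun} is shaped like": "shape",
    "the {noun} is made of": "material",
}


def template_matcher(frame, noun):
    """Frame-to-feature strategy: the frame alone decides; the noun is ignored."""
    return FRAME_DIMENSION[frame]


def abstraction_user(frame, noun):
    """Noun-to-domain-to-feature strategy: the noun's domain decides."""
    return DOMAIN_DIMENSION[NOUN_DOMAIN[noun]]


def swap_probe(predict):
    """Pair each noun with the frame typical of the other domain."""
    results = {}
    for noun, domain in NOUN_DOMAIN.items():
        for frame, dimension in FRAME_DIMENSION.items():
            if dimension != DOMAIN_DIMENSION[domain]:  # swapped pairing only
                results[noun] = predict(frame, noun)
    return results


print(swap_probe(template_matcher))  # predictions track the frame
print(swap_probe(abstraction_user))  # predictions track the noun's domain
```

Under the swap, the two strategies give opposite answers for every noun, so a model's behaviour on swapped items indicates which strategy it resembles.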
Conclusions
These results expose a limit of autoregressive distributional sequence learning under developmental-scale training conditions. The models retrieved specific exemplars perfectly but failed to generalise to new nouns, unlike children, who do make this second-order generalisation. Future research should explore alternative architectures and training paradigms that might bridge the gap between first-order retrieval and second-order generalisation.
Implications for Future Research
- Investigate alternative neural architectures that could enhance generalisation capabilities.
- Explore the role of contextual information in shaping object category understanding.
- Assess the impact of different training datasets on learning outcomes.
- Examine the integration of multimodal inputs to enrich learning experiences.
This study not only contributes to our understanding of machine learning mechanisms but also poses significant questions about the nature of cognitive development in children, offering a bridge for interdisciplinary research between artificial intelligence and developmental psychology.
