OmniMouse: Scaling properties of multi-modal, multi-task Brain Models on 150B Neural Tokens
A study recently published on arXiv presents OmniMouse, a model used to examine
the scaling properties of multi-modal, multi-task brain models trained on a large
dataset of neural recordings. The work looks at how performance in brain activity
modeling changes as the amount of training data and the size of the network grow.
The study draws on a dataset of 3.1 million neurons from the visual cortex of
73 mice, collected over 323 sessions. In total, the dataset contains more than
150 billion neural tokens recorded during a range of stimuli, including natural
movies, images, and behavioral interactions. The primary aim of the research is
to test whether the scaling principles that have driven progress in language and
vision AI also apply to the modeling of brain activity.
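To put the dataset figures in proportion, the short sketch below derives a few averages from the numbers quoted above. The variable names are illustrative, and the token count is treated as a lower bound because the article only says "more than 150 billion". The averages say nothing about how neurons or tokens are actually distributed across mice and sessions.

```python
# Figures quoted in the article. TOKENS is a lower bound ("more than 150 billion").
NEURONS = 3_100_000
MICE = 73
SESSIONS = 323
TOKENS = 150_000_000_000

# Simple averages; the real per-mouse and per-neuron distributions are not given.
neurons_per_mouse = NEURONS / MICE
sessions_per_mouse = SESSIONS / MICE
tokens_per_neuron = TOKENS / NEURONS

print(f"neurons per mouse (avg):  {neurons_per_mouse:,.0f}")
print(f"sessions per mouse (avg): {sessions_per_mouse:.1f}")
print(f"tokens per neuron (avg):  at least {tokens_per_neuron:,.0f}")
```

On these figures, the average works out to roughly 42,000 neurons and between four and five sessions per mouse, and at least about 48,000 tokens per recorded neuron.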
Key Findings
The OmniMouse model supports three regimes at test time: neural prediction,
behavioral decoding, and neural forecasting. It can also handle any combination
of these three tasks. The study reports the following key findings:
- State-of-the-Art Performance: OmniMouse outperforms specialized baselines
  across nearly all evaluation regimes.
- Data Scaling: Performance improves reliably as the volume of data grows.
  This matches what is known from other areas of AI, but model size behaves
  differently.
- Saturation of Gains: Unlike the traditional AI scaling narrative, in which
  increasing model size drives significant progress, gains from larger models
  tend to saturate in brain modeling.
- Data Limitation: Even with vast recordings from a relatively simple system
  such as the mouse visual cortex, models remain data-limited. This contrasts
  sharply with language and computer vision.
- Phase Transitions: The research suggests that phase transitions may be
  possible in neural modeling: larger and richer datasets might unlock new
  capabilities, similar to the emergent properties observed in large language
  models.
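The contrast between the Data Scaling and Saturation of Gains findings can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the numbers are synthetic and are not results from the study, and the functional forms (a pure power law for data, a power law with a floor for model size) are a common way to describe scaling curves rather than the authors' analysis. The helper names `fit_power_law` and `local_slopes` are hypothetical.

```python
import numpy as np

def fit_power_law(x, y):
    """Fit y = a * x**b by least squares in log-log space; returns (a, b)."""
    slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
    return float(np.exp(intercept)), float(slope)

def local_slopes(x, y):
    """Log-log slope between consecutive points; values near 0 mean saturation."""
    return np.diff(np.log(y)) / np.diff(np.log(x))

# SYNTHETIC numbers for illustration only, NOT results from the study.
tokens = np.array([1e9, 1e10, 1e11])
loss_vs_data = 5.0 * tokens ** -0.1          # pure power law: keeps improving

params = np.array([1e6, 1e7, 1e8, 1e9])
loss_vs_size = 0.5 + 20.0 * params ** -0.5   # power law plus a floor: flattens

a, b = fit_power_law(tokens, loss_vs_data)
slopes = local_slopes(params, loss_vs_size)

print(f"data-scaling exponent: {b:.3f}")
print("model-size local slopes:", np.round(slopes, 4))
```

In this toy setup, the data curve has a constant log-log slope, so each tenfold increase in tokens buys the same relative improvement. The model-size curve has local slopes that shrink toward zero, which is what saturation looks like on a log-log plot.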
Implications for Future Research
The findings from the OmniMouse study have implications for future research in
both neuroscience and AI. Because performance scales consistently with data
while gains from model size saturate, the results raise questions about how to
collect and use larger datasets effectively and which methods could improve
model performance. The research may also inform work on brain-computer
interfaces and contribute to a deeper understanding of neural mechanisms.
Researchers interested in the code and methods behind OmniMouse can find them
on GitHub at https://github.com/enigma-brain/omnimouse.
