QA-MoE: Towards a Continuous Reliability Spectrum with Quality-Aware Mixture of Experts for Robust Multimodal Sentiment Analysis
arXiv:2604.05704v1
Abstract: Multimodal Sentiment Analysis (MSA) aims to infer human sentiment from textual, acoustic, and visual signals. In real-world scenarios, however, multimodal inputs are often compromised by dynamic noise or modality missingness. Existing methods typically treat these imperfections as discrete cases or assume fixed corruption ratios, which limits their adaptability to continuously varying reliability conditions. To address this, we first introduce a Continuous Reliability Spectrum to unify missingness and quality degradation into a single framework. Building on this, we propose QA-MoE, a Quality-Aware Mixture-of-Experts framework that quantifies modality reliability via self-supervised aleatoric uncertainty. This mechanism explicitly guides expert routing, enabling the model to suppress error propagation from unreliable signals while preserving task-relevant information. Extensive experiments indicate that QA-MoE achieves competitive or state-of-the-art performance across diverse degradation scenarios and exhibits a promising One-Checkpoint-for-All property in practice.
Introduction
In recent years, the field of Multimodal Sentiment Analysis (MSA) has gained significant attention due to its potential applications in various domains, including social media monitoring, customer feedback analysis, and human-computer interaction. However, MSA faces challenges in dealing with noisy or incomplete input data, which can severely affect model performance.
The Challenge of Reliability in MSA
Existing MSA methods typically treat data imperfections as discrete cases or assume fixed corruption ratios, which limits their flexibility in real-world scenarios. Examples of such imperfections include:
- Missing audio signals in a video clip.
- Blurred images that hinder visual analysis.
- Textual data with typographical errors or missing words.
This rigid approach has made it difficult for models to adapt to the continuously varying reliability of different modalities.
Introducing the Continuous Reliability Spectrum
The paper introduces a Continuous Reliability Spectrum, which unifies modality missingness and quality degradation in a single framework. In effect, a missing modality and a degraded one are no longer handled as separate cases; reliability is treated as a quantity that varies continuously, which gives a more nuanced view of how much each modality can contribute to sentiment analysis in a given context.
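One way to picture such a spectrum is a single corruption function controlled by one reliability value. The sketch below is illustrative only: the abstract does not specify how the paper parameterizes the spectrum, so the function `degrade`, the `reliability` scale in [0, 1], and the mix of element dropout and Gaussian noise are all assumptions made for this example.

```python
import numpy as np


def degrade(features, reliability, rng, noise_scale=1.0):
    """Hypothetical corruption model, not the paper's definition.

    reliability = 1.0 returns the features unchanged (clean input).
    reliability = 0.0 returns all zeros (a fully missing modality).
    Values in between drop some elements and add noise to the rest.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    # Each element survives with probability equal to the reliability.
    keep = rng.random(features.shape) < reliability
    # Surviving elements get more noise as reliability falls.
    noise = rng.standard_normal(features.shape)
    noisy = features + (1.0 - reliability) * noise_scale * noise
    return np.where(keep, noisy, 0.0)


rng = np.random.default_rng(0)
acoustic = rng.standard_normal((4, 8))   # stand-in for acoustic features

clean = degrade(acoustic, 1.0, rng)      # identical to the input
missing = degrade(acoustic, 0.0, rng)    # every element zeroed
partial = degrade(acoustic, 0.35, rng)   # somewhere in between
```

Under this assumed model, "missing" and "noisy" are simply the two ends and the interior of one scale, so a model can be trained and evaluated at any point along it instead of at a few fixed corruption ratios.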
QA-MoE Framework
At the core of this research is QA-MoE, a Quality-Aware Mixture-of-Experts framework. It works by:
- Quantifying the reliability of each modality via self-supervised aleatoric uncertainty, that is, uncertainty that stems from noise in the input itself.
- Using that reliability estimate to explicitly guide expert routing.
- Suppressing error propagation from unreliable signals while preserving task-relevant information.
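The idea of uncertainty-guided routing can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the uncertainty estimator or the routing rule. Here aleatoric uncertainty is modeled with a common heteroscedastic Gaussian loss over a predicted log variance, there is one expert per modality, and the gate is a softmax whose logits are penalized by that log variance. All function names are hypothetical.

```python
import numpy as np


def gaussian_nll(target, mean, log_variance):
    """Heteroscedastic Gaussian negative log-likelihood (constant dropped).

    A large predicted log variance down-weights the squared error but is
    itself penalized, so the model only claims high uncertainty where the
    signal is genuinely hard to predict. In a self-supervised setup the
    target could be, for example, a reconstruction target (an assumption).
    """
    precision = np.exp(-log_variance)
    return 0.5 * (precision * (target - mean) ** 2 + log_variance)


def softmax(logits):
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=-1, keepdims=True)


def quality_aware_gate(router_logits, log_variance):
    """Assumed routing rule: subtracting the log variance multiplies each
    unnormalized gate by the precision 1 / sigma^2 of its modality."""
    return softmax(router_logits - log_variance)


def mix_experts(expert_outputs, gates):
    # expert_outputs: (batch, experts, dim); gates: (batch, experts)
    return np.einsum("be,bed->bd", gates, expert_outputs)


# One sample, three experts (text, acoustic, visual); visual is unreliable.
router_logits = np.zeros((1, 3))
log_variance = np.array([[0.0, 0.0, 10.0]])
gates = quality_aware_gate(router_logits, log_variance)

# The visual expert emits an outlier, but its gate is close to zero.
expert_outputs = np.array([[[1.0], [1.0], [-50.0]]])
fused = mix_experts(expert_outputs, gates)
```

In this example the two reliable experts share almost all of the gate weight, so the outlier from the unreliable expert barely moves the fused output. Because the gate changes smoothly with the log variance, the same rule covers mild degradation and near-total missingness without a separate code path.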
Experimental Results
The experiments reported in the paper indicate that QA-MoE achieves competitive or state-of-the-art performance across diverse degradation scenarios. The authors also report a promising One-Checkpoint-for-All property in practice, which suggests that a single trained checkpoint can be used across degradation conditions.
Conclusion
In summary, QA-MoE addresses modality missingness and quality degradation within a single framework by routing experts according to estimated modality reliability. The reported results suggest that this improves the adaptability and robustness of Multimodal Sentiment Analysis in real-world scenarios.
