Dual-branch Graph Domain Adaptation for Emotion Recognition

Dual-branch Graph Domain Adaptation for Cross-scenario Multi-modal Emotion Recognition

arXiv:2603.26840v1 (announce type: cross)

Abstract

Multimodal Emotion Recognition in Conversations (MERC) aims to predict speakers’ emotional states in multi-turn dialogues through text, audio, and visual cues. In real-world settings, conversation scenarios differ significantly in speakers, topics, styles, and noise levels. Existing MERC methods generally neglect these cross-scenario variations, limiting their ability to transfer models trained on a source domain to unseen target domains.

Introduction

To address the challenges posed by varying conversation scenarios, we propose a Dual-branch Graph Domain Adaptation framework (DGDA) for multimodal emotion recognition under cross-scenario conditions. The goal is a model that generalizes from a source scenario to target scenarios that differ in speakers, topics, styles, and noise levels, so that recognition accuracy holds up across varied conversational environments.

Methodology

The DGDA framework comprises several key components:

Emotion Interaction Graph: We first construct an emotion interaction graph over the utterances of a dialogue. The graph characterizes the complex emotional dependencies among utterances.
Dual-branch Encoder: Our dual-branch encoder consists of a Hypergraph Neural Network (HGNN) and a Path Neural Network (PathNN). The HGNN explicitly models multivariate relationships among emotions, while the PathNN captures global dependencies in the dialogue.
Domain Adversarial Discriminator: To enable out-of-domain generalization, we introduce a domain adversarial discriminator that learns invariant representations across domains. This component helps the model remain robust to variations in the conversation context.
Regularization Loss: To mitigate the impact of noisy labels, we incorporate a regularization loss that suppresses their negative influence on training, which leads to more accurate emotion predictions.
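
To make the HGNN branch more concrete, here is a minimal sketch of one hypergraph propagation step in the style of the standard HGNN layer, D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2} X. It is an illustration under assumptions, not the authors' implementation: the function name hypergraph_propagate is hypothetical, the learnable projection and nonlinearity are omitted, and how utterances are grouped into hyperedges (for example by speaker or by context window) is not stated in this summary.

```python
import math

def hypergraph_propagate(features, hyperedges, edge_weights=None):
    """One parameter-free HGNN-style propagation step (illustrative only).

    features:   list of per-utterance feature vectors (n x d)
    hyperedges: list of hyperedges, each a list of distinct utterance indices
    Computes D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2} X without a weight matrix.
    """
    n, d = len(features), len(features[0])
    if edge_weights is None:
        edge_weights = [1.0] * len(hyperedges)

    # Vertex degree: total weight of the hyperedges that contain the vertex.
    deg_v = [0.0] * n
    for edge, w in zip(hyperedges, edge_weights):
        for v in edge:
            deg_v[v] += w
    inv_sqrt = [1.0 / math.sqrt(x) if x > 0 else 0.0 for x in deg_v]

    out = [[0.0] * d for _ in range(n)]
    for edge, w in zip(hyperedges, edge_weights):
        if not edge:
            continue  # an empty hyperedge carries no message
        # Gather: average the degree-normalized features of the members.
        msg = [0.0] * d
        for v in edge:
            for k in range(d):
                msg[k] += inv_sqrt[v] * features[v][k]
        msg = [m / len(edge) for m in msg]
        # Scatter: send the hyperedge message back to every member.
        for v in edge:
            for k in range(d):
                out[v][k] += inv_sqrt[v] * w * msg[k]
    return out

# Toy dialogue: three utterances joined by a single hyperedge.
features = [[3.0], [0.0], [0.0]]
hyperedges = [[0, 1, 2]]
smoothed = hypergraph_propagate(features, hyperedges)
```

In this toy case every utterance ends up with the average of the three input features, which shows how a hyperedge lets more than two utterances exchange information in a single step, unlike an ordinary pairwise edge.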

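The domain adversarial discriminator can be illustrated with a gradient reversal step, a common way to train such discriminators. This summary does not say which adversarial training scheme DGDA uses, so treat the following as a sketch under that assumption. The scalar toy model and the names grad_reverse_backward and adversarial_step are hypothetical; in an autograd framework the reversal would normally be written as a custom backward function.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def grad_reverse_backward(grad, lam):
    """Backward pass of a gradient reversal layer.

    The forward pass is the identity; the backward pass flips the sign
    of the incoming gradient and scales it by lam.
    """
    return -lam * grad

def adversarial_step(w, v, x, domain, lam=1.0, lr=0.1):
    """One manual SGD step on a scalar toy model.

    feature f = w * x        (encoder with parameter w)
    p = sigmoid(v * f)       (domain discriminator with parameter v)
    loss = binary cross-entropy of p against the domain label
           (0.0 = source, 1.0 = target)
    """
    f = w * x
    p = sigmoid(v * f)
    dloss_dlogit = p - domain          # derivative of BCE w.r.t. the logit
    grad_v = dloss_dlogit * f          # discriminator gradient, left unchanged
    grad_f = dloss_dlogit * v          # gradient arriving at the feature
    # The encoder receives the reversed gradient, so it moves to confuse
    # the discriminator while the discriminator moves to separate domains.
    grad_w = grad_reverse_backward(grad_f, lam) * x
    return w - lr * grad_w, v - lr * grad_v

# One step on a target-domain sample.
w, v = 1.0, 1.0
new_w, new_v = adversarial_step(w, v, x=1.0, domain=1.0, lam=1.0)
```

Here the discriminator weight grows, so it gets better at labeling the sample as target, while the encoder weight shrinks, which makes the feature less informative about the domain. Setting lam to 0.0 switches the adversarial signal off for the encoder.
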
Results

To the best of our knowledge, DGDA is the first MERC framework that jointly addresses domain shift and label noise. A theoretical analysis of the framework yields tighter generalization bounds, which supports its ability to adapt to new scenarios. Extensive experiments on two benchmark datasets, IEMOCAP and MELD, show that DGDA consistently outperforms strong baselines in cross-scenario conversations.
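
The bound derived for DGDA is not reproduced in this summary. For orientation only, the classical domain adaptation bound of Ben-David et al. (2010) shows the general shape such results take. The symbols below come from that classical result and are not taken from the DGDA paper.

```latex
% Classical bound for a hypothesis h in a hypothesis class \mathcal{H}:
\epsilon_T(h) \;\le\; \epsilon_S(h)
  + \tfrac{1}{2}\, d_{\mathcal{H}\Delta\mathcal{H}}(\mathcal{D}_S, \mathcal{D}_T)
  + \lambda,
\qquad
\lambda = \min_{h' \in \mathcal{H}} \bigl[ \epsilon_S(h') + \epsilon_T(h') \bigr]
```

Here \epsilon_S and \epsilon_T are the source and target errors, the middle term measures how far apart the two domains look to the hypothesis class, and \lambda is the error of the best joint hypothesis. Learning domain-invariant representations targets the middle term, which is the usual motivation for an adversarial discriminator.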

Conclusion

The findings highlight the importance of addressing domain adaptation and label noise together in multimodal emotion recognition. DGDA improves emotion recognition under cross-scenario conditions and offers a starting point for future work on adapting models to varied real-world scenarios. Our code is publicly available at https://github.com/Xudmm1239439/DGDA-Net.
