User-Aware Conditional Generative Total Correlation Learning for Multi-Modal Recommendation
Summary: arXiv:2604.03014v1
Announce Type: Cross
Abstract
Multi-modal recommendation (MMR) enriches item representations with item content, such as visual and textual descriptions, to improve upon interaction-only recommenders. The success of MMR hinges on aligning these content modalities with user preferences derived from interaction data. However, the dominant practice of disentangling modality-invariant, preference-driving signals from modality-specific, preference-irrelevant noise is flawed.
Key Issues Identified
- Assumption of Uniformity: Current methods assume a one-size-fits-all relevance of item content to user preferences, even though the content that matters differs from user to user.
- Neglect of Higher-Order Dependencies: Existing models optimize pairwise contrastive losses separately for cross-modal alignment, which overlooks the higher-order dependencies among multiple content modalities that jointly influence user choices.
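To make the second issue concrete, the following is a minimal sketch of what "pairwise contrastive losses optimized separately" can look like. InfoNCE is used here only as a common representative of contrastive losses; the abstract does not name the specific loss, and the embeddings, batch size, and temperature below are hypothetical placeholders.

```python
import numpy as np

def info_nce(a, b, temperature=0.1):
    """InfoNCE loss between two aligned batches (row i of `a` matches row i of `b`)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))  # matched pairs sit on the diagonal

# Hypothetical item embeddings for three views of the same 8 items.
rng = np.random.default_rng(0)
n_items, dim = 8, 16
visual = rng.normal(size=(n_items, dim))
textual = rng.normal(size=(n_items, dim))
interaction = rng.normal(size=(n_items, dim))

# Each term only ever sees two views at a time; no term looks at all three jointly.
pairwise_loss = (
    info_nce(visual, textual)
    + info_nce(visual, interaction)
    + info_nce(textual, interaction)
)
```

Because the objective is a sum of two-view terms, any structure that only shows up when all views are considered together has no term that rewards it.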
Introducing GTC
To address these limitations, we introduce GTC, a conditional Generative Total Correlation learning framework. GTC uses an interaction-guided diffusion model to perform user-aware content feature filtering, preserving only the content features that are relevant to each individual user.
Methodology Highlights
- User-Aware Content Feature Filtering: GTC filters content features per user, guided by interaction data, so that the retained features reflect that user's preferences rather than a global notion of relevance.
- Cross-Modal Dependency Capture: The framework optimizes a tractable lower bound of the total correlation of item representations across all modalities, so that dependencies among the modalities are captured jointly rather than pair by pair.
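For readers unfamiliar with the term, total correlation is the sum of the marginal entropies minus the joint entropy, TC(X1, ..., Xn) = H(X1) + ... + H(Xn) - H(X1, ..., Xn). The toy example below is not from the paper and does not reproduce its lower bound; it only illustrates, on three binary variables where the third is the XOR of the first two, that every pairwise mutual information can be zero while the total correlation is not.

```python
import itertools
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits of an empirical distribution given as counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def marginal(samples, idx):
    """Counts of the sub-tuples formed by the variables at positions `idx`."""
    return Counter(tuple(s[i] for i in idx) for s in samples)

def mutual_information(samples, i, j):
    return (entropy(marginal(samples, [i])) + entropy(marginal(samples, [j]))
            - entropy(marginal(samples, [i, j])))

def total_correlation(samples):
    n = len(samples[0])
    marginals = sum(entropy(marginal(samples, [i])) for i in range(n))
    return marginals - entropy(marginal(samples, range(n)))

# x and y are independent fair bits and z = x XOR y; the four outcomes are equally likely.
samples = [(x, y, x ^ y) for x, y in itertools.product([0, 1], repeat=2)]

pairwise = [mutual_information(samples, i, j)
            for i, j in itertools.combinations(range(3), 2)]
tc = total_correlation(samples)

print(pairwise)  # 0 bits for every pair
print(tc)        # 1 bit of dependence that is only visible jointly
```

In this toy setting, an objective built only from pairwise terms has nothing to optimize, whereas a total-correlation objective sees the full dependence.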
Experimental Results
Extensive experiments on standard MMR benchmarks show that GTC consistently outperforms state-of-the-art methods, with gains of up to 28.30% in NDCG@5.
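As a reminder of what the reported metric measures, here is a minimal sketch of NDCG@k with binary relevance, the form commonly used in top-k recommendation. The summary does not spell out the paper's evaluation protocol, so the function name, arguments, and binary-relevance assumption are illustrative only.

```python
import math

def ndcg_at_k(ranked_items, relevant_items, k=5):
    """NDCG@k with binary relevance: 1 if a ranked item is relevant, else 0."""
    relevant = set(relevant_items)
    # Discounted gain of the hits in the top-k list (rank 0 gets discount log2(2) = 1).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant
    )
    # Ideal DCG: all relevant items (up to k) placed at the top of the list.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical ranking for one user whose held-out items are "b" and "e".
score = ndcg_at_k(["a", "b", "c", "d", "e"], ["b", "e"], k=5)
```

The score is 1.0 when all held-out items appear at the top of the list and decreases as they are pushed further down or out of the top k.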
Ablation Studies
Ablation studies confirm the effectiveness of both conditional preference-driven feature filtering and total correlation optimization. These findings underscore GTC's capability to model user-conditional relationships in MMR tasks.
Access the Code
The code for GTC is available at https://github.com/jingdu-cs/GTC.
