User-Aware Generative Learning for Multi-Modal Recommendations


User-Aware Conditional Generative Total Correlation Learning for Multi-Modal Recommendation

Summary: arXiv:2604.03014v1

Announce Type: Cross

Abstract

Multi-modal recommendation (MMR) enriches item representations with item content, such as visual and textual descriptions, to improve upon interaction-only recommenders. The success of MMR hinges on aligning these content modalities with user preferences derived from interaction data. However, the dominant practice of disentangling modality-invariant, preference-driving signals from modality-specific, preference-irrelevant noise is flawed in two ways.

Key Issues Identified

  • Assumption of Uniformity: Current methodologies assume a one-size-fits-all relevance of item content to user preferences, disregarding the individual nature of user preferences.
  • Neglect of Higher-Order Dependencies: Existing models optimize pairwise contrastive losses separately for cross-modal alignment, systematically ignoring the complex relationships among multiple content modalities that jointly influence user choices.

Introducing GTC

To address these limitations, the paper introduces GTC, a conditional Generative Total Correlation learning framework. GTC uses an interaction-guided diffusion model to perform user-aware content feature filtering, preserving only the content features that are relevant to each individual user.

Methodology Highlights

  • User-Aware Content Feature Filtering: GTC filters content features conditioned on each user's interaction data, so the retained features reflect that user's preferences rather than a one-size-fits-all notion of relevance.
  • Cross-Modal Dependency Capture: The framework optimizes a tractable lower bound of the total correlation of item representations across all modalities, so dependencies are captured jointly across modalities rather than one pair at a time.
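
To see why a joint measure can matter, recall that the total correlation of variables X_1, ..., X_n is the sum of their individual entropies minus their joint entropy. The sketch below is a toy illustration only, not the paper's method: GTC optimizes a tractable lower bound over learned representations, whereas this code enumerates a small discrete distribution exactly. The XOR distribution and all function names are assumptions chosen for illustration. It shows three variables whose pairwise mutual information is zero for every pair, while their total correlation is one bit.

```python
import math
from collections import defaultdict
from itertools import combinations

def entropy(dist):
    """Shannon entropy in bits of a {outcome: probability} mapping."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginal(joint, idx):
    """Marginalize a joint distribution onto the variable positions in idx."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idx)] += p
    return dict(out)

def total_correlation(joint):
    """TC = sum_i H(X_i) - H(X_1, ..., X_n), in bits."""
    n = len(next(iter(joint)))
    singles = sum(entropy(marginal(joint, (i,))) for i in range(n))
    return singles - entropy(joint)

def mutual_information(joint, i, j):
    """I(X_i; X_j) = H(X_i) + H(X_j) - H(X_i, X_j), in bits."""
    return (entropy(marginal(joint, (i,)))
            + entropy(marginal(joint, (j,)))
            - entropy(marginal(joint, (i, j))))

# Hypothetical toy distribution: a and b are fair coins, c = a XOR b.
xor_joint = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}

# Mutual information for every pair of variables.
pairwise = {(i, j): mutual_information(xor_joint, i, j)
            for i, j in combinations(range(3), 2)}
# Total correlation over all three variables at once.
tc = total_correlation(xor_joint)

print("pairwise MI (bits):", pairwise)
print("total correlation (bits):", tc)
```

In this toy case, an objective built only from pairwise terms would see no dependency at all, while the joint measure does. That is the kind of higher-order structure a total correlation objective is meant to pick up.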

Experimental Results

Experiments on standard MMR benchmarks show that GTC consistently outperforms state-of-the-art baselines, with gains of up to 28.30% in NDCG@5.
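
For readers unfamiliar with the metric, here is a minimal sketch of NDCG@k under the common binary-relevance formulation, where a recommended item counts as relevant if the user interacted with it in the held-out set. The function names and example item IDs are hypothetical, and the paper's exact evaluation protocol may differ.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k positions (ranks start at 1)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:k], start=1))

def ndcg_at_k(ranked_items, relevant_items, k=5):
    """NDCG@k with binary relevance; assumes ranked_items has no duplicates."""
    gains = [1.0 if item in relevant_items else 0.0 for item in ranked_items]
    # Ideal ranking: every relevant item (up to k of them) placed at the top.
    ideal = [1.0] * min(len(relevant_items), k)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical example: two held-out items, recommended at ranks 2 and 5.
ranked = ["i3", "i7", "i1", "i9", "i4"]
relevant = {"i7", "i4"}
score = ndcg_at_k(ranked, relevant, k=5)
print(f"NDCG@5 = {score:.3f}")
```

The logarithmic discount means a relevant item at rank 1 contributes more than the same item at rank 5, so the metric rewards placing the right items near the top of the list.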

Ablation Studies

Ablation studies confirm the effectiveness of both components: conditional preference-driven feature filtering and total correlation optimization. These findings support GTC's ability to model user-conditional relationships in MMR tasks.

Access the Code

The code for GTC is available at https://github.com/jingdu-cs/GTC.


Lazarus Omolua
