Parametric Memory Head Boosts Continual Generative Retrieval

A Parametric Memory Head for Continual Generative Retrieval

Generative information retrieval (GenIR) models struggle to keep up with document collections that change over time. The paper “A Parametric Memory Head for Continual Generative Retrieval,” published on arXiv (arXiv:2604.23388v1), targets this problem: when a GenIR model is adapted to newly added documents, it faces a stability-plasticity trade-off between learning the new documents and retaining the old ones.

GenIR collapses the retrieval pipeline into a single neural model that decodes document identifiers (docids) directly from user queries. This single-model design makes updates hard in dynamic environments. A modular system can simply refresh its index, whereas a GenIR model stores its knowledge of the collection in its weights, so standard adaptation methods such as full fine-tuning and parameter-efficient fine-tuning expose it to catastrophic forgetting.
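
To make "decoding docids directly" concrete, here is a minimal sketch of greedy decoding restricted by a prefix trie over the collection's docids, so the model can only emit identifiers that exist. The tokenization, the `END` marker, and `toy_score` (a stand-in for the decoder's logits) are assumptions made for illustration, not details from the paper.

```python
from typing import Callable, Dict, List, Sequence

END = "<eos>"  # hypothetical end-of-docid marker, not from the paper


def build_trie(docids: Sequence[Sequence[str]]) -> dict:
    """Build a nested-dict prefix trie over tokenized docids."""
    root: dict = {}
    for tokens in docids:
        node = root
        for tok in list(tokens) + [END]:
            node = node.setdefault(tok, {})
    return root


def constrained_greedy_decode(
    score_fn: Callable[[List[str], List[str]], Dict[str, float]],
    trie: dict,
) -> List[str]:
    """Greedy decoding that only considers tokens the trie allows next."""
    prefix: List[str] = []
    node = trie
    while True:
        allowed = list(node)
        scores = score_fn(prefix, allowed)
        best = max(allowed, key=lambda tok: scores[tok])
        if best == END:
            return prefix
        prefix.append(best)
        node = node[best]


def toy_score(prefix: List[str], allowed: List[str]) -> Dict[str, float]:
    # Stand-in for decoder logits: prefers larger digits, and END least.
    return {tok: (-1.0 if tok == END else float(tok)) for tok in allowed}


trie = build_trie([["3", "1", "4"], ["3", "1", "5"], ["2", "7"]])
decoded = constrained_greedy_decode(toy_score, trie)
print(decoded)
```

With these toy scores the decoder returns the docid `["3", "1", "5"]`. Adding a document means adding a path to the trie, but the model's weights must still learn to score that path, which is where the update problem comes from.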

The Stability-Plasticity Trade-off

The authors highlight a central problem in adapting GenIR models. Sequential adaptation improves retrieval effectiveness on newly added documents, but it often causes a marked drop in performance on earlier documents. This is the stability-plasticity trade-off: as the model absorbs new information, it forgets previously learned material, which lowers retrieval accuracy across the collection as a whole.
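
One way to put a number on this trade-off is the average-forgetting measure commonly used in continual learning: for each earlier slice, compare its best score in any previous session with its score after the final session. The sketch below assumes that convention; the paper may report different metrics, and the scores in `perf` are made up for illustration.

```python
from typing import List


def average_forgetting(perf: List[List[float]]) -> float:
    """perf[i][j] is the score on slice j after adapting on session i (j <= i)."""
    last = len(perf) - 1
    if last < 1:
        return 0.0  # nothing can be forgotten after a single session
    drops = []
    for j in range(last):
        # Best score slice j ever reached before the final session.
        best_earlier = max(perf[i][j] for i in range(j, last))
        drops.append(best_earlier - perf[last][j])
    return sum(drops) / len(drops)


# Hypothetical scores (e.g. a hit rate), one row per adaptation session.
perf = [
    [0.60],
    [0.45, 0.58],
    [0.30, 0.40, 0.57],
]
forgetting = average_forgetting(perf)
newest_slice_score = perf[-1][-1]
print(round(forgetting, 2), newest_slice_score)
```

Here the newest slice scores well while the earlier slices have lost ground, which is the pattern the authors describe: high plasticity, low stability.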

Introducing Post-Adaptation Memory Tuning (PAMT)

To ease this trade-off, the authors propose post-adaptation memory tuning (PAMT). PAMT attaches a modular parametric memory head (PMH) to an already adapted model and stabilizes it without modifying the backbone. Its main components are:

Frozen Backbone: PAMT freezes the backbone, so the core parameters stay unchanged and behavior remains stable across document slices.
Product-Key Memory: The PMH is a product-key memory with a fixed addressing mechanism, which lets the decoder query memory efficiently during decoding.
Sparse Querying: During prefix-trie constrained decoding, decoder hidden states sparsely query the PMH to produce residual corrections in hidden space. A frozen output embedding matrix maps these corrections to score adjustments, and only trie-valid tokens are scored.
Controlled Memory Updates: To limit cross-slice interference, PAMT updates only a fixed budget of memory values per session. It selects them using decoding-time access statistics, favoring entries that the current session activates often and that prior sessions rarely used, so that values earlier slices depend on are left largely untouched.
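
The querying step can be sketched as follows, under several assumptions: the product-key lookup follows the standard construction (split the hidden state in two, score each half against its own sub-key table, combine the best candidates), the correction is a softmax-weighted sum of the selected memory values, and the score adjustment is the dot product of each valid token's output embedding with that correction. All function names, shapes, and numbers are hypothetical; the paper's exact formulation may differ.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def top_k(scores: Sequence[float], k: int) -> List[int]:
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def pkm_lookup(h: Vector, keys1: List[Vector], keys2: List[Vector],
               k: int) -> List[Tuple[int, float]]:
    """Return (memory slot, weight) pairs for the k best product keys."""
    half = len(h) // 2
    s1 = [dot(h[:half], key) for key in keys1]
    s2 = [dot(h[half:], key) for key in keys2]
    # Only the k best sub-keys of each half can reach the overall top-k.
    cands = [(i * len(keys2) + j, s1[i] + s2[j])
             for i in top_k(s1, k) for j in top_k(s2, k)]
    best = sorted(cands, key=lambda c: c[1], reverse=True)[:k]
    peak = max(score for _, score in best)
    exps = [math.exp(score - peak) for _, score in best]
    total = sum(exps)
    return [(slot, e / total) for (slot, _), e in zip(best, exps)]


def corrected_scores(h: Vector, base: Dict[str, float], valid: List[str],
                     out_emb: Dict[str, Vector], values: List[Vector],
                     keys1: List[Vector], keys2: List[Vector],
                     k: int = 1) -> Tuple[Dict[str, float], List[int]]:
    """Add a memory-based adjustment to the scores of trie-valid tokens only."""
    hits = pkm_lookup(h, keys1, keys2, k)
    # Residual correction in hidden space: weighted sum of memory values.
    delta = [sum(w * values[slot][d] for slot, w in hits) for d in range(len(h))]
    scores = {tok: base[tok] + dot(out_emb[tok], delta) for tok in valid}
    return scores, [slot for slot, _ in hits]


# Toy setup: 2 x 2 sub-keys address 4 memory slots; the backbone stays frozen.
keys1 = [[1.0, 0.0], [0.0, 1.0]]
keys2 = [[1.0, 0.0], [0.0, 1.0]]
values = [[0.0] * 4, [0.5, 0.0, 0.0, 0.0], [0.0] * 4, [0.0] * 4]
out_emb = {"a": [1.0, 0.0, 0.0, 0.0], "b": [-1.0, 0.0, 0.0, 0.0],
           "c": [1.0, 1.0, 1.0, 1.0]}
base = {"a": 0.0, "b": 0.2, "c": 5.0}
h = [2.0, 0.0, 0.0, 3.0]

scores, accessed = corrected_scores(h, base, ["a", "b"], out_emb, values,
                                    keys1, keys2)
print(scores, accessed)
```

In this toy case the backbone alone slightly prefers "b", the memory correction flips the ranking toward "a", and "c" is never scored because the trie does not allow it. The `accessed` list is the kind of decoding-time access record that a budgeted update rule could later draw on.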

Experimental Results

The authors evaluate PAMT on MS MARCO and Natural Questions. According to the paper, PAMT substantially improves retention on earlier document slices while maintaining retrieval performance on newly added documents, and it does so while modifying only a sparse subset of memory values in each session.
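
How a session might end up touching only a sparse subset of memory values can be sketched with a simple selection rule. The scoring formula below (current-session access count minus a penalty on prior-session access count) is one plausible reading of "selection based on access statistics" and is an assumption, as are the function names, the `prior_penalty` parameter, and the counts.

```python
from typing import Dict, List


def select_update_slots(current_hits: Dict[int, int], prior_hits: Dict[int, int],
                        budget: int, prior_penalty: float = 1.0) -> List[int]:
    """Pick up to `budget` slots with high current and low prior access."""
    def priority(slot: int) -> float:
        return current_hits[slot] - prior_penalty * prior_hits.get(slot, 0)

    # Only slots accessed in the current session are candidates;
    # ties are broken by slot index to keep the choice deterministic.
    ranked = sorted(current_hits, key=lambda s: (-priority(s), s))
    return ranked[:budget]


def apply_masked_update(values: List[List[float]], grads: List[List[float]],
                        chosen: List[int], lr: float) -> None:
    """Gradient step on the chosen memory values; all others stay frozen."""
    for slot in chosen:
        values[slot] = [v - lr * g for v, g in zip(values[slot], grads[slot])]


# Hypothetical access counts gathered while decoding.
current_hits = {0: 50, 1: 40, 2: 5, 3: 30}
prior_hits = {0: 60, 1: 2, 3: 0}
chosen = select_update_slots(current_hits, prior_hits, budget=2)

values = [[1.0, 1.0] for _ in range(4)]
grads = [[1.0, -1.0] for _ in range(4)]
apply_masked_update(values, grads, chosen, lr=0.5)
print(chosen, values)
```

Slot 0 is the most active slot in the current session but is skipped because earlier sessions relied on it heavily; slots 1 and 3 are updated and the rest are left as they were.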

In conclusion, a parametric memory head offers a promising route to continual generative retrieval. By stabilizing an adapted GenIR model while keeping its backbone frozen, PAMT preserves earlier knowledge and maintains performance on new documents, which points toward more robust retrieval systems for dynamic collections.

RichlyAI Blog AI Guide, Tutorials, Industrial Insights, & more!
