SCURank: Ranking Multiple Candidate Summaries with Summary Content Units for Enhanced Summarization
Source: arXiv:2604.19185v1
Type: Cross
Abstract
Small language models (SLMs), such as BART, have been shown to achieve summarization performance comparable to large language models (LLMs) through a process known as distillation. However, current LLM-based strategies for ranking summary candidates are unstable. Furthermore, traditional metrics such as ROUGE have proven inadequate for accurately ranking high-quality summaries. To address these challenges, we present SCURank, a novel framework that improves summarization by employing Summary Content Units (SCUs).
The SCURank Framework
SCURank diverges from conventional evaluation methods that depend on unstable comparisons or superficial overlaps. Instead, it emphasizes the richness and semantic importance of the information contained in summaries. By focusing on the content’s significance, SCURank offers a more reliable and informative assessment of summary quality.
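To make the idea of ranking by information content concrete, here is a minimal sketch that orders candidate summaries by how many SCUs each one covers. It is an illustration under stated assumptions, not the authors' implementation: the source does not describe how SCUs are extracted or how coverage is scored, so the SCUs are assumed to be given as short strings, and the token-overlap check in the hypothetical helper `covers` is a toy stand-in for whatever semantic matching the framework actually uses. The names `tokens`, `covers`, `rank_by_scu_coverage`, and the `threshold` parameter are all invented for this example.

```python
import re


def tokens(text):
    """Return the set of lowercased word tokens in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def covers(summary, scu, threshold=0.8):
    """Toy coverage check (assumption): an SCU counts as covered when at
    least `threshold` of its distinct tokens appear in the summary."""
    scu_tokens = tokens(scu)
    if not scu_tokens:
        return False
    overlap = len(scu_tokens & tokens(summary)) / len(scu_tokens)
    return overlap >= threshold


def rank_by_scu_coverage(candidates, scus):
    """Return (candidate, score) pairs sorted by covered-SCU count, highest first."""
    scored = [(c, sum(covers(c, s) for s in scus)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical example data, not taken from the paper.
example_scus = ["the council approved the budget", "the vote was unanimous"]
example_candidates = [
    "The council met on Tuesday.",
    "The council approved the budget, and the vote was unanimous.",
    "The council approved the budget.",
]
for candidate, score in rank_by_scu_coverage(example_candidates, example_scus):
    print(score, candidate)
```

The score here depends only on which content units a candidate conveys, so two candidates with similar wording can still rank differently when one of them carries more information.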
Research Findings
Our investigation into the efficacy of SCURank involved distilling summaries from a diverse set of LLMs. The experimental results indicate that SCURank consistently outperforms traditional metrics and LLM-based ranking methods across multiple evaluation measures and datasets. The key findings include:
- Improved Ranking: SCURank effectively ranks summaries based on deeply informative content rather than mere surface-level attributes.
- Enhanced Abstractiveness: By incorporating diverse LLM summaries, SCURank enhances the abstractiveness of the resulting model.
- Overall Performance: The distilled model’s performance is significantly improved, validating the effectiveness of an information-centric approach in multi-LLM distillation.
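The source does not state how abstractiveness was measured. One common proxy in the summarization literature is the fraction of summary n-grams that do not appear in the source document, and the sketch below computes it. The function names `ngrams` and `novel_ngram_ratio` are hypothetical, and the choice of bigrams as the default is an assumption made for illustration.

```python
import re


def ngrams(text, n):
    """Return the set of word n-grams (as tuples) in lowercased text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def novel_ngram_ratio(source, summary, n=2):
    """Fraction of distinct summary n-grams absent from the source.
    0.0 means fully copied at the n-gram level; 1.0 means fully novel."""
    summary_ngrams = ngrams(summary, n)
    if not summary_ngrams:
        return 0.0
    novel = summary_ngrams - ngrams(source, n)
    return len(novel) / len(summary_ngrams)
```

A higher ratio suggests that a summary rephrases rather than copies, though the ratio says nothing about whether the rephrased content is faithful to the source.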
Conclusion
The introduction of SCURank marks a significant advancement in summarization technology, addressing the shortcomings of existing LLM-based ranking strategies through a focus on content richness and semantic relevance. As the demand for high-quality summarization continues to grow, frameworks like SCURank will play a crucial role in ensuring that generated summaries meet the increasing expectations for clarity, conciseness, and informativeness.
Accessing SCURank
For practitioners and researchers interested in exploring SCURank further, the code is available in the SCURank GitHub repository.
