Attention Flows: Tracing LLM Conceptual Engagement via Story Summaries
Summary (arXiv:2604.06416v1, cross-listed)
Recent large language models (LLMs) support much longer contexts, but emerging evidence suggests that their ability to integrate and comprehend information from long-form text remains limited. This article examines one specific comprehension task: generating summaries of novels. Summaries written by human authors offer a window into which elements of a story their writers consider narratively significant. By contrasting these human-written summaries with LLM-generated ones, researchers can test whether the models reproduce human patterns of conceptual engagement.
Research Objective
The primary objective is to compare how humans and LLMs engage conceptually with narratives when summarizing them. To that end, the study aligns sentences from 150 human-written novel summaries to the specific chapters they reference. The alignment itself proves difficult, which underscores how demanding summarization is as a cognitive task.
Methodology
The study evaluates summaries written by human authors and by nine state-of-the-art LLMs. The steps are:
- Data Collection: The research utilizes a dataset of 150 novel summaries written by human authors.
- Sentence Alignment: Researchers align sentences from the human summaries with their corresponding chapters, providing a framework for comparison.
- Model Generation: Nine state-of-the-art LLMs generate summaries for each of the 150 reference texts.
- Comparative Analysis: The generated summaries are compared to the human-authored versions to identify stylistic differences and focus distribution throughout the narratives.
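The sentence alignment step can be illustrated with a small sketch. The summary above does not say how the researchers performed the alignment, so the bag-of-words cosine similarity used here is an assumption chosen for simplicity, and the function names (`tokenize`, `cosine`, `align_sentences`) and the three-chapter toy story are hypothetical. A realistic pipeline would likely need stronger semantic matching, which is consistent with the source's remark that the task is difficult.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a deliberately crude stand-in for real preprocessing."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def align_sentences(summary_sentences, chapters):
    """Return, for each summary sentence, the index of its most similar chapter.

    Assumption: lexical overlap is a usable proxy for "this sentence
    references that chapter". This is an illustration, not the study's method.
    """
    chapter_bags = [Counter(tokenize(chapter)) for chapter in chapters]
    alignment = []
    for sentence in summary_sentences:
        bag = Counter(tokenize(sentence))
        scores = [cosine(bag, chapter_bag) for chapter_bag in chapter_bags]
        alignment.append(max(range(len(chapters)), key=scores.__getitem__))
    return alignment


# Toy data invented for this example.
chapters = [
    "Anna arrives in the village and meets the old clockmaker.",
    "A storm floods the river and the bridge collapses.",
    "Anna repairs the tower clock and leaves the village at dawn.",
]
summary = [
    "Anna meets a clockmaker after arriving.",
    "The bridge collapses in a storm.",
    "She repairs the clock and leaves.",
]
alignment = align_sentences(summary, chapters)
print(alignment)
```

Each entry of `alignment` is a chapter index, so the output gives a per-sentence record of which part of the book a summary draws on. That record is the raw material for any later analysis of where summaries focus.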
Findings
The study finds significant differences between human-written and LLM-generated summaries. Key observations include:
- Stylistic Variations: LLM-generated summaries differ notably in style from those written by humans.
- Focus Distribution: LLMs tend to place greater emphasis on the endings of texts, in contrast to the more balanced focus displayed by human authors across the narrative.
- Narrative Engagement: Comparing human narrative engagement with model attention mechanisms offers possible explanations for the weaker narrative comprehension observed in LLMs.
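To make the focus distribution finding concrete, the following minimal sketch turns a list of aligned chapter indices into the share of summary sentences that fall in each quarter of a book. The function name `focus_distribution`, the choice of four bins, and both example alignments are assumptions made for illustration. The numbers are invented to show the shape of the comparison and are not results from the study.

```python
from collections import Counter


def focus_distribution(chapter_indices, num_chapters, num_bins=4):
    """Share of summary sentences that fall in each equal-width slice of the book.

    chapter_indices: 0-based chapter index for every summary sentence.
    num_chapters: total number of chapters in the book.
    """
    if not chapter_indices:
        raise ValueError("need at least one aligned sentence")
    counts = Counter()
    for idx in chapter_indices:
        # Integer arithmetic maps chapter idx to a bin in 0..num_bins-1.
        counts[idx * num_bins // num_chapters] += 1
    total = len(chapter_indices)
    return [counts[b] / total for b in range(num_bins)]


# Invented alignments for a hypothetical 12-chapter novel.
human_alignment = [0, 1, 3, 5, 6, 8, 9, 11]   # spread across the book
llm_alignment = [0, 4, 7, 9, 9, 10, 11, 11]   # clustered near the end

human_focus = focus_distribution(human_alignment, num_chapters=12)
llm_focus = focus_distribution(llm_alignment, num_chapters=12)
print("human:", human_focus)
print("llm:  ", llm_focus)
```

In this toy case the human summary spreads evenly over the four quarters while the hypothetical LLM summary puts most of its sentences in the last quarter, which is the kind of end-weighted pattern the finding describes. Comparing the final-bin shares across many books would be one simple way to quantify the effect.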
Conclusion and Future Directions
The study concludes that, despite impressive progress in natural language processing, LLMs summarize complex narratives in ways that differ significantly from human engagement patterns. This discrepancy raises questions about the mechanisms that drive LLM attention and comprehension, and the research highlights potential targets for future work on improving narrative comprehension in LLMs.
To support further research, the dataset used in this study has been publicly released as a resource for scholars and developers studying narrative engagement in LLMs.
