Pragmatics Meets Culture: Culturally-adapted Artwork Description Generation and Evaluation
arXiv:2604.02557v1
Abstract: Language models are known to exhibit various forms of cultural bias in decision-making tasks, yet much less is known about their degree of cultural familiarity in open-ended text generation tasks. In this paper, we introduce the task of culturally-adapted art description generation, where models describe artworks for audiences from different cultural groups who vary in their familiarity with the cultural symbols and narratives embedded in the artwork. To evaluate cultural competence in this pragmatic generation task, we propose a framework based on culturally grounded question answering. We find that base models are only marginally adequate for this task, but, through a pragmatic speaker model, we can improve simulated listener comprehension by up to 8.2%. A human study further confirms that the model with higher pragmatic competence is rated as more helpful for comprehension by 8.0%.
Introduction
This study sits at the intersection of pragmatics and culture in natural language generation. It examines how well language models can describe artworks for audiences from different cultural backgrounds.
The Challenge of Cultural Bias
Language models are known to exhibit various forms of cultural bias in decision-making tasks, but much less is known about how culturally familiar they are in open-ended text generation. In a task like art description, gaps in cultural familiarity can lead to misunderstandings and misinterpretations. The challenge lies in building a model that can adapt its output to suit different cultural contexts.
Culturally-adapted Artwork Description Generation
In culturally-adapted artwork description generation, a model describes an artwork for audiences from different cultural groups, who vary in their familiarity with the cultural symbols and narratives embedded in it. The model must identify and articulate those symbols and narratives so that the description is accessible and meaningful to the group it is written for.
Evaluating Cultural Competence
To assess the cultural competence of language models in this context, the researchers introduced a novel evaluation framework based on culturally grounded question answering. This approach not only measures the accuracy of the generated descriptions but also evaluates the effectiveness of communication between the speaker (the model) and the listener (the audience).
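As a rough illustration of QA-based evaluation, the Python sketch below scores a description by the fraction of culture-specific questions a listener answers correctly after reading it. It is not the paper's implementation: the `Question` type, the `keyword_listener` stand-in, and the example questions and descriptions are all hypothetical, and a real setup would presumably replace the toy listener with a language model.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Question:
    """A multiple-choice question about an artwork (hypothetical schema)."""
    text: str
    options: Tuple[str, ...]
    answer: str


# Assumed listener interface: (description, question) -> chosen option.
Listener = Callable[[str, Question], str]


def keyword_listener(description: str, question: Question) -> str:
    """Toy stand-in for a simulated listener.

    Picks the first option that is mentioned in the description and
    falls back to the first option when none is mentioned.
    """
    text = description.lower()
    for option in question.options:
        if option.lower() in text:
            return option
    return question.options[0]


def comprehension_score(
    description: str, questions: Sequence[Question], listener: Listener
) -> float:
    """Fraction of questions the listener answers correctly."""
    if not questions:
        raise ValueError("at least one question is required")
    correct = sum(listener(description, q) == q.answer for q in questions)
    return correct / len(questions)


# Invented example data, for illustration only.
QUESTIONS = [
    Question("What does the crane symbolize?",
             ("wealth", "longevity", "mourning"), "longevity"),
    Question("Which season is depicted?",
             ("summer", "winter", "autumn"), "winter"),
]
PLAIN = "A white crane stands beside a pine tree."
ADAPTED = ("A white crane, a symbol of longevity, stands beside "
           "a snow-covered pine in winter.")

plain_score = comprehension_score(PLAIN, QUESTIONS, keyword_listener)
adapted_score = comprehension_score(ADAPTED, QUESTIONS, keyword_listener)
```

With this toy listener, the plain description leaves both questions unanswerable, while the adapted one supplies the cultural context needed to answer them. The design choice worth noting is that the score is computed from the listener's answers, not from the description text directly.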
Findings and Improvements
The study found that base models are only marginally adequate for this task. With a pragmatic speaker model, however, the researchers improved simulated listener comprehension by up to 8.2%. This gain underscores the importance of pragmatic competence in generating culturally sensitive content.
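The summary does not say how the pragmatic speaker model works. One generic way to make a speaker listener-aware is to generate several candidate descriptions and keep the one a simulated listener understands best, optionally trading comprehension against length. The Python sketch below shows only that reranking idea. The function `pragmatic_select`, the `length_penalty` parameter, the toy scoring function, and the candidate texts are all assumptions made for illustration, not the authors' method.

```python
from typing import Callable, Sequence


def pragmatic_select(
    candidates: Sequence[str],
    listener_score: Callable[[str], float],
    length_penalty: float = 0.0,
) -> str:
    """Return the candidate with the highest listener-based utility.

    Utility = listener comprehension minus a per-word cost, so longer
    descriptions must earn their length through better comprehension.
    """
    if not candidates:
        raise ValueError("at least one candidate is required")

    def utility(description: str) -> float:
        cost = length_penalty * len(description.split())
        return listener_score(description) - cost

    return max(candidates, key=utility)


# Hypothetical facts a listener from the target audience needs to be told.
KEY_FACTS = ("longevity", "winter")


def toy_listener_score(description: str) -> float:
    """Toy comprehension proxy: share of key facts the text mentions."""
    text = description.lower()
    return sum(fact in text for fact in KEY_FACTS) / len(KEY_FACTS)


# Invented candidate descriptions, for illustration only.
CANDIDATES = [
    "A white crane stands beside a pine tree.",
    "A white crane, a symbol of longevity, stands beside a pine tree.",
    "A white crane, a symbol of longevity, stands beside "
    "a snow-covered pine in winter.",
]

best = pragmatic_select(CANDIDATES, toy_listener_score, length_penalty=0.01)
```

With a small length penalty the most informative candidate wins. Raising the penalty shifts the choice toward shorter descriptions, which is one way to model an audience that already knows the cultural background.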
Human Evaluation
A subsequent human study corroborated this finding: the model with higher pragmatic competence was rated as more helpful for comprehension by 8.0%. This highlights the value of integrating cultural understanding into AI systems, particularly in creative domains such as art.
Conclusion
The intersection of pragmatics and cultural understanding in AI-generated content is a vital area of research. By focusing on culturally-adapted art description generation, this study contributes to the development of more effective and inclusive language models. As AI continues to play a role in creative fields, addressing cultural bias and enhancing cultural competence will be essential for fostering meaningful interactions between technology and diverse human audiences.
