Mitigating Extrinsic Gender Bias for Bangla Classification Tasks
arXiv:2411.10636v2
Abstract: In this study, we investigate extrinsic gender bias in Bangla pretrained language models, a largely underexplored area in low-resource languages. To assess this bias, we construct four manually annotated, task-specific benchmark datasets for sentiment analysis, toxicity detection, hate speech detection, and sarcasm detection.
Each dataset is augmented using nuanced gender perturbations, where we systematically swap gendered names and terms while preserving semantic content, enabling minimal-pair evaluation of gender-driven prediction shifts. We then propose RandSymKL, a randomized debiasing strategy integrated with symmetric KL divergence and cross-entropy loss to mitigate the bias across task-specific pretrained models.
Introduction
Extrinsic gender bias in natural language processing (NLP) has gained increasing attention, particularly in high-resource languages. However, the challenge remains significantly underexplored in low-resource languages like Bangla. This study aims to bridge that gap by focusing on Bangla pretrained language models.
Methodology
We constructed four manually annotated benchmark datasets, one for each classification task:
- Sentiment Analysis
- Toxicity Detection
- Hate Speech Detection
- Sarcasm Detection
Each dataset was augmented with gender perturbations: gendered names and terms were swapped systematically while the semantic content was preserved. The resulting minimal pairs differ only in gender, so any shift in a model's prediction between the two versions of a sentence can be attributed to gender.
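The perturbation step can be sketched as a token-level swap. This is a minimal illustration, not the authors' pipeline: the three-entry lexicon, the punctuation handling, and the function names `swap_gender` and `make_minimal_pair` are all assumptions made for the example, and the paper's actual name and term lists are not reproduced here.

```python
# Hypothetical, tiny swap lexicon (assumption for illustration only).
MALE_TO_FEMALE = {
    "ভাই": "বোন",    # brother -> sister
    "বাবা": "মা",    # father -> mother
    "ছেলে": "মেয়ে",  # boy -> girl
}

# Make the mapping bidirectional so the swap works in both directions.
SWAP = {**MALE_TO_FEMALE, **{f: m for m, f in MALE_TO_FEMALE.items()}}

# Trailing punctuation to detach before lookup (includes the Bangla danda).
PUNCT = "।,!?.;:"


def swap_gender(sentence: str) -> str:
    """Swap every gendered token found in SWAP; leave all other tokens unchanged."""
    out = []
    for token in sentence.split():
        core = token.rstrip(PUNCT)   # word without trailing punctuation
        tail = token[len(core):]     # the punctuation that was stripped
        out.append(SWAP.get(core, core) + tail)
    return " ".join(out)


def make_minimal_pair(sentence: str) -> tuple:
    """Return (original, perturbed), which differ only in gendered tokens."""
    return sentence, swap_gender(sentence)
```

A whitespace tokenizer is used only to keep the sketch short. Bangla inflection means a realistic implementation would need morphological handling or a curated list of inflected forms.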
Proposed Solution: RandSymKL
To mitigate the bias these evaluations reveal, we propose RandSymKL, a randomized debiasing strategy for task-specific pretrained models. It combines:
- Randomized perturbation techniques
- Symmetric KL divergence
- Cross-entropy loss
By integrating these components, RandSymKL offers a unified method for reducing extrinsic gender bias in classification tasks. The design goal is to mitigate bias without compromising model accuracy.
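One way these three components could fit together is sketched below in plain Python. It is an illustration under stated assumptions, not the paper's exact objective: the randomization is assumed to mean sampling one gender-perturbed variant per example, the symmetric KL term is assumed to be the average of the two KL directions, and the weight `lam` and all function names are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def symmetric_kl(p, q):
    """Average of KL(p || q) and KL(q || p); zero when p == q (assumed form)."""
    return 0.5 * (kl(p, q) + kl(q, p))


def cross_entropy(p, label, eps=1e-12):
    """Negative log-probability assigned to the gold label."""
    return -math.log(p[label] + eps)


def rand_sym_kl_loss(logits_orig, logits_variants, label, lam=1.0, rng=random):
    """Cross-entropy on both views plus a consistency penalty (hypothetical form).

    logits_variants: logits for each gender-perturbed version of the input.
    One variant is sampled at random (assumed meaning of "randomized").
    """
    logits_pert = rng.choice(logits_variants)
    p = softmax(logits_orig)
    q = softmax(logits_pert)
    ce = 0.5 * (cross_entropy(p, label) + cross_entropy(q, label))
    return ce + lam * symmetric_kl(p, q)
```

When the model predicts the same distribution for a sentence and its perturbed counterpart, the symmetric KL term vanishes and the loss reduces to ordinary cross-entropy, which is how a penalty of this kind can discourage gender-driven prediction shifts while leaving the task objective in place.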
Evaluation and Results
We compared RandSymKL against existing bias mitigation strategies. The results show that:
- RandSymKL significantly reduces extrinsic gender bias.
- Performance metrics remain competitive with those of the baseline models.
Together, these results indicate that the proposed strategy reduces bias while preserving model accuracy on the classification tasks.
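One simple way to quantify gender-driven prediction shifts on minimal pairs is the fraction of pairs whose predicted label changes after the swap. The sketch below assumes that metric purely for illustration; the name `flip_rate` is hypothetical, and the source does not state which bias measure the authors report.

```python
def flip_rate(preds_orig, preds_pert):
    """Fraction of minimal pairs whose predicted label changes after the gender swap.

    preds_orig[i] and preds_pert[i] are the model's labels for the original
    and gender-perturbed versions of the same sentence.
    """
    if len(preds_orig) != len(preds_pert):
        raise ValueError("prediction lists must have the same length")
    if not preds_orig:
        return 0.0
    flips = sum(a != b for a, b in zip(preds_orig, preds_pert))
    return flips / len(preds_orig)
```

A value of 0.0 means the swap never changed a prediction. Comparing this value before and after debiasing, alongside task accuracy, shows both effects described above.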
Conclusion and Future Work
The findings of this study contribute to the ongoing discourse on bias in NLP, particularly in low-resource languages. By making our datasets and implementation publicly available at https://github.com/sajib-kumar/Mitigating-Bangla-Extrinsic-Gender-Bias, we aim to encourage further research in this critical area.
