Human-Machine Co-Boosted Bug Report Identification with Mutualistic Neural Active Learning
arXiv:2604.18862v1 (cross-list)
Abstract: Bug reports, encompassing a wide range of bug types, are crucial for maintaining software quality. However, their increasing complexity and volume make purely manual identification and assignment to the appropriate teams a significant challenge, because handling every report is time-consuming and resource-intensive.
In this paper, we introduce a cross-project framework, dubbed Mutualistic Neural Active Learning (MNAL), designed to identify bug reports from GitHub repositories automatically and more effectively through human-machine collaboration. MNAL combines a neural language model, which learns from and generalizes across reports from different projects, with active learning to form neural active learning.
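To make the active learning step concrete, the following is a minimal sketch of how a query round might choose which unlabeled reports to send to developers. It assumes entropy-based uncertainty sampling, which is a common query strategy but is not stated in the abstract. The function names `entropy` and `select_queries` and the probabilities in `pool_probs` are hypothetical.

```python
import math
from typing import Sequence


def entropy(p_bug: float) -> float:
    """Binary entropy (in bits) of a predicted probability that a report is a bug."""
    if p_bug <= 0.0 or p_bug >= 1.0:
        return 0.0  # the model is fully certain, so there is no uncertainty
    return -(p_bug * math.log2(p_bug) + (1 - p_bug) * math.log2(1 - p_bug))


def select_queries(probs: Sequence[float], k: int) -> list[int]:
    """Return the indices of the k reports the model is least certain about."""
    ranked = sorted(range(len(probs)), key=lambda i: entropy(probs[i]), reverse=True)
    return ranked[:k]


# Hypothetical model outputs: P(report is a bug) for five unlabeled reports.
pool_probs = [0.98, 0.52, 0.10, 0.45, 0.80]

# Assumption: the two most uncertain reports go to a developer for labeling.
queries = select_queries(pool_probs, k=2)
print(queries)
```

In this sketch the reports with probabilities closest to 0.5 are queried first, since a human label there is expected to be the most informative for the model.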
Key Features of MNAL
- Mutualistic Collaboration: A distinctive feature of MNAL is the purposely designed mutualistic relationship between the machine learner (the neural language model) and the human labelers (developers), in which each side benefits the other as the learned knowledge is enriched.
- Efficient Report Identification: The machine learner updates the model using the most informative human-labeled reports and their corresponding pseudo-labeled ones, while the reports that developers are asked to label are more readable and identifiable.
- Model-Agnostic Approach: MNAL is capable of improving model performance with various underlying neural language models, making it adaptable to different scenarios.
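The features above can be illustrated with a small sketch of a model-agnostic update step that trains on human-labeled reports together with pseudo-labeled ones. This is not the paper's algorithm. It substitutes confidence-thresholded self-training for whatever rule MNAL uses to derive pseudo-labels, and the `ReportClassifier` interface, the toy `KeywordModel`, the `threshold` value, and the example reports are all hypothetical.

```python
from typing import Protocol, Sequence


class ReportClassifier(Protocol):
    """Hypothetical minimal interface; any underlying model could implement it."""

    def predict_proba(self, texts: Sequence[str]) -> list[float]: ...
    def fit(self, texts: Sequence[str], labels: Sequence[int]) -> None: ...


class KeywordModel:
    """Toy stand-in for a neural language model, for illustration only."""

    def __init__(self) -> None:
        self.bug_words = {"crash", "error"}
        self.other_words = {"add", "feature"}

    def predict_proba(self, texts):
        probs = []
        for text in texts:
            words = set(text.lower().split())
            if words & self.bug_words:
                probs.append(0.95)   # contains a known bug word
            elif words & self.other_words:
                probs.append(0.05)   # contains a known non-bug word
            else:
                probs.append(0.5)    # nothing known: maximally uncertain
        return probs

    def fit(self, texts, labels):
        # Crude "training": remember every word seen under each label.
        for text, label in zip(texts, labels):
            target = self.bug_words if label == 1 else self.other_words
            target.update(text.lower().split())


def pseudo_label(model: ReportClassifier, unlabeled, threshold=0.9):
    """Assumption: keep only confident predictions and use them as labels."""
    labeled = []
    for text, p in zip(unlabeled, model.predict_proba(unlabeled)):
        if p >= threshold:
            labeled.append((text, 1))
        elif p <= 1 - threshold:
            labeled.append((text, 0))
    return labeled


def update(model: ReportClassifier, human_labeled, unlabeled, threshold=0.9):
    """Fit on human labels plus pseudo-labels; return the pseudo-labels used."""
    pseudo = pseudo_label(model, unlabeled, threshold)
    batch = list(human_labeled) + pseudo
    model.fit([text for text, _ in batch], [label for _, label in batch])
    return pseudo


model = KeywordModel()
human = [("app freezes on startup", 1)]  # hypothetical developer label
pool = ["crash when saving file", "please add dark mode", "freezes after login"]
pseudo = update(model, human, pool)
```

Because `update` depends only on `predict_proba` and `fit`, the toy model could be swapped for any classifier exposing the same two methods, which is one way to read the model-agnostic property.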
Evaluation and Results
We evaluate MNAL using a large-scale dataset against state-of-the-art (SOTA) approaches, baselines, and different variants. The results indicate that MNAL achieves up to:
- 95.8% effort reduction in terms of readability during human labeling.
- 196.0% effort reduction in terms of identifiability during human labeling.
Moreover, MNAL results in improved performance in bug report identification, demonstrating its efficacy in a practical setting.
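The abstract does not say how these percentages are computed. As a purely arithmetic illustration, the sketch below contrasts two possible normalizations of a reduction. Only the second can exceed 100%, which would be consistent with a figure such as 196.0%, but whether the paper uses it is an assumption. The function names and effort values are hypothetical and are not taken from the paper.

```python
def reduction_vs_baseline(baseline: float, ours: float) -> float:
    """Percent saved relative to the baseline; at most 100 for positive efforts."""
    return 100.0 * (baseline - ours) / baseline


def reduction_vs_ours(baseline: float, ours: float) -> float:
    """Percent difference relative to our own effort; can exceed 100."""
    return 100.0 * (baseline - ours) / ours


# Hypothetical effort scores (for example, minutes of labeling per batch).
baseline_effort, our_effort = 30.0, 10.0

print(round(reduction_vs_baseline(baseline_effort, our_effort), 1))
print(reduction_vs_ours(baseline_effort, our_effort))
```

With the same pair of efforts, the first definition gives roughly 66.7 and the second gives 200.0, so the choice of denominator matters when interpreting the reported numbers.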
Qualitative Case Study
To further verify the efficacy of our approach, we conducted a qualitative case study involving 10 human participants. Participants rated MNAL as being more effective while saving more time and monetary resources. This feedback underscores the importance of human-machine collaboration in enhancing the software development process.
Conclusion
The Mutualistic Neural Active Learning framework presents a promising solution to the challenges posed by the increasing complexity of bug report identification and resolution. By fostering a collaborative environment between human developers and machine learning models, MNAL not only streamlines the bug identification process but also enhances the overall quality of software maintenance.
As we continue to refine and develop this framework, we anticipate further advancements in the automation of software quality control processes, ultimately leading to more efficient and effective software development practices.
