Automated Malware Family Classification using Weighted Hierarchical Ensembles of Large Language Models
arXiv:2604.02490v1 (announce type: cross)
Abstract
Malware family classification remains a challenging task in automated malware analysis, particularly in real-world settings characterized by obfuscation, packing, and rapidly evolving threats. Existing machine learning and deep learning approaches typically depend on labeled datasets, handcrafted features, supervised training, or dynamic analysis, which limits their scalability and effectiveness in open-world scenarios.
Introduction
Malware keeps growing more sophisticated, and traditional classification methods often fall short because they rely on extensive labeled datasets and need constant retraining. To address these limitations, the paper proposes a framework that uses a weighted hierarchical ensemble of pretrained large language models (LLMs) for zero-label malware family classification.
Methodology
The proposed framework does not depend on feature-level learning or model retraining. Instead, it aggregates decision-level predictions from multiple LLMs, leveraging their complementary reasoning strengths. The methodology consists of several key components:
- Weighted Model Outputs: Each model's output is weighted by its empirically derived macro-F1 score, so that models with a higher macro-F1 have greater influence on the final classification.
- Hierarchical Organization: The decision-making process is structured hierarchically, first addressing coarse-grained malicious behavior before narrowing down to fine-grained malware families.
- Robustness and Stability: This hierarchical framework enhances the robustness of the classification process and reduces the instability commonly associated with individual models.
- Analyst-style Reasoning: The proposed method aligns with the reasoning patterns of cybersecurity analysts, facilitating more intuitive and effective decision-making.
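The weighting and hierarchy described in the list above can be illustrated with a minimal sketch. It assumes each LLM has already been queried and has returned one coarse behavior label and one family label. The model names, weights, taxonomy, and the functions `weighted_vote` and `classify` are hypothetical placeholders chosen for illustration; they are not taken from the paper, whose exact aggregation rule may differ.

```python
from collections import defaultdict

# Hypothetical macro-F1 weights per model (illustrative values, not from the paper).
MODEL_WEIGHTS = {"model_a": 0.8, "model_b": 0.6, "model_c": 0.5}

# Hypothetical two-level taxonomy: coarse behavior -> fine-grained families.
TAXONOMY = {
    "ransomware": ["family_r1", "family_r2"],
    "trojan": ["family_t1", "family_t2"],
}


def weighted_vote(predictions, weights, allowed=None):
    """Return the label with the largest summed weight, or None if no vote counts.

    predictions: dict mapping model name -> predicted label.
    weights: dict mapping model name -> non-negative weight.
    allowed: optional collection of labels; votes outside it are ignored.
    """
    scores = defaultdict(float)
    for model, label in predictions.items():
        if allowed is not None and label not in allowed:
            continue
        scores[label] += weights.get(model, 0.0)
    if not scores:
        return None
    # Iterate labels in sorted order so ties resolve to the alphabetically first label.
    return max(sorted(scores), key=lambda label: scores[label])


def classify(coarse_preds, fine_preds, weights=MODEL_WEIGHTS, taxonomy=TAXONOMY):
    """Two-stage decision: pick the coarse class, then a family within that class."""
    coarse = weighted_vote(coarse_preds, weights, allowed=taxonomy.keys())
    if coarse is None:
        return None, None
    family = weighted_vote(fine_preds, weights, allowed=taxonomy[coarse])
    return coarse, family


if __name__ == "__main__":
    coarse_preds = {"model_a": "ransomware", "model_b": "trojan", "model_c": "trojan"}
    fine_preds = {"model_a": "family_r1", "model_b": "family_t1", "model_c": "family_t2"}
    print(classify(coarse_preds, fine_preds))
```

In this sketch, the second stage only counts family votes that fall under the coarse class chosen in the first stage, which is one simple way to keep the two levels consistent. A single high-weight model can still be outvoted when several lower-weight models agree.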
Results
According to the paper, preliminary experiments show that the zero-label framework outperforms traditional approaches on several metrics, including accuracy and F1 score. Classifying malware families without extensive labeled data is a notable advance for automated malware analysis.
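For reference, macro-F1 (the metric the methodology uses to weight each model) is the unweighted mean of the per-class F1 scores, so rare families count as much as common ones. The sketch below computes it in plain Python. It assumes a set of reference labels is available for evaluation; the function name `macro_f1` and the sample labels are illustrative, and the summary does not describe how the paper obtains its scores.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores) if f1_scores else 0.0


if __name__ == "__main__":
    y_true = ["a", "a", "b", "b"]
    y_pred = ["a", "b", "b", "b"]
    # Class "a": F1 = 2/3; class "b": F1 = 4/5; macro-F1 = 11/15.
    print(macro_f1(y_true, y_pred))
```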
Conclusion
Weighted hierarchical ensembles of large language models offer a promising approach to malware family classification in open-world scenarios. By removing the dependency on labeled datasets, the approach improves scalability and effectiveness. Future work will focus on refining the framework and exploring its applicability across different types of malware and threat landscapes.
Implications for Future Research
As malware continues to evolve, the need for robust, scalable classification methods will only grow. This research opens the door for further exploration into the integration of LLMs in various cybersecurity applications, potentially leading to more advanced and adaptive threat detection systems.
