SNEAKDOOR: Stealthy Backdoor Attacks against Distribution Matching-based Dataset Condensation
Summary: arXiv:2603.28824v1
Introduction
Dataset condensation synthesizes a small, informative dataset that preserves the training performance of the larger, full-scale dataset. Because models can then be trained on far fewer samples, it substantially improves computational efficiency, which has made it popular among researchers and practitioners.
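As a rough illustration of the distribution matching idea named in the title, the sketch below scores a synthetic set by the squared distance between the mean embedding of the real data and the mean embedding of the synthetic data. This is a toy under stated assumptions, not the paper's implementation: the names `mean_embedding`, `dm_loss`, and `identity` are hypothetical, and the identity function stands in for the feature extractor a real pipeline would use.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def mean_embedding(samples: Sequence[Vector],
                   embed: Callable[[Vector], Vector]) -> Vector:
    """Average the embeddings of a batch of samples."""
    embedded = [embed(x) for x in samples]
    dim = len(embedded[0])
    return [sum(e[i] for e in embedded) / len(embedded) for i in range(dim)]


def dm_loss(real: Sequence[Vector],
            synthetic: Sequence[Vector],
            embed: Callable[[Vector], Vector]) -> float:
    """Squared Euclidean distance between the two mean embeddings."""
    mu_real = mean_embedding(real, embed)
    mu_syn = mean_embedding(synthetic, embed)
    return sum((a - b) ** 2 for a, b in zip(mu_real, mu_syn))


def identity(x: Vector) -> Vector:
    """Hypothetical stand-in for a learned feature extractor."""
    return list(x)


# Three real samples of one class, condensed into a single synthetic sample.
real = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]]
good_synthetic = [[2.0, 2.0]]   # sits at the real mean
poor_synthetic = [[0.0, 0.0]]   # far from the real mean

print(dm_loss(real, good_synthetic, identity))
print(dm_loss(real, poor_synthetic, identity))
```

In a condensation loop the synthetic samples would be updated to reduce this loss for each class; here the loss is only evaluated, not minimized.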
Vulnerabilities in Dataset Condensation
Recent work has shown that dataset condensation is vulnerable to backdoor attacks. In these attacks, an adversary injects malicious triggers into the condensed dataset and thereby manipulates the behavior of models trained on it at inference time. Previous methods have had some success in balancing attack success rate against clean test accuracy, but many fall short on stealthiness: they fail to conceal the visual artifacts in the condensed data or the perturbations introduced at inference time.
Introducing Sneakdoor
To address this gap in stealthiness, we present Sneakdoor, which improves the stealthiness of the attack without sacrificing its effectiveness. Sneakdoor exploits the inherent vulnerability of class decision boundaries and incorporates a generative module that constructs input-aware triggers aligned with local feature geometry. This design reduces the likelihood of detection.
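To make "input-aware" concrete, the sketch below contrasts with a fixed patch trigger by computing a perturbation that depends on the input itself and is bounded per pixel. It is only a toy: the linear map followed by `tanh`, the weight matrix, the budget `epsilon`, and all function names are assumptions made for illustration, and it does not reproduce Sneakdoor's generative module or its alignment with local feature geometry.

```python
import math
from typing import List

Vector = List[float]


def input_aware_trigger(x: Vector, weights: List[Vector], epsilon: float) -> Vector:
    """Toy generator: the perturbation is a function of x, and tanh keeps
    each component strictly inside (-epsilon, epsilon)."""
    return [epsilon * math.tanh(sum(w * xi for w, xi in zip(row, x)))
            for row in weights]


def apply_trigger(x: Vector, delta: Vector) -> Vector:
    """Add the perturbation and clip back to the valid pixel range [0, 1]."""
    return [min(1.0, max(0.0, xi + di)) for xi, di in zip(x, delta)]


def linf_distance(a: Vector, b: Vector) -> float:
    """Largest per-component change, a simple proxy for visibility."""
    return max(abs(ai - bi) for ai, bi in zip(a, b))


# Hypothetical 3-pixel "images" and generator weights.
weights = [[1.0, -1.0, 0.0], [0.0, 1.0, -1.0], [-1.0, 0.0, 1.0]]
epsilon = 0.03
x1 = [0.2, 0.5, 0.8]
x2 = [0.9, 0.1, 0.4]

delta1 = input_aware_trigger(x1, weights, epsilon)
delta2 = input_aware_trigger(x2, weights, epsilon)
triggered1 = apply_trigger(x1, delta1)

print(delta1 != delta2)                   # each input gets its own trigger
print(linf_distance(x1, triggered1))      # stays within the epsilon budget
```

The per-pixel bound is one simple way to limit visibility; the source does not state which stealthiness constraint Sneakdoor uses.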
Key Features of Sneakdoor
- Enhanced Stealthiness: The Sneakdoor method is designed to remain imperceptible to both human inspection and statistical detection methods.
- Effective Trigger Generation: By constructing triggers that are informed by input data and local features, Sneakdoor reduces the visibility of manipulations.
- Balanced Performance: Extensive experiments across various datasets show that Sneakdoor balances attack success rate, clean test accuracy, and stealthiness.
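Two of the three quantities in the list above have simple definitions, sketched below. Clean test accuracy is the fraction of clean test samples classified correctly, and attack success rate is the fraction of triggered samples classified as the attacker's target class. Excluding samples whose true label is already the target class is a common convention and is an assumption here, as are the function names and the toy predictions; the source does not specify the exact evaluation protocol.

```python
from typing import Sequence


def clean_test_accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of clean test samples that are classified correctly."""
    if not labels:
        raise ValueError("labels must not be empty")
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def attack_success_rate(triggered_preds: Sequence[int],
                        true_labels: Sequence[int],
                        target: int) -> float:
    """Fraction of triggered samples predicted as the target class.

    Assumption: samples whose true label is already the target class
    are excluded, since they cannot demonstrate a successful attack.
    """
    eligible = [p for p, y in zip(triggered_preds, true_labels) if y != target]
    if not eligible:
        raise ValueError("no samples outside the target class")
    return sum(p == target for p in eligible) / len(eligible)


# Hypothetical predictions from a model trained on a poisoned condensed set.
labels = [0, 1, 2, 1, 0]
clean_preds = [0, 1, 2, 0, 0]       # one mistake on clean inputs
triggered_preds = [2, 2, 2, 1, 2]   # predictions after adding the trigger
target_class = 2

cta = clean_test_accuracy(clean_preds, labels)
asr = attack_success_rate(triggered_preds, labels, target_class)
print(cta, asr)
```

Stealthiness has no single agreed formula, so it is left out of this sketch.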
Experimental Results
Experiments across multiple datasets indicate that Sneakdoor substantially improves the invisibility of both the synthetic data and the triggered samples while preserving high attack efficacy. The results show an improvement over existing backdoor attacks on dataset condensation.
Conclusion
Sneakdoor makes backdoor attacks against dataset condensation stealthier while maintaining their effectiveness. For researchers and practitioners who want to explore the approach further, the code is available at https://github.com/XJTU-AI-Lab/SneakDoor.
