A Divide-and-Conquer Strategy for Hard-Label Extraction of Deep Neural Networks via Side-Channel Attacks
arXiv:2411.10174v2
Abstract
Over the past decade, Deep Neural Networks (DNNs) have proved effective across a wide range of applications. Despite their high value and public accessibility, protecting the intellectual property of DNNs remains an open problem and an emerging research field. Recent studies have extracted fully connected DNNs using cryptanalytic techniques in hard-label settings, showing that a DNN can be copied with high fidelity, that is, with a high degree of similarity in output predictions.
However, existing cryptanalytic attacks have limitations: they target only fully connected networks and handle only specific neuron configurations found in deep networks. In this paper, we propose a novel end-to-end attack framework for the model extraction of embedded DNNs with high fidelity in predictions.
Introduction to the Framework
Our method introduces a new black-box side-channel attack that splits the DNN into several linear parts, so that cryptanalytic extraction can be run on each part and the weights retrieved in hard-label settings. This adapts cryptanalytic extraction to non-fully connected DNNs for the first time while keeping the extracted models at high fidelity.
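To make the idea of "linear parts" concrete, the sketch below illustrates a general property of ReLU networks: for a fixed on/off pattern of the hidden neurons, the whole network reduces to a single affine map. This is only an illustration of why a network can be treated piece by piece; it is not the paper's segmentation procedure, and the weights `W1`, `b1`, `W2`, `b2` and all function names are hypothetical.

```python
def matvec(W, x, b):
    """Affine layer: W @ x + b, with plain Python lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(v):
    return [max(0.0, t) for t in v]

# Hypothetical toy network: 2 inputs -> 3 ReLU neurons -> 1 output.
W1 = [[1.0, -2.0], [0.5, 1.0], [-1.0, 1.0]]
b1 = [0.0, -1.0, 0.5]
W2 = [[1.0, -1.0, 2.0]]
b2 = [0.1]

def forward(x):
    return matvec(W2, relu(matvec(W1, x, b1)), b2)[0]

def activation_pattern(x):
    """1 where a hidden neuron is active, 0 where ReLU clamps it to zero."""
    return tuple(int(z > 0) for z in matvec(W1, x, b1))

def local_affine_map(x):
    """Return (g, c) with forward(y) == g . y + c for every y sharing x's pattern."""
    mask = activation_pattern(x)
    hidden = range(len(W1))
    # Inactive neurons contribute nothing, so their rows are masked out.
    g = [sum(W2[0][j] * mask[j] * W1[j][i] for j in hidden) for i in range(len(x))]
    c = sum(W2[0][j] * mask[j] * b1[j] for j in hidden) + b2[0]
    return g, c

x = [2.0, 0.5]
y = [2.1, 0.45]           # nearby input with the same activation pattern
g, c = local_affine_map(x)
affine_y = sum(gi * yi for gi, yi in zip(g, y)) + c
print(activation_pattern(x) == activation_pattern(y), abs(forward(y) - affine_y) < 1e-9)
```

Once the pattern is known, recovering the network locally amounts to recovering an affine map, which is the kind of problem cryptanalytic extraction is designed to solve.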
Methodology
- Implementation of a black-box side-channel attack that segments the DNN.
- Cryptanalytic extraction approach tailored for each linear part of the DNN.
- Validation across various architectures deployed on a microcontroller unit.
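In a hard-label setting the attacker sees only the predicted class, never the scores. A primitive commonly used in hard-label attacks is to locate a point on the decision boundary by bisecting the segment between two inputs that receive different labels. The sketch below shows that primitive on a toy oracle; `hard_label_oracle` and its weights are hypothetical stand-ins for the victim, and this is not claimed to be the exact procedure of the paper.

```python
def hard_label_oracle(x):
    """Hypothetical victim: returns only a class label (0 or 1), no scores."""
    w, b = (2.0, -1.0), 0.5
    return int(w[0] * x[0] + w[1] * x[1] + b > 0)

def boundary_point(oracle, x_a, x_b, tol=1e-9):
    """Bisect the segment x_a -> x_b until the label flip is located within tol."""
    label_a = oracle(x_a)
    if label_a == oracle(x_b):
        raise ValueError("endpoints must receive different labels")
    lo, hi = 0.0, 1.0  # interpolation coefficients along the segment
    while hi - lo > tol:
        mid = (lo + hi) / 2
        x_mid = [a + mid * (b - a) for a, b in zip(x_a, x_b)]
        if oracle(x_mid) == label_a:
            lo = mid   # still on x_a's side of the boundary
        else:
            hi = mid   # crossed to the other side
    t = (lo + hi) / 2
    return [a + t * (b - a) for a, b in zip(x_a, x_b)]

p = boundary_point(hard_label_oracle, [-1.0, 0.0], [1.0, 0.0])
print(p)  # close to [-0.25, 0.0], where 2*x0 + 0.5 changes sign
```

Each bisection step costs one query, so the query count grows only logarithmically with the required precision.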
Results
We validate our contributions by targeting multiple DNN architectures, including:
- A Multi-Layer Perceptron (MLP) consisting of 1.7 million parameters.
- A shortened version of MobileNetv1.
Our framework successfully extracts these DNNs with high fidelity: 88.4% for MobileNetv1 and 93.2% for the MLP. These results demonstrate the effectiveness of our divide-and-conquer strategy for model extraction.
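Fidelity is described above as similarity in output predictions. A minimal sketch, assuming it is measured as the fraction of inputs on which the victim and the extracted model return the same label, is given below. The function name and the two toy threshold classifiers are hypothetical, and the paper may compute the metric on a different input set or with a different convention.

```python
def fidelity(victim_predict, stolen_predict, inputs):
    """Fraction of inputs on which both models return the same hard label."""
    if not inputs:
        raise ValueError("need at least one input")
    agree = sum(1 for x in inputs if victim_predict(x) == stolen_predict(x))
    return agree / len(inputs)

# Hypothetical 1-D classifiers whose thresholds differ slightly.
victim = lambda x: int(x >= 0.50)
stolen = lambda x: int(x >= 0.55)

inputs = [i / 10 for i in range(10)]  # 0.0, 0.1, ..., 0.9
score = fidelity(victim, stolen, inputs)
print(score)  # the models disagree only at x = 0.5, giving 0.9
```

Note that fidelity compares the copy against the victim, not against ground-truth labels, so it can be high even where both models are wrong.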
Adversarial Examples Generation
Moreover, we used the stolen models to generate adversarial examples, achieving performance close to that of a white-box attack on the victim’s model. The reported transfer rates were 95.8% and 96.7%, which suggests the extracted models are close enough to the victims for attacks crafted on the copies to carry over.
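As a rough illustration of what a transfer rate measures, the sketch below assumes it is the fraction of adversarial examples, crafted on the stolen model, that the victim also misclassifies. The paper may use a different definition, and the toy victim and example values here are hypothetical.

```python
def transfer_rate(victim_predict, adversarial_inputs, true_labels):
    """Fraction of adversarial inputs that the victim also misclassifies."""
    if not adversarial_inputs or len(adversarial_inputs) != len(true_labels):
        raise ValueError("need equally sized, non-empty inputs and labels")
    fooled = sum(
        1 for x, y in zip(adversarial_inputs, true_labels) if victim_predict(x) != y
    )
    return fooled / len(adversarial_inputs)

# Hypothetical victim and adversarial examples assumed to fool the stolen model.
victim = lambda x: int(x >= 0.5)
adversarial_inputs = [0.45, 0.48, 0.52, 0.40]
true_labels = [1, 1, 1, 0]

rate = transfer_rate(victim, adversarial_inputs, true_labels)
print(rate)  # two of the four examples fool the victim, giving 0.5
```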
Conclusion
This work shows that a divide-and-conquer strategy extends cryptanalytic extraction to non-fully connected DNNs, including models deployed on a microcontroller. As DNNs continue to spread across domains, these results underline the need for stronger protections for the intellectual property they represent.
