SE-Enhanced ViT and BiLSTM-Based Intrusion Detection for Secure IIoT and IoMT Environments
Abstract
With the rapid growth of interconnected devices in Industrial Internet of Things (IIoT) and Internet of Medical Things (IoMT) ecosystems, timely and accurate detection of cyber threats has become a critical challenge. This study presents an intrusion detection framework based on a hybrid Squeeze-and-Excitation Vision Transformer and Bidirectional Long Short-Term Memory (SE ViT-BiLSTM) architecture.
The proposed design replaces the standard multi-head attention mechanism of the Vision Transformer with Squeeze-and-Excitation (SE) attention and combines it with BiLSTM layers to improve both detection accuracy and computational efficiency. The goal is a more reliable intrusion detection system for IIoT and IoMT deployments.
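To make the idea of SE attention concrete, the following is a minimal pure-Python sketch of a Squeeze-and-Excitation gate applied to a sequence of token vectors. It is an illustration only: the function name `se_attention`, the weight matrices `w1` and `w2`, the absence of bias terms, and the choice to squeeze across tokens are all assumptions, not details taken from the study.

```python
import math


def se_attention(tokens, w1, w2):
    """Apply a Squeeze-and-Excitation gate to a token sequence.

    tokens: list of N token vectors, each with C channels.
    w1:     C x R weight matrix (bottleneck reduction), hypothetical.
    w2:     R x C weight matrix (expansion back to C), hypothetical.
    Returns the reweighted tokens and the per-channel gates.
    """
    n, c = len(tokens), len(tokens[0])
    r = len(w1[0])

    # Squeeze: average every channel over all tokens -> vector of length C.
    squeezed = [sum(tok[j] for tok in tokens) / n for j in range(c)]

    # Excitation, step 1: bottleneck projection followed by ReLU.
    hidden = [
        max(0.0, sum(squeezed[j] * w1[j][k] for j in range(c)))
        for k in range(r)
    ]

    # Excitation, step 2: project back to C channels and apply a sigmoid,
    # giving one gate in (0, 1) per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(hidden[k] * w2[k][j] for k in range(r))))
        for j in range(c)
    ]

    # Scale: multiply each token channel by its gate.
    scaled = [[tok[j] * gates[j] for j in range(c)] for tok in tokens]
    return scaled, gates
```

Unlike multi-head self-attention, which compares every token with every other token, this gate only needs one pass over the tokens and two small matrix products, which is one plausible reason such a substitution could reduce computation.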
Methodology
The SE ViT-BiLSTM model was trained and evaluated on two real-world benchmark datasets:
- EdgeIIoT
- CICIoMT2024
The evaluation was conducted before and after balancing the data with the Synthetic Minority Over-sampling Technique (SMOTE) and RandomOverSampler. These balancing techniques address the class imbalance commonly found in cybersecurity datasets, where some attack classes have far fewer samples than others, and can therefore improve the model's performance.
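The two balancing strategies can be sketched in plain Python as follows. This is a simplified stand-in, not the library implementation used in the study: `random_oversample` duplicates minority rows, while `smote_like` interpolates between a minority row and one of its nearest same-class neighbours. The function names, the neighbour count `k`, and the toy data in the usage lines are assumptions made for illustration. Balancing is normally applied to the training split only; whether the study did so is not stated here.

```python
import random
from collections import Counter


def random_oversample(X, y, rng):
    """Duplicate minority-class rows until every class matches the largest."""
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - count):
            X_out.append(list(rng.choice(rows)))  # exact copy of a real row
            y_out.append(label)
    return X_out, y_out


def smote_like(X, y, rng, k=3):
    """Add synthetic minority rows by interpolating toward a near neighbour."""
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        if len(rows) < 2:
            continue  # interpolation needs at least two rows of the class
        for _ in range(target - count):
            base = rng.choice(rows)
            # k nearest same-class neighbours by squared Euclidean distance.
            neighbours = sorted(
                (r for r in rows if r is not base),
                key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
            )[:k]
            other = rng.choice(neighbours)
            t = rng.random()  # position on the segment base -> other
            X_out.append([a + t * (b - a) for a, b in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


# Toy usage: class 1 is the minority (hypothetical data).
X = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [10.0, 10.0], [11.0, 11.0]]
y = [0, 0, 0, 1, 1]
X_bal, y_bal = smote_like(X, y, random.Random(0))
```

Random oversampling only repeats existing rows, whereas the interpolation step produces new points inside the minority region, which is the main practical difference between the two techniques.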
Results
Experimental results demonstrate that the SE ViT-BiLSTM model outperforms existing approaches across multiple metrics. The performance before data balancing showed:
- EdgeIIoT: 99.11% accuracy (FPR: 0.0013%, latency: 0.00032 sec/inst)
- CICIoMT2024: 96.10% accuracy (FPR: 0.0036%, latency: 0.00053 sec/inst)
After data balancing, accuracy improved on both datasets; latency fell on CICIoMT2024 and rose slightly on EdgeIIoT:
- EdgeIIoT: 99.33% accuracy with 0.00035 sec/inst latency
- CICIoMT2024: 98.16% accuracy with 0.00014 sec/inst latency
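For readers who want to reproduce metrics of this kind, the sketch below shows one common way to compute accuracy, false positive rate (FPR), and per-instance latency. It assumes a binary view in which one label (`benign`, here 0) is normal traffic and every other label is an attack; how the study itself aggregates FPR across classes is not stated, so this is an assumption. The names `accuracy_and_fpr` and `latency_per_instance` are hypothetical.

```python
import time


def accuracy_and_fpr(y_true, y_pred, benign=0):
    """Return (accuracy, false positive rate) as fractions in [0, 1].

    A false positive is a benign instance predicted as any attack class.
    """
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    fp = sum(t == benign and p != benign for t, p in pairs)
    tn = sum(t == benign and p == benign for t, p in pairs)
    accuracy = correct / len(pairs)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr


def latency_per_instance(predict, batch):
    """Time one call of predict(batch) and divide by the batch size."""
    start = time.perf_counter()
    predict(batch)
    elapsed = time.perf_counter() - start
    return elapsed / len(batch)  # seconds per instance
```

Multiplying the returned fractions by 100 gives percentages comparable in form to the figures listed above.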
Conclusion
This study demonstrates the effectiveness of the SE ViT-BiLSTM model for intrusion detection in IIoT and IoMT environments. Combining Squeeze-and-Excitation attention with BiLSTM layers improves detection accuracy while keeping per-instance latency low, making the proposed framework a promising option for securing critical industrial and medical systems against cyber threats.
Future Work
Future research will focus on further optimizing the model and evaluating it on a broader range of cybersecurity tasks, so that its detection capabilities keep pace with evolving cyber threats in interconnected environments.
