Mean MAE with Flow-Mixing for Encrypted Traffic Classification

Mean Masked Autoencoder with Flow-Mixing for Encrypted Traffic Classification

Summary: arXiv:2603.29537v1 (cross-listed)

Abstract

Self-supervised pre-training models based on Masked Autoencoders (MAE) have shown strong potential for network traffic classification. However, existing methods are confined to isolated byte-level reconstruction of individual flows and lack adequate perception of the multi-granularity contextual relationships in traffic. To address this limitation, we propose Mean MAE (MMAE), a teacher-student MAE paradigm with a flow-mixing strategy for building an encrypted traffic pre-training model.

Introduction

The increasing complexity of network traffic, particularly due to encryption, poses significant challenges for traditional classification methods. Self-supervised learning techniques, particularly those based on Masked Autoencoders, have opened new avenues for improving traffic classification. MMAE builds on these frameworks by pairing MAE pre-training with teacher-student supervision and cross-flow mixing.

Methodology

MMAE employs a self-distillation mechanism for teacher-student interaction, where the teacher provides unmasked flow-level semantic supervision to advance the student from local byte reconstruction to multi-granularity comprehension. This shift lets the model capture the broader context of a flow rather than only isolated byte sequences.
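The teacher-student interaction can be illustrated with a minimal sketch. The summary does not say how the teacher is updated, so the exponential moving average (EMA) rule below is an assumption suggested by the "Mean" in the model name. The mean-squared-error distillation term, the function names, and the `momentum` and `weight` parameters are also hypothetical, not taken from the paper.

```python
from typing import Dict, List


def ema_update(teacher: Dict[str, float], student: Dict[str, float],
               momentum: float = 0.99) -> None:
    """Update teacher weights in place: t <- m * t + (1 - m) * s.

    Assumption: the teacher is an EMA copy of the student (mean-teacher style).
    """
    for name, s_val in student.items():
        teacher[name] = momentum * teacher[name] + (1.0 - momentum) * s_val


def distill_loss(student_repr: List[float], teacher_repr: List[float]) -> float:
    """Mean squared error between two flow-level embeddings.

    The student embedding comes from the corrupted flow, the teacher
    embedding from the unmasked flow (hypothetical choice of loss).
    """
    if len(student_repr) != len(teacher_repr):
        raise ValueError("embeddings must have the same dimension")
    diffs = ((s - t) ** 2 for s, t in zip(student_repr, teacher_repr))
    return sum(diffs) / len(student_repr)


def total_loss(recon: float, distill: float, weight: float = 1.0) -> float:
    """Combine byte-level reconstruction with flow-level distillation."""
    return recon + weight * distill
```

In this sketch only the student would receive gradients from `total_loss`, and `ema_update` would be called after each optimizer step so that the teacher tracks a smoothed version of the student.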

Flow Mixing Strategy

To break the information bottleneck in individual flows, we introduce a dynamic Flow Mixing (FlowMix) strategy to replace the traditional random masking mechanism. Instead of hiding tokens, FlowMix constructs challenging cross-flow mixed samples that contain interference, compelling the model to learn discriminative representations from distorted tokens. The strategy is intended to improve the model’s ability to generalize across diverse traffic patterns.
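A rough sketch of the mixing idea follows, assuming two flows that are already tokenized to the same length. The function name `flow_mix`, the fixed `mix_ratio`, and the uniform choice of positions are illustrative assumptions; the summary does not describe how the paper selects positions or sets the ratio dynamically.

```python
import random
from typing import List, Sequence, Tuple


def flow_mix(flow_a: Sequence[int], flow_b: Sequence[int],
             mix_ratio: float, rng: random.Random) -> Tuple[List[int], List[bool]]:
    """Replace a fraction of flow_a's tokens with the tokens of flow_b.

    Returns the mixed token sequence and a flag per position that is True
    where the token was taken from flow_b (the "interference").
    """
    if len(flow_a) != len(flow_b):
        raise ValueError("flows must have the same token length")
    if not 0.0 <= mix_ratio <= 1.0:
        raise ValueError("mix_ratio must be in [0, 1]")
    n = len(flow_a)
    k = round(mix_ratio * n)                  # number of positions to replace
    replaced = set(rng.sample(range(n), k))   # uniform choice (assumption)
    mixed = [flow_b[i] if i in replaced else flow_a[i] for i in range(n)]
    is_mixed = [i in replaced for i in range(n)]
    return mixed, is_mixed
```

The returned flags play the role that the mask plays in standard MAE training: they mark the positions the student has to recover, except that here the student sees misleading tokens from another flow rather than a placeholder.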

Packet-importance Aware Mask Predictor

Furthermore, we design a Packet-importance aware Mask Predictor (PMP) equipped with an attention bias mechanism. This mechanism leverages packet-level side-channel statistics to dynamically mask tokens with high semantic density, ensuring that the model focuses on the most informative parts of the traffic data.
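One way to picture an attention bias driven by packet statistics is sketched below. Everything specific here is an assumption: packet length is used as the only side-channel statistic, the bias is the length divided by the largest length, and the highest-scoring tokens are the ones selected. The summary does not state which statistics or which selection rule PMP uses.

```python
import math
from typing import List, Sequence


def packet_bias_from_lengths(lengths: Sequence[int]) -> List[float]:
    """Hypothetical bias: longer packets get a larger additive bias in [0, 1]."""
    longest = max(lengths)
    return [length / longest for length in lengths]


def softmax(xs: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def select_tokens(token_scores: Sequence[float], token_packet: Sequence[int],
                  packet_bias: Sequence[float], ratio: float) -> List[bool]:
    """Pick the top `ratio` share of tokens after adding a per-packet bias.

    token_packet[i] is the index of the packet that token i belongs to.
    """
    biased = [s + packet_bias[p] for s, p in zip(token_scores, token_packet)]
    probs = softmax(biased)
    k = round(ratio * len(probs))
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = set(ranked[:k])
    return [i in chosen for i in range(len(probs))]
```

For example, with equal token scores, two tokens in a 100-byte packet and two in a 1500-byte packet, and `ratio=0.5`, the sketch selects the two tokens of the larger packet.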

Results

Extensive experiments on datasets covering encrypted applications, malware, and attack traffic demonstrate that MMAE achieves state-of-the-art performance. The results indicate improvements in classification accuracy and in robustness against adversarial attacks, supporting the effectiveness of the proposed methods.

Conclusion

The Mean MAE model represents a significant advancement in encrypted traffic classification. By combining multi-granularity contextual awareness with a flow-mixing strategy, MMAE sets a new benchmark for future research and applications in network traffic analysis.

Code Availability

The code for the Mean MAE model is available at the following link: MMAE GitHub Repository.

RichlyAI Blog AI Guide, Tutorials, Industrial Insights, & more!
