Fast AI Model Partition for Split Learning over Edge Networks
arXiv:2507.01041v4
Abstract: Split learning (SL) is a distributed learning paradigm that can enable computation-intensive artificial intelligence (AI) applications by partitioning AI models between mobile devices and edge servers. However, the model partitioning problem in SL becomes challenging due to the diverse and complex architectures of AI models. In this paper, we formulate an optimal model partitioning problem to minimize training delay in SL.
Introduction
As AI applications become more computation-intensive, training them on mobile devices alone becomes difficult. Split learning (SL) has emerged as a promising solution: it partitions an AI model between a mobile device and an edge server, so that the two sides train the model collaboratively and the heavy computation can be offloaded to the server.
The Challenge of Model Partitioning
One of the primary challenges in SL is the model partitioning problem: deciding which layers run on the mobile device and which run on the edge server. The diversity and complexity of AI model architectures make it hard to determine where to split a model. This study formulates the problem of finding the partition that minimizes training delay and proposes algorithms that solve it optimally.
Methodology
To tackle the model partitioning problem, we represent an arbitrary AI model as a directed acyclic graph (DAG). In this representation:
- Vertices: Represent the layers of the model.
- Edges: Depict the connections between layers.
- Edge Weights: Capture the training delays associated with each connection.
We then propose a general model partitioning algorithm by transforming the problem into a minimum s-t cut problem on the DAG. The theoretical analysis confirms that these two problems are equivalent, enabling us to obtain the optimal model partition through a maximum-flow method.
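The reduction to a minimum s-t cut can be illustrated with a minimal Python sketch. It is not the paper's exact construction: the flow-graph layout (a device-side source, a server-side sink, per-layer compute delays on source and sink edges, infinite reverse edges that keep the device side closed under predecessors), the names `min_st_cut` and `partition`, and all delay values are assumptions made for illustration. The sketch also charges every crossing edge separately, which is a simplification. The maximum flow is computed with the standard Edmonds-Karp algorithm.

```python
from collections import defaultdict, deque

INF = float("inf")

def min_st_cut(edges, s, t):
    """Edmonds-Karp max-flow; returns (cut value, vertices on the source side)."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, cap in edges:
        residual[u][v] += cap
        residual[v][u] += 0  # make sure the reverse residual edge exists
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            # No augmenting path left: the reachable set is the source side.
            return flow, set(parent)
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        flow += bottleneck

def partition(dag_edges, device_delay, server_delay, inputs):
    """Assumed construction: source side = device, sink side = server."""
    s, t = "_device", "_server"
    edges = [(s, v, INF) for v in inputs]        # input data stays on the device
    for v in device_delay:
        edges.append((s, v, server_delay[v]))    # cut if v runs on the server
        edges.append((v, t, device_delay[v]))    # cut if v runs on the device
    for u, v, w in dag_edges:
        edges.append((u, v, w))                  # cut if u on device, v on server
        edges.append((v, u, INF))                # forbid server-to-device hops
    delay, source_side = min_st_cut(edges, s, t)
    return delay, source_side - {s} - set(inputs)

# Hypothetical four-layer model with a branch: a -> b -> d and a -> c -> d.
dag_edges = [("in", "a", 10), ("a", "b", 3), ("a", "c", 3),
             ("b", "d", 1), ("c", "d", 1)]
device_delay = {"a": 2, "b": 6, "c": 5, "d": 9}
server_delay = {"a": 1, "b": 1, "c": 1, "d": 1}

delay, device_layers = partition(dag_edges, device_delay, server_delay, ["in"])
print(delay, device_layers)
```

With these made-up delays, the cheapest cut keeps only layer `a` on the device, for a total delay of 11. Enumerating the six feasible partitions by hand gives the same answer.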
Block-wise Model Partitioning Algorithm
In addition to the general approach, we consider AI models with block structures and design a low-complexity block-wise model partitioning algorithm that determines the optimal partition efficiently. The algorithm simplifies the DAG by abstracting each block, defined as a repeating component comprising multiple layers, into a single vertex. This abstraction shrinks the graph on which the cut is computed and therefore reduces the running time of the partitioning algorithm.
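The block abstraction can be sketched as a graph contraction. The Python example below is an illustration under stated assumptions, not the paper's algorithm: the helper `contract_blocks`, the rule of summing per-layer delays inside a block, the rule of merging parallel inter-block edges by adding their weights, and all numbers are hypothetical. It also assumes a block is never split across device and server.

```python
from collections import defaultdict

def contract_blocks(dag_edges, device_delay, server_delay, block_of):
    """Collapse each block into one vertex (assumed abstraction rule)."""
    def name(v):
        # Layers outside any block keep their own name.
        return block_of.get(v, v)

    dev, srv = defaultdict(int), defaultdict(int)
    for v in device_delay:
        dev[name(v)] += device_delay[v]   # block delay = sum of its layers
        srv[name(v)] += server_delay[v]

    merged = defaultdict(int)
    for u, v, w in dag_edges:
        bu, bv = name(u), name(v)
        if bu != bv:                      # drop edges internal to a block
            merged[(bu, bv)] += w         # merge parallel inter-block edges
    edges = [(u, v, w) for (u, v), w in merged.items()]
    return edges, dict(dev), dict(srv)

# Hypothetical model: conv -> [p -> q -> r, with a skip p -> r] -> fc.
dag_edges = [("conv", "p", 4), ("p", "q", 2), ("p", "r", 2),
             ("q", "r", 2), ("r", "fc", 1)]
device_delay = {"conv": 2, "p": 3, "q": 3, "r": 3, "fc": 1}
server_delay = {"conv": 1, "p": 1, "q": 1, "r": 1, "fc": 1}
block_of = {"p": "B1", "q": "B1", "r": "B1"}

small_edges, small_dev, small_srv = contract_blocks(
    dag_edges, device_delay, server_delay, block_of)
print(sorted(small_edges), small_dev, small_srv)
```

In this toy case the graph shrinks from five vertices and five edges to three vertices and two edges. Any minimum s-t cut solver can then run on the smaller graph.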
Experimental Results
We conducted extensive experiments on a hardware testbed equipped with NVIDIA Jetson devices to evaluate the performance of our proposed solution. The results indicate a remarkable reduction in algorithm running time and training delays:
- Algorithm running time reduced by up to 13.0×.
- Training delay reduced by up to 38.95% compared to state-of-the-art baselines.
Conclusion
By representing an arbitrary AI model as a DAG and reducing model partitioning to a minimum s-t cut problem, the proposed algorithms find the optimal partition across diverse model architectures, and the block-wise variant lowers the cost of doing so for models with block structures. The testbed results show that this approach reduces both algorithm running time and training delay for split learning over edge networks.
