Efficient Semantic Image Communication for Traffic Monitoring at the Edge
arXiv:2604.12622v1
Abstract
Many visual monitoring systems operate under strict communication constraints, where transmitting
full-resolution images is impractical and often unnecessary. In such settings, visual data is typically
used to determine object presence, spatial relationships, and scene context rather than to preserve exact pixel fidelity.
This paper presents two semantic image communication pipelines for traffic monitoring, MMSD and SAMR,
that reduce transmission cost while preserving meaningful visual information.
Semantic Image Communication Pipelines
The two proposed pipelines, MMSD (Multi-Modal Semantic Decomposition) and SAMR (Semantic-Aware Masking Reconstruction),
target different trade-offs between compression and visual quality.
- MMSD (Multi-Modal Semantic Decomposition):
This pipeline focuses on achieving very high compression levels while ensuring data confidentiality.
Instead of transmitting the original image, it replaces it with compact semantic representations,
including segmentation maps, edge maps, and textual descriptions. Reconstruction at the receiver
is performed using a diffusion-based generative model.
- SAMR (Semantic-Aware Masking Reconstruction):
This approach aims for higher visual quality while maintaining strong compression. SAMR selectively
suppresses non-critical image regions based on semantic importance before applying standard JPEG encoding.
It then restores the missing content at the receiver through generative inpainting.
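
The sender-side step of SAMR described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the frame and the importance map are synthetic, the threshold and fill value are hypothetical parameters, and `zlib` stands in for the JPEG encoder so the example runs with the standard library alone. The generative inpainting performed at the receiver is not shown.

```python
import random
import zlib


def mask_non_critical(image, importance, threshold=0.5, fill=0):
    """Replace pixels whose importance is below `threshold` with a flat fill value."""
    return [
        [px if imp >= threshold else fill for px, imp in zip(img_row, imp_row)]
        for img_row, imp_row in zip(image, importance)
    ]


def encode(image):
    """Stand-in encoder: zlib over raw pixel bytes (the paper applies JPEG here)."""
    raw = bytes(px for row in image for px in row)
    return zlib.compress(raw, 9)


# Synthetic 64x64 grayscale frame with noisy texture (hypothetical data).
rng = random.Random(0)
H, W = 64, 64
image = [[rng.randrange(256) for _ in range(W)] for _ in range(H)]

# Hypothetical importance map: a 16x16 "vehicle" box is critical, the rest is not.
importance = [
    [1.0 if 24 <= y < 40 and 24 <= x < 40 else 0.0 for x in range(W)]
    for y in range(H)
]

masked = mask_non_critical(image, importance)
full_size = len(encode(image))
masked_size = len(encode(masked))
print("unmasked payload:", full_size, "bytes; masked payload:", masked_size, "bytes")
```

Flattening the non-critical regions removes most of the texture the encoder would otherwise have to spend bits on, which is why the masked payload comes out smaller. The critical box is passed through unchanged.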
Asymmetric Sender-Receiver Architecture
Both MMSD and SAMR follow an asymmetric sender-receiver architecture, where lightweight processing
is done at the edge (e.g., on a Raspberry Pi 5) and computationally intensive reconstruction
tasks are offloaded to a remote server. The reported processing times are approximately 15 seconds for
MMSD and 9 seconds for SAMR.
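
To make the sender/receiver split concrete, here is a minimal sketch of how an edge device might bundle MMSD's three semantic components (segmentation map, edge map, text description) into one payload, and how the server might unpack them before reconstruction. The wire format, the function names `pack_payload` and `unpack_payload`, and the toy scene are all assumptions made for illustration; the source does not describe the actual encoding, and the diffusion-based reconstruction step is left out.

```python
import json
import zlib


def pack_payload(seg_map, edge_map, caption):
    """Sender side: bundle the three semantic components into one compressed blob."""
    h, w = len(seg_map), len(seg_map[0])
    seg_bytes = bytes(c for row in seg_map for c in row)  # one class id per pixel
    # Pack the binary edge map 8 pixels per byte, most significant bit first.
    bits = [b for row in edge_map for b in row]
    edge_bytes = bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[k:k + 8]))
        for k in range(0, len(bits), 8)
    )
    header = json.dumps({"h": h, "w": w, "caption": caption}).encode("utf-8")
    body = len(header).to_bytes(4, "big") + header + seg_bytes + edge_bytes
    return zlib.compress(body, 9)


def unpack_payload(blob):
    """Receiver side: recover the components that would condition the generative model."""
    body = zlib.decompress(blob)
    n = int.from_bytes(body[:4], "big")
    meta = json.loads(body[4:4 + n].decode("utf-8"))
    h, w = meta["h"], meta["w"]
    seg_flat = body[4 + n:4 + n + h * w]
    edge_packed = body[4 + n + h * w:]
    seg_map = [list(seg_flat[y * w:(y + 1) * w]) for y in range(h)]
    bits = [(byte >> (7 - i)) & 1 for byte in edge_packed for i in range(8)][:h * w]
    edge_map = [bits[y * w:(y + 1) * w] for y in range(h)]
    return seg_map, edge_map, meta["caption"]


# Toy 32x32 scene (hypothetical): class 1 is a car, class 0 is road.
H, W = 32, 32
seg_map = [[1 if 10 <= y < 20 and 8 <= x < 24 else 0 for x in range(W)] for y in range(H)]
# Mark a pixel as an edge when its right or lower neighbour has a different class.
edge_map = [
    [
        1 if seg_map[y][x] != seg_map[y][min(x + 1, W - 1)]
        or seg_map[y][x] != seg_map[min(y + 1, H - 1)][x] else 0
        for x in range(W)
    ]
    for y in range(H)
]
caption = "one car on an empty road"

blob = pack_payload(seg_map, edge_map, caption)
restored = unpack_payload(blob)
print(len(blob), "bytes sent for a", H, "x", W, "frame")
```

The sender only serializes and compresses data it already has, which is cheap; everything expensive happens after `unpack_payload` on the server.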
Experimental Results
Experimental evaluations reveal substantial reductions in transmitted data volume, with averages of
99% for MMSD and 99.1% for SAMR. Notably, MMSD achieves a lower payload size than the recent SPIC baseline
while maintaining strong semantic consistency. SAMR, in turn, offers a better quality-compression
trade-off than JPEG and SQ-GAN under similar operating conditions.
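
As a quick sanity check on what these percentages mean in bytes, the reduction can be computed as 100 * (1 - payload / original). The sketch below applies it to a hypothetical 1,000,000-byte source image; that size is an assumption chosen for round numbers, not a figure from the paper.

```python
def reduction_percent(original_bytes, payload_bytes):
    """Share of the original data volume that is no longer transmitted, in percent."""
    return 100.0 * (1 - payload_bytes / original_bytes)


def payload_for_reduction(original_bytes, reduction_pct):
    """Payload size implied by a given reduction percentage."""
    return original_bytes * (1 - reduction_pct / 100.0)


original = 1_000_000  # hypothetical source image size in bytes

mmsd_payload = round(payload_for_reduction(original, 99.0))
samr_payload = round(payload_for_reduction(original, 99.1))
print("MMSD payload:", mmsd_payload, "bytes")
print("SAMR payload:", samr_payload, "bytes")
```

At these reduction levels, a tenth of a percentage point corresponds to about 10% of the remaining payload, so small differences in the headline figure are not negligible.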
Conclusion
MMSD and SAMR show that semantic image communication can cut transmitted data volume by roughly 99%
in traffic monitoring while preserving the visual information that matters for the task. By prioritizing
meaningful content over exact pixel fidelity, these pipelines support more efficient and
secure visual monitoring systems operating at the edge.
