SyriSign: A Parallel Corpus for Arabic Text to Syrian Arabic Sign Language Translation
arXiv:2603.29219v1
Introduction
Sign language is an essential mode of communication for the Deaf and Hard-of-Hearing (DHH) community. Numerous benchmarks exist for high-resource sign languages, but a significant gap remains in the resources available for low-resource sign languages, Arabic sign languages in particular. This underrepresentation limits communication support for the DHH community in Arabic-speaking regions.
Introducing SyriSign
To address this gap, we introduce SyriSign, a dataset of 1500 video samples covering 150 unique lexical signs (an average of 10 videos per sign), designed for text-to-Syrian Arabic Sign Language (SyArSL) translation. The goal is to reduce communication barriers in Syria, where news and information are delivered predominantly in spoken or written Arabic, which limits accessibility for the deaf community.
Dataset Description
SyriSign is intended as a first resource for Syrian Arabic Sign Language translation and includes:
- 1500 video samples
- 150 unique lexical signs
- Focus on text-to-SyArSL translation
The dataset is designed to support research in sign language processing and to serve as an initial benchmark for future studies.
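With so few samples per sign, how the data is split matters: a purely random split could leave some signs absent from the test set. The following is a minimal sketch of a per-sign (stratified) split, not the authors' protocol. The manifest layout, the identifier format, the even 10-videos-per-sign distribution, and the `stratified_split` helper are all assumptions made for illustration; the source gives only the totals.

```python
import random
from collections import defaultdict


def stratified_split(samples, test_per_sign=2, seed=0):
    """Split (video_id, sign_label) pairs so every sign appears in both sets."""
    by_sign = defaultdict(list)
    for video_id, sign in samples:
        by_sign[sign].append(video_id)

    rng = random.Random(seed)  # seeded for a reproducible split
    train, test = [], []
    for sign in sorted(by_sign):
        videos = by_sign[sign][:]
        rng.shuffle(videos)
        # Hold out a fixed number of videos per sign for testing.
        test += [(v, sign) for v in videos[:test_per_sign]]
        train += [(v, sign) for v in videos[test_per_sign:]]
    return train, test


# Hypothetical manifest: 150 signs x 10 videos each = 1500 samples.
# The even per-sign layout is an assumption, not a documented property.
manifest = [
    (f"sign{s:03d}_take{t:02d}", f"sign{s:03d}")
    for s in range(150)
    for t in range(10)
]
train, test = stratified_split(manifest)
print(len(train), len(test))  # 1200 300
```

Holding out a fixed count per sign keeps the test set balanced across the vocabulary, which makes per-sign accuracy comparable even when the dataset is small.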
Evaluation of SyriSign
We evaluated SyriSign using three deep learning architectures:
- MotionCLIP: For semantic motion generation.
- T2M-GPT: For text-conditioned motion synthesis.
- SignCLIP: For bilingual embedding alignment.
Experimental results show that generative approaches have strong potential for sign representation, but the limited size of the dataset constrains how well the models generalize.
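The source does not state which metrics were used. As an illustration only, one common way to probe text-to-sign embedding alignment is top-k retrieval accuracy: for each text embedding, rank all sign embeddings by cosine similarity and check whether the paired sign lands in the top k. The sketch below assumes pairs share the same index; the `top_k_accuracy` helper and the toy 2-D vectors are invented for this example and do not come from SignCLIP or the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_accuracy(text_embs, sign_embs, k=1):
    """Fraction of texts whose paired sign (same index) ranks in the top k."""
    hits = 0
    for i, text_vec in enumerate(text_embs):
        # Rank sign indices from most to least similar to this text.
        ranked = sorted(
            range(len(sign_embs)),
            key=lambda j: cosine(text_vec, sign_embs[j]),
            reverse=True,
        )
        if i in ranked[:k]:
            hits += 1
    return hits / len(text_embs)


# Toy embeddings, invented for illustration; index i of each list is a pair.
text_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sign_embs = [[0.9, 0.1], [0.7, 0.7], [0.2, 0.8]]

top1 = top_k_accuracy(text_embs, sign_embs, k=1)
top2 = top_k_accuracy(text_embs, sign_embs, k=2)
print(top1, top2)
```

In this toy case only the first pair is retrieved at rank 1, while all three pairs fall within the top 2. Computing such a metric separately on training and held-out signs is one way to make a generalization gap visible.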
Conclusion and Future Directions
We intend to release SyriSign publicly, in the hope that it serves as a resource for researchers and helps bridge the communication gap for the DHH community in Syria. By making the dataset available, we aim to encourage further research in this area and to foster greater accessibility and inclusion for people who rely on sign language.
