YMIR: Benchmark Dataset & CNN Model for Yemeni Music

YMIR: A New Benchmark Dataset and Model for Arabic Yemeni Music Genre Classification Using Convolutional Neural Networks

Source: arXiv:2604.05011v1 (announce type: cross)

Automatic music genre classification is a core task in music information retrieval. However, most existing benchmarks and models target Western music, leaving culturally specific traditions such as Yemeni music underrepresented. To address this gap, the authors introduce the Yemeni Music Information Retrieval (YMIR) dataset.

About the YMIR Dataset

The YMIR dataset consists of 1,475 carefully selected audio clips covering five traditional Yemeni genres:

Sanaani
Hadhrami
Lahji
Tihami
Adeni

Each audio clip was labeled by five Yemeni music experts following a clear, structured protocol. The annotators agreed strongly, with a Fleiss' kappa of 0.85, which supports the dataset’s reliability and cultural authenticity.
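To make the agreement figure concrete, here is a minimal sketch of how Fleiss' kappa is computed when every clip receives the same number of labels. The function name and the four-clip toy ratings are invented for illustration; they are not taken from the YMIR annotations, and the paper's own tooling is not described in this summary.

```python
from collections import Counter

GENRES = ["Sanaani", "Hadhrami", "Lahji", "Tihami", "Adeni"]


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of per-clip label lists (one label per rater)."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(r) != n_raters for r in ratings):
        raise ValueError("every clip needs the same number of ratings")

    counts = [Counter(r) for r in ratings]

    # Observed agreement: for each clip, the share of rater pairs that agree.
    per_clip = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_clip) / n_items

    # Chance agreement: based on how often each genre is used overall.
    shares = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_chance = sum(s ** 2 for s in shares)

    if p_chance == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Toy example (made-up labels): five raters, four clips, one disagreement.
toy_ratings = [
    ["Sanaani"] * 5,
    ["Hadhrami"] * 5,
    ["Lahji"] * 4 + ["Adeni"],
    ["Tihami"] * 5,
]
toy_kappa = fleiss_kappa(toy_ratings, GENRES)
```

Values close to 1 mean the raters agree far more often than chance would predict, which is the sense in which 0.85 counts as strong agreement.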

The Yemeni Music Classification Model (YMCM)

Alongside the dataset, the study proposes the Yemeni Music Classification Model (YMCM), a convolutional neural network (CNN) designed to classify music genres from time-frequency features. For consistency, the same systematic preprocessing pipeline was applied across all experiments.
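This summary does not describe the YMCM layers, so the following is not the authors' architecture. It is only a minimal pure-Python sketch of the basic operation any CNN applies to a time-frequency input: slide a small kernel over the grid, apply a nonlinearity, and pool. The toy spectrogram and the hand-set kernel are hypothetical; in a trained network the kernel weights are learned.

```python
def conv2d_valid(image, kernel):
    """Cross-correlate a 2D list with a smaller 2D kernel (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Zero out negative responses."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def global_average_pool(feature_map):
    """Collapse a feature map to a single number."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


# Toy "spectrogram": rows are frequency bins, columns are time frames.
# Energy switches on at the third frame in every bin.
toy_spectrogram = [
    [0.0, 0.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0, 1.0],
]

# Hypothetical kernel that responds to energy rising from one frame to the next.
onset_kernel = [
    [-1.0, 1.0],
    [-1.0, 1.0],
]

feature_map = relu(conv2d_valid(toy_spectrogram, onset_kernel))
onset_score = global_average_pool(feature_map)
```

The feature map is nonzero only where the kernel straddles the onset, which is the kind of localized time-frequency pattern a CNN stacks and combines to separate genres.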

Experimental Setup

The study compared six experimental groups across five architectures, for a total of 30 experiments (6 × 5). The following feature representations were evaluated:

Mel-spectrograms
Chroma
FilterBank
Mel-frequency cepstral coefficients (MFCCs) with 13, 20, and 40 coefficients
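The mel-based representations in this list share one building block: a bank of triangular filters spaced evenly on the mel scale, applied to each frame's power spectrum. The sketch below shows that step in plain Python. It assumes the common 2595 · log10(1 + f/700) mel formula and unnormalized triangles; the summary does not say which variant or which parameters the authors used, so the values of `n_mels`, `n_fft`, and `sample_rate` here are placeholders.

```python
import math


def hz_to_mel(hz):
    """Convert frequency in Hz to mels (assumed formula, see lead-in)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Return n_mels triangular filters, each over n_fft // 2 + 1 FFT bins."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2)
    # n_mels + 2 edge frequencies, equally spaced on the mel scale.
    edges_hz = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]

    bank = []
    for m in range(n_mels):
        low, center, high = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        row = []
        for f in bin_hz:
            rising = (f - low) / (center - low)
            falling = (high - f) / (high - center)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def apply_filterbank(power_spectrum, bank):
    """Weighted sum of one frame's power spectrum under each filter."""
    return [sum(w * p for w, p in zip(row, power_spectrum)) for row in bank]


# Placeholder parameters, not the paper's settings.
bank = mel_filterbank(n_mels=40, n_fft=512, sample_rate=22050)
flat_frame = [1.0] * (512 // 2 + 1)
mel_energies = apply_filterbank(flat_frame, bank)
```

Stacking these per-frame energies over time and taking the logarithm gives a Mel-spectrogram; MFCCs are conventionally obtained by applying a discrete cosine transform to the log energies and keeping the first 13, 20, or 40 coefficients.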

YMCM was benchmarked against AlexNet, VGG16, MobileNet, and a baseline CNN under identical experimental conditions, so that differences in the results can be attributed to the architecture and the feature set rather than to the setup.
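The count of 30 follows from crossing the six listed feature representations with the five models. A small sketch of that grid is shown below; it assumes that the "six experimental groups" correspond one-to-one to the six feature representations, which the summary implies but does not state outright. The short labels are shorthand chosen here.

```python
from itertools import product

# Six feature representations named in the article (labels are shorthand).
FEATURES = ["Mel-spectrogram", "Chroma", "FilterBank", "MFCC-13", "MFCC-20", "MFCC-40"]

# Five architectures named in the article.
MODELS = ["YMCM", "AlexNet", "VGG16", "MobileNet", "Baseline CNN"]

# One experiment per (feature, model) pair.
experiments = [
    {"id": i, "feature": feature, "model": model}
    for i, (feature, model) in enumerate(product(FEATURES, MODELS), start=1)
]
```

Enumerating the grid up front makes it easy to check that every model sees every feature set exactly once.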

Key Findings

YMCM achieved the highest accuracy of all models tested: 98.8% with Mel-spectrogram features. The results also show how feature representation and model capacity interact, which informs future work on genre classification for Yemeni music.
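For readers less familiar with the metric, accuracy is simply the fraction of clips whose predicted genre matches the expert label. The sketch below computes it, along with per-genre recall, on ten made-up predictions. The labels are invented for illustration and have no connection to the reported 98.8%.

```python
GENRES = ["Sanaani", "Hadhrami", "Lahji", "Tihami", "Adeni"]


def accuracy(y_true, y_pred):
    """Fraction of clips whose predicted genre equals the true genre."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_genre_recall(y_true, y_pred, labels):
    """For each genre, the fraction of its clips that were predicted correctly."""
    recall = {}
    for label in labels:
        predictions = [p for t, p in zip(y_true, y_pred) if t == label]
        if predictions:  # skip genres with no clips in this sample
            recall[label] = predictions.count(label) / len(predictions)
    return recall


# Made-up labels: two clips per genre, one Hadhrami clip mistaken for Lahji.
toy_true = ["Sanaani", "Sanaani", "Hadhrami", "Hadhrami", "Lahji",
            "Lahji", "Tihami", "Tihami", "Adeni", "Adeni"]
toy_pred = ["Sanaani", "Sanaani", "Hadhrami", "Lahji", "Lahji",
            "Lahji", "Tihami", "Tihami", "Adeni", "Adeni"]

toy_accuracy = accuracy(toy_true, toy_pred)
toy_recall = per_genre_recall(toy_true, toy_pred, GENRES)
```

Per-genre recall is worth reporting next to overall accuracy because a single headline number can hide a genre that is systematically confused with another.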

Conclusion

The study establishes the YMIR dataset as a benchmark and YMCM as a strong baseline for classifying Yemeni music genres. It helps close the gap in music information retrieval for culturally specific traditions and gives later work a reference point to build on.
