YMIR: Benchmark Dataset & CNN Model for Yemeni Music


YMIR: A New Benchmark Dataset and Model for Arabic Yemeni Music Genre Classification Using Convolutional Neural Networks

Source: arXiv:2604.05011v1 (cross-listed)

Automatic music genre classification is a core task in music information retrieval. However, most existing benchmarks and models cater to Western music, leaving culturally specific traditions, such as Yemeni music, underrepresented. To address this gap, the paper introduces the Yemeni Music Information Retrieval (YMIR) dataset.

About the YMIR Dataset

The YMIR dataset consists of 1,475 meticulously selected audio clips that encompass five traditional Yemeni genres:

  • Sanaani
  • Hadhrami
  • Lahji
  • Tihami
  • Adeni

Each audio clip was labeled by five Yemeni music experts following a clear, structured annotation protocol. Inter-annotator agreement was strong, with a Fleiss’ kappa of 0.85, which supports the dataset’s reliability and cultural authenticity.
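
For readers unfamiliar with the statistic, the sketch below shows how Fleiss’ kappa is computed from a table of rating counts. It is a minimal illustration, not the authors’ code: the function name `fleiss_kappa` and the toy ratings are assumptions, and the YMIR annotations themselves are not reproduced here.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j.

    Every item must be rated by the same number of raters (at least two).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Observed agreement: share of rater pairs that agree, averaged over items.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_chance = sum(p * p for p in proportions)

    # Undefined (division by zero) if every rating falls in one category.
    return (p_observed - p_chance) / (1.0 - p_chance)


# Toy data (hypothetical): 4 clips, 5 raters, 5 genre columns in the order
# Sanaani, Hadhrami, Lahji, Tihami, Adeni.
toy_counts = [
    [5, 0, 0, 0, 0],
    [0, 4, 1, 0, 0],
    [0, 0, 5, 0, 0],
    [0, 0, 0, 1, 4],
]
print(round(fleiss_kappa(toy_counts), 3))
```

A value of 1 means the raters always agree, and a value near 0 means agreement is no better than chance.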

The Yemeni Music Classification Model (YMCM)

Alongside the dataset, the study proposes the Yemeni Music Classification Model (YMCM), a convolutional neural network (CNN) designed to classify music genres from time-frequency features. A systematic preprocessing pipeline was applied across all experiments to keep the inputs consistent and the results reliable.

Experimental Setup

The study compares six experimental groups across five architectures, for a total of 30 experiments. The feature representations evaluated include:

  • Mel-spectrograms
  • Chroma
  • FilterBank
  • Mel-frequency cepstral coefficients (MFCCs) with 13, 20, and 40 coefficients
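
To make the relationship between these features concrete, the sketch below shows the mel scale and how MFCCs with 13, 20, or 40 coefficients can be obtained by truncating a discrete cosine transform of log-mel energies. It is a simplified illustration under stated assumptions, not the paper’s pipeline: the HTK-style mel formula, the unnormalised DCT-II, the frequency range, and all function names are choices made here, and the input frame is synthetic.

```python
import math


def hz_to_mel(f_hz):
    # HTK-style mel formula (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_mels, f_min, f_max):
    """Centre frequencies (Hz) of n_mels bands spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_mels)]


def mfcc_from_log_mel(log_mel_frame, n_mfcc):
    """Unnormalised DCT-II of one frame of log-mel energies,
    keeping only the first n_mfcc coefficients."""
    n = len(log_mel_frame)
    if n_mfcc > n:
        raise ValueError("n_mfcc cannot exceed the number of mel bands")
    return [
        sum(
            x * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
            for i, x in enumerate(log_mel_frame)
        )
        for k in range(n_mfcc)
    ]


# Synthetic frame of 40 log-mel energies (illustrative values only).
frame = [math.log(1.0 + i) for i in range(40)]
for n_mfcc in (13, 20, 40):
    print(n_mfcc, len(mfcc_from_log_mel(frame, n_mfcc)))

# Mel bands are narrow at low frequencies and widen toward high frequencies.
print([round(f) for f in mel_center_frequencies(5, 0.0, 8000.0)])
```

In this formulation, the 13- and 20-coefficient variants are prefixes of the 40-coefficient one, so the three MFCC settings differ only in how much spectral detail they retain.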

Additionally, the YMCM was benchmarked against AlexNet, VGG16, MobileNet, and a baseline CNN, all under identical experimental conditions. This setup allows a like-for-like comparison of model performance across architectures and feature sets.
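
The experimental grid can be enumerated in a few lines. The sketch below assumes that each of the six experimental groups corresponds to one feature representation, which matches the count of six listed features but is an inference rather than something stated outright. The identifier strings and the function name `build_experiment_grid` are hypothetical.

```python
from itertools import product

# Six feature representations (assumed to define the six experimental groups).
FEATURES = [
    "mel_spectrogram",
    "chroma",
    "filterbank",
    "mfcc_13",
    "mfcc_20",
    "mfcc_40",
]

# Five architectures: the proposed model plus four comparison models.
ARCHITECTURES = ["YMCM", "AlexNet", "VGG16", "MobileNet", "baseline_cnn"]


def build_experiment_grid(features=FEATURES, architectures=ARCHITECTURES):
    """Return one configuration per (feature, architecture) pair."""
    return [
        {"feature": feature, "architecture": architecture}
        for feature, architecture in product(features, architectures)
    ]


grid = build_experiment_grid()
print(len(grid))  # 30
```

Holding preprocessing and training settings fixed across every entry in such a grid is what makes the per-feature and per-architecture comparisons meaningful.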

Key Findings

The experiments showed that the Yemeni Music Classification Model (YMCM) was the most effective model, reaching 98.8% accuracy with Mel-spectrogram features. The results also shed light on how feature representation interacts with model capacity in the classification of Yemeni music genres.
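
Accuracy here means the fraction of clips whose predicted genre matches the expert label. The sketch below shows that calculation alongside a simple confusion count, which reveals which genres are mistaken for one another. The labels in the example are made up for illustration and are not results from the paper; the function names are likewise hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_counts(y_true, y_pred):
    """Count how often each (true genre, predicted genre) pair occurs."""
    return Counter(zip(y_true, y_pred))


# Made-up labels for four clips (illustration only).
y_true = ["Sanaani", "Lahji", "Adeni", "Adeni"]
y_pred = ["Sanaani", "Lahji", "Adeni", "Tihami"]

print(accuracy(y_true, y_pred))  # 0.75
print(confusion_counts(y_true, y_pred)[("Adeni", "Tihami")])  # 1
```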

Conclusion

The findings from this research establish the YMIR dataset as a vital benchmark and the YMCM as a strong baseline for the classification of Yemeni music genres. This initiative not only bridges the gap in music information retrieval for culturally specific traditions but also opens avenues for further research and exploration in the field.



Lazarus Omolua
