Automatic Speech Recognition for Documenting Endangered Languages: Case Study of Ikema Miyakoan
Language endangerment poses a major challenge to linguistic diversity worldwide, and technological advances have opened new avenues for documentation and revitalization. Among these, automatic speech recognition (ASR) has shown increasing potential to assist in the transcription of endangered language data. This study focuses on Ikema, a severely endangered Ryukyuan language spoken in Okinawa, Japan, with approximately 1,300 remaining speakers, most of whom are over 60 years old.
Research Objectives
This study presents an ongoing effort to develop an ASR system for Ikema based on field recordings. It has three primary objectives:
- Constructing a comprehensive speech corpus from field recordings.
- Training an ASR model to achieve a low character error rate.
- Evaluating the impact of ASR assistance on speech transcription efficiency.
Methodology
To achieve these objectives, the research team undertook the following steps:
- Speech Corpus Construction: A total of {totaldatasethours} hours of speech data were collected from field recordings of native Ikema speakers. This corpus serves as the foundational dataset for training the ASR model.
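- ASR Model Training: The collected speech data were used to train an ASR model, which achieved a character error rate (CER) as low as 15%. CER is the number of character-level substitutions, deletions, and insertions needed to turn the model output into the reference transcript, divided by the number of reference characters, so a CER of 15% corresponds to roughly 15 character errors per 100 reference characters.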
- Efficiency Evaluation: The research team evaluated the impact of ASR assistance on transcription tasks, measuring both the time taken to transcribe speech and the cognitive load experienced by transcribers.
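As an illustration of the character error rate metric reported above, the following is a minimal sketch of how CER is commonly computed from reference and hypothesis transcripts. It is not the study's own evaluation code: the function names `edit_distance` and `character_error_rate` are hypothetical, the strings are placeholders rather than Ikema data, and any text normalization the study may have applied before scoring is not modeled.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def character_error_rate(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level CER: total character edits / total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


if __name__ == "__main__":
    # Placeholder strings for illustration only (not Ikema data).
    refs = ["abcde", "fghij"]
    hyps = ["abxde", "fghi"]
    print(f"CER = {character_error_rate(refs, hyps):.2f}")
```

Summing edits and reference lengths over the whole test set, rather than averaging per-utterance rates, keeps short utterances from dominating the score; which of the two conventions the study used is not stated.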
Findings
The integration of ASR technology into the transcription process of Ikema has yielded promising results. Key findings from the study include:
- The ASR model reached a character error rate as low as 15%, which indicates that ASR can produce usable draft transcriptions of Ikema field recordings.
- Transcription time was significantly reduced when transcribers worked with ASR assistance, freeing researchers to focus on more complex linguistic analysis.
- Cognitive load, measured through qualitative feedback from transcribers, was notably lower with ASR assistance, suggesting that ASR can ease some of the burden of manual transcription.
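One common way to quantify a reduction in transcription time like the one reported above is to compare working time per unit of audio with and without ASR assistance. The sketch below assumes that approach; the study's actual measurement protocol is not described here. The `Session` type, the function names, and every number are hypothetical placeholders, not results from the study.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """One transcription session (hypothetical record layout)."""
    audio_seconds: float  # duration of the audio that was transcribed
    work_seconds: float   # time the transcriber spent on it


def real_time_factor(sessions: list[Session]) -> float:
    """Seconds of transcription work per second of audio, pooled over sessions."""
    audio = sum(s.audio_seconds for s in sessions)
    work = sum(s.work_seconds for s in sessions)
    if audio <= 0:
        raise ValueError("total audio duration must be positive")
    return work / audio


def relative_reduction(baseline: float, assisted: float) -> float:
    """Fraction of the baseline effort saved in the assisted condition."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - assisted) / baseline


if __name__ == "__main__":
    # Illustrative placeholder timings only; these are NOT the study's data.
    manual = [Session(audio_seconds=60, work_seconds=600)]
    assisted = [Session(audio_seconds=60, work_seconds=300)]

    rtf_manual = real_time_factor(manual)
    rtf_assisted = real_time_factor(assisted)
    print(f"manual RTF:   {rtf_manual:.1f}")
    print(f"assisted RTF: {rtf_assisted:.1f}")
    print(f"reduction:    {relative_reduction(rtf_manual, rtf_assisted):.0%}")
```

Normalizing by audio duration makes sessions of different lengths comparable. A real evaluation would also need to control for transcriber experience and recording difficulty.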
Conclusion
This study highlights the role that automatic speech recognition can play in the documentation and revitalization of endangered languages such as Ikema Miyakoan. By speeding up transcription and reducing its cognitive demands, ASR is a viable tool for linguists and researchers working to preserve linguistic diversity. As ASR technology advances, the same approach could be extended to other endangered languages, offering a scalable way to support their documentation and revitalization.
