The Waray Speech Dataset is a collection of authentic audio recordings from native Waray speakers across the Eastern Visayas region of the Philippines. It contains 107 hours of professionally recorded Waray speech, annotated and organized for machine learning tasks.

Waray is spoken by over 3 million people in Samar and Leyte and carries a distinct cultural identity. The recordings capture the phonetic characteristics of the language that speech recognition and language processing systems need to model. The dataset is available in MP3/WAV format with consistent audio quality and is designed for AI researchers building voice applications for Philippine regional languages.

Dataset General Info

Parameter	Details
Size	107 hours
Format	MP3/WAV
Tasks	Speech recognition, AI training, voice assistant development, natural language processing, acoustic modeling, speaker identification
File size	298 MB
Number of files	682 files
Gender of speakers	Female: 46%, Male: 54%
Age of speakers	18-30 years: 29%, 31-40 years: 26%, 40-50 years: 18%, 50+ years: 27%
Countries	Philippines (Eastern Visayas – Samar, Leyte)
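
Once the package is downloaded, the file count and total duration listed above can be checked against the local copy. The following is a minimal sketch, not part of the dataset tooling: it uses only the Python standard library, the function names are illustrative, and it assumes the WAV files are uncompressed PCM. MP3 files are only counted, since the standard library cannot decode them.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def summarize(root):
    """Count audio files under root and total the duration of the WAV files."""
    root = Path(root)
    wavs = sorted(root.rglob("*.wav"))
    mp3s = sorted(root.rglob("*.mp3"))
    total_seconds = sum(wav_duration_seconds(p) for p in wavs)
    return {
        "wav_files": len(wavs),
        "mp3_files": len(mp3s),
        "wav_hours": total_seconds / 3600,
    }
```

Calling `summarize("waray_dataset")`, where `waray_dataset` is a hypothetical extraction directory, returns a dictionary that can be compared with the "Number of files" and "Size" rows of the table.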

Use Cases

Regional Identity Services: Government agencies in Eastern Visayas can use the Waray Speech Dataset to develop voice-enabled regional services. Such services support Waray linguistic identity and make public services accessible to people in Samar and Leyte.

Cultural Documentation: Cultural organizations can use this dataset to build archives of Waray oral traditions and cultural practices, helping preserve Waray heritage for Eastern Visayas communities.

Disaster Resilience: Organizations can use this dataset to build emergency communication systems in Waray that deliver critical information during disasters, which is crucial for the typhoon-prone Eastern Visayas region.

FAQ

Q: What is included in the Waray Speech Dataset?

A: The dataset includes 107 hours of audio recordings from native Waray speakers. It contains 682 files in MP3/WAV format, totaling approximately 298 MB, along with transcriptions, speaker demographics, and linguistic annotations.

Q: Why is Waray speech technology important?

A: Waray is the language of a significant linguistic community, with over 3 million speakers in Samar and Leyte. Speech technology enables voice interfaces that serve this population, supports linguistic rights and cultural preservation, and makes technology accessible in speakers' native language.

Q: How diverse is the speaker demographic?

A: The dataset features 46% female and 54% male speakers. The age distribution is 29% (18-30), 26% (31-40), 18% (40-50), and 27% (50+).

How to Use the Speech Dataset

Step 1: Dataset Acquisition – Download the dataset package from the provided link.

Step 2: Extract and Organize – Extract the package to local storage and review the folder structure.

Step 3: Environment Setup – Install required ML framework dependencies and audio processing libraries.

Step 4: Data Preprocessing – Load audio files and apply preprocessing steps like resampling and feature extraction.

Step 5: Model Training – Split into training/validation/test sets and train your model.

Step 6: Evaluation and Fine-tuning – Evaluate performance and iterate on architecture.

Step 7: Deployment – Export and integrate your trained model into production.
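
For Step 5, one common precaution in speech work is to keep every speaker's recordings inside a single split, so that evaluation measures performance on unseen voices. The sketch below is one way to do that, under stated assumptions: the `speaker_id` values and the `(speaker_id, file_path)` record layout are hypothetical, since the dataset's metadata format is not described here, and the 80/10/10 proportions are illustrative defaults.

```python
import hashlib


def assign_split(speaker_id, val_pct=10, test_pct=10):
    """Map a speaker ID to 'train', 'validation', or 'test' deterministically."""
    # Hash the ID so the assignment is stable across runs and machines.
    digest = hashlib.sha256(speaker_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + val_pct:
        return "validation"
    return "train"


def split_by_speaker(records, val_pct=10, test_pct=10):
    """Group (speaker_id, file_path) records so no speaker spans two splits."""
    splits = {"train": [], "validation": [], "test": []}
    for speaker_id, path in records:
        splits[assign_split(speaker_id, val_pct, test_pct)].append(path)
    return splits
```

Because the assignment depends only on the speaker ID, rerunning the split after adding new files leaves existing recordings in their original split. With a small number of speakers the realized proportions can differ noticeably from the requested percentages, so it is worth checking the split sizes before training.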

For detailed documentation, refer to the included guides.
