Unlock the linguistic richness of Latin America with our comprehensive speech dataset collection. Spanning 18 countries and 12 regional Spanish dialects, plus Portuguese, indigenous languages, and creole variants, these datasets are built for teams training voice models that reflect the region’s diversity.

Each dataset is professionally recorded and rigorously annotated, covering a wide range of acoustic environments, speaker demographics, age groups, and speaking styles. Whether you’re building ASR systems, TTS engines, speaker identification tools, or conversational AI, our Latin American collections deliver the depth and variety your model demands.

Latin America Speech Datasets

19 December 2025

Spanish Speech Dataset

12 December 2025

Portuguese Speech Dataset

12 December 2025

Aymara Speech Dataset

12 December 2025

Quechua Speech Dataset

12 December 2025

Guaraní Speech Dataset

