Embrace Morocco’s extraordinary linguistic complexity with our Morocco Speech Dataset collection. Spanning Darija (Moroccan Arabic), Modern Standard Arabic, Tamazight (Berber), French, and Spanish across Morocco’s diverse regions, from the Rif and Atlas mountains to Atlantic coastal cities and Saharan communities, these datasets are built for teams developing voice AI that reflects Morocco’s uniquely multilingual, code-switching reality.

Recordings are sourced from native speakers across varied acoustic environments, from Casablanca’s cosmopolitan streets and Marrakech’s vibrant medinas to rural Amazigh villages, northern Spanish-influenced communities, and broadcast media settings. Meticulously annotated with language and dialect tags, Arabic script transcriptions, Tifinagh markers for Tamazight, and rich speaker demographics, our Moroccan collections are engineered for the phonological complexity and multilingual fluidity that authentic Moroccan voice AI demands.