Capture the full linguistic depth of Iran with our Iran Speech Dataset collection. Spanning Persian (Farsi), Azerbaijani, Kurdish, Gilaki, Mazandarani, Luri, Balochi, and Turkmen across urban centers, regional provinces, and diaspora communities, these datasets are built for teams developing voice AI that reflects Iran’s rich multilingual and multi-dialectal reality.
Each recording is sourced from native speakers across varied acoustic environments, from Tehran’s bustling metropolitan soundscape and Caspian coastal communities to Kurdish highlands, desert cities, and spontaneous conversational settings. Meticulously annotated with dialect markers, Persian-script transcriptions, phonetic alignments, and speaker demographics, our Iranian collections are engineered for the phonological richness and script complexity that high-performance Persian voice AI demands.




