Navigate the extraordinary linguistic diversity of the Middle East with our Middle East Speech Dataset collection. Spanning Arabic, Persian (Farsi, Dari), Turkish, Hebrew, Kurdish, Pashto, Urdu, and Armenian across more than 20 countries and scores of regional dialects, these datasets are built for teams developing voice AI that reflects the region’s rich, multilingual reality.

Recordings are sourced from native speakers in varied acoustic environments, from ancient bazaars and coastal metropolises to mountainous communities and modern urban centers. The collections are comprehensively annotated with language family tags, dialect markers, script-accurate transcriptions, and speaker demographics, and are engineered for the phonetic complexity and script diversity this region demands.