WOWA — The Word Order in Western Asia Corpus
Description
WOWA (Word Order in Western Asia) is an open-access collection of transcribed and annotated spoken texts from 41 languages spoken across a region loosely referred to as Western Asia. Most texts are spontaneous (i.e. unscripted) narrative monologues such as oral history and traditional tales. The languages selected are generally under-researched, non-standardized minority languages, which reflect the long-term linguistic diversity of the region more faithfully than the currently dominant written official languages (Turkish, Arabic, and Persian).
The collection includes original text sources for all languages, and sound files for a large subset. WOWA was designed to investigate areal effects in word order, in particular in the post-predicate domain. The main results have been published in Haig et al. (2024). For further details and references, please consult the README file included in the archive.
WOWA was funded by the Alexander-von-Humboldt Stiftung (grant number 1135327-IRN-IP), awarded to Geoffrey Haig (Bamberg) and Mohammad Rasekh-Mahand (Hamedan), 2019–2023. The archive was designed and implemented by N. Schiborr.
Archive structure
Each of the 41 data sets in WOWA is an individually citable resource that includes, at a minimum, source texts (PDF, optionally also WAV audio recordings), original annotations (XLS spreadsheets and TSV), a metadata sheet (PDF), and a citation file (TXT). The contents of the archive are split into three parts, each packaged in separate ZIP files:
- 1 × wowa__annotations.zip: annotation and documentation files for all 41 data sets
- 1 × wowa__sources-pdf.zip: source texts for all 41 data sets
- 23 × wowa_[...]__sources-wav.zip: source recordings for the 23 data sets that have them
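After downloading one of the ZIP files, it can be useful to check which file types it actually contains before unpacking it. The following is a minimal sketch using only the Python standard library; the function name `inventory_by_extension` is illustrative, and nothing is assumed about the folder layout inside the archives, which the record does not describe.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def inventory_by_extension(zip_path):
    """Count the members of a ZIP archive by lower-cased file extension.

    Directory entries are skipped; members without an extension are
    counted under the key "(none)".
    """
    counts = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # ZIP member names always use forward slashes.
            suffix = PurePosixPath(info.filename).suffix.lower()
            counts[suffix or "(none)"] += 1
    return counts
```

Calling `inventory_by_extension("wowa__annotations.zip")` on a local copy would be expected to show, among others, `.tsv`, `.pdf`, and `.txt` entries, given the file types listed above; the exact counts depend on the archive contents.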
Citation for the entire WOWA collection
- Haig, Geoffrey & Stilo, Donald & Dogan, Mahîr C. & Schiborr, N. (eds.). 2024. WOWA — Word Order in Western Asia: A spoken-language-based corpus for investigating areal effects in word order variation. Bamberg: University of Bamberg. (DOI: 10.48564/unibafd-gyws0-g4218) (date accessed)
Additionally, each data set in the collection is an individually citable resource with the contributors as authors. Please refer to the citation guides included in the archive with each data set for more information.
List of data sets ([*] = with audio)
- Armenian
  - Armenian (Eastern, Agulis) — Katherine Hodgson [*]
- Hellenic
  - Pontic Greek (Madan) — Katherine Hodgson [*]
  - Pontic Greek (Romeyka) — Laurentia Schreiber
- Indo-Aryan
  - Kholosi (Kholos) — Maryam Nourzaei [*]
- Iranian
  - Balochi (Coastal) — Maryam Nourzaei [*]
  - Balochi (Koroshi) — Maryam Nourzaei [*]
  - Balochi (Turkmen) — Geoffrey Haig [*]
  - Bashkardi (Northern) — Agnes Korn, Ilya Gershevitch [*]
  - Bashkardi (Southern) — Agnes Korn, Ilya Gershevitch [*]
  - Gorani (Gawraju) — Masoud Mohammadirad [*]
  - Kumzari (Musandam) — Geoffrey Haig
  - Kurdish (Central, Sanandaj) — Masoud Mohammadirad [*]
  - Kurdish (Northern, Ankara) — Kateryna Iefremenko [*]
  - Kurdish (Northern, Lachin) — Donald Stilo
  - Kurdish (Northern, Mus) — Geoffrey Haig [*]
  - Kurdish (Southern, Bijar) — Masoud Mohammadirad [*]
  - Mazandarani (Kordxeyl) — Donald Stilo, Geoffrey Haig
  - Persian (New) — Elham Izadi [*]
  - Persian (New, Early Classical) — Mehdi Parizadeh
  - Talyshi (Lerik) — Donald Stilo
  - Tati (Hazarrudi) — Raheleh Izadifar [*]
  - Vafsi (Gurchani) — Mahîr Can Dogan [*]
  - Zazakî (Çewlîg) — Netîce Demir, Mahîr Dogan [*]
  - Zazakî (Siwêreg) — Netîce Demir, Mahîr Dogan [*]
- Kartvelian
  - Laz (Arhavi) — Donald Stilo, René Lacroix
- Semitic
  - Arabic (Jewish, Baghdad) — Assaf Bar-Moshe, Alexandru Craevschi [*]
  - Arabic (Christian, Ka'biye) — Paul Noorlander
  - Arabic (Khuzestan) — Bettina Leitner [*]
  - Central Neo-Aramaic (Mlahso) — Paul Noorlander
  - Central Neo-Aramaic (Turoyo, Midyat) — Paul Noorlander
  - NE Neo-Aramaic (Christian, Barwar) — Donald Stilo
  - NE Neo-Aramaic (Christian, Shaqlawa) — Paul Noorlander
  - NE Neo-Aramaic (Christian, Urmi) — Paul Noorlander
  - NE Neo-Aramaic (Jewish, Dohok) — Dorota Molin [*]
  - NE Neo-Aramaic (Jewish, Sanandaj) — Paul Noorlander
  - NE Neo-Aramaic (Jewish, Urmi) — Paul Noorlander
- Turkic
  - Oghuz (Ankara) — Kateryna Iefremenko [*]
  - Oghuz (Erzurum) — Mahîr Dogan
  - Oghuz (Gagauz) — Mahîr Dogan
  - Oghuz (Qashqai) — Sohrab Dolatkhah, Laurentia Schreiber [*]
  - Oghuz (Tabriz) — Donald Stilo
Files (7.1 GB)
README: wowa__readme.pdf
| MD5 checksum | Size |
|---|---|
| 89cee5962504ed8340ebd4c09966fa4f | 13.7 MB |
| de67a30a16c1753257ddf89532cf8d13 | 110.5 kB |
| 249d97fdb0423cab8862ef174aa9f2af | 155.6 MB |
| 9a2d144f8397fcab8da96d766edd437c | 633.0 MB |
| de0a004860ca01acefe67f42376459d0 | 330.7 MB |
| 89abeef934baf68310f7ca4d96ccefaa | 239.3 MB |
| cb6044392a88ca984cb1e9ee0c9ed820 | 253.4 MB |
| aa67e0f46a35b5fc8905c061c277256e | 322.7 MB |
| a60034170d738220151b0490edbd1748 | 179.0 MB |
| 7c3686b1556d7ce201a65bdb9b4e53b8 | 188.5 MB |
| 4d745d72a5c394b5ccff190a4d877427 | 69.9 MB |
| 5f3cdec9ce2e02e4aaa276e01fd646d9 | 363.0 MB |
| 0a8a5208b4c230b619ba3ff67c7d49af | 121.2 MB |
| 39ea22f3c5dfb54940546c0742c71789 | 300.9 MB |
| 6365cac2fcd9059aca0725397e91f8bf | 149.1 MB |
| 631c5ac6a487ce1629fe5bb53c3a48c1 | 567.4 MB |
| a7e834f23317b8d4c24a16f8d28b45b6 | 603.7 MB |
| fbbe869244af3345221237d6bd30d7f7 | 396.9 MB |
| f49fb99bcd26c92d20267c0a3f8c6fe7 | 336.6 MB |
| 75212e40a6ab9cf6548bda21796900a5 | 198.9 MB |
| 0c4572b8f1efc4ee113376e9e38a16d3 | 131.7 MB |
| 82a3ea866f227b5db34ab8d94aaf096b | 313.6 MB |
| e582008f74038452e8f332e26bff9649 | 670.4 MB |
| 077368be2f129817c01e38c31a9e669b | 212.0 MB |
| ca1fed100b9e388dfe18cb09e0fd5b8e | 281.9 MB |
| 82cb626e6174bc6e9886493cc8271e7e | 113.9 MB |
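
The MD5 checksums listed with each file can be used to confirm that a download completed without corruption. Below is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_checksum` are illustrative and not part of the archive. The file is read in chunks so that the larger audio archives do not need to fit into memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 digest with a listed value.

    The expected value may carry the "md5:" prefix used in the listing.
    """
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_checksum(local_path, listed_value)` returns `True` only if the local file at `local_path` (a placeholder for wherever the download was saved) has the digest shown for that file in the listing. MD5 is adequate here as a check against transfer errors, not as protection against deliberate tampering.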