HamBam — The Hamedan-Bamberg Corpus of Contemporary Spoken Persian
Description
HamBam, the Hamedan-Bamberg Corpus of Contemporary Spoken Persian (Haig & Rasekh-Mahand 2022), is an openly accessible online corpus of contemporary spoken Persian. The design of the corpus follows the architecture and rationale of Multi-CAST (Haig & Schnell 2015), with certain modifications. As in Multi-CAST, the texts are annotated with the free annotation software ELAN, which links sound files to annotation files. The annotated data are available in several formats: sound files, ELAN annotation files, tab-separated values (TSV) files, and XML. This archive contains version 3.0 of the corpus (published in October 2025), which has been edited and expanded with six additional recordings. It fully supersedes all earlier versions.
HamBam at a glance
- number of individual recordings: 44
- total runtime: 166 minutes
- total grammatical words: 20,000
The HamBam team
- Geoffrey Haig
- Mohammad Rasekh-Mahand
- Elham Izadi
- Fariba Sabouri
- Maryam Pouyankhah
- Iran Abdi
- Mehdi Parizadeh
- Mehrdad Meshkinfam
- Laurentia Schreiber
- N. Schiborr
Citation
- Haig, Geoffrey & Rasekh-Mahand, Mohammad. 2022. HamBam: Hamedan-Bamberg Corpus of Contemporary Spoken Persian. Version 3.0. (DOI: 10.48564/unibafd-v80bg-h0243)
Files
hambam_corpus-description.pdf
Total size: 1.1 GB

| MD5 checksum | Size |
|---|---|
| 99609df7e0fdda95ce42be4909b6b39c | 2.6 MB |
| f3cfae6ed48d1f5f76b306cb59f60ae2 | 197.8 kB |
| c6f554ee725ec010db24e4de1152483d | 34.0 kB |
| 9d2b1597742fb822dba53d3fb007955c | 4.6 kB |
| 8db25485206e4402ee56519ecc90f3bd | 153.9 MB |
| e27dc68e868d66decae8b8d7385f39ae | 974.0 MB |
| 09a3f24abbe1894f6131917b8bf53b49 | 321 Bytes |
| ec4f01f6059cb5b84f359732b579a489 | 1.2 kB |
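
The MD5 checksums listed for the files above can be used to confirm that a download arrived intact. The following is a minimal sketch using only the Python standard library. The function names `md5_of_file` and `verify_download` are illustrative, not part of the corpus distribution, and the path in the usage comment is a placeholder, since this listing does not pair file names with checksums.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so that large archives
    (for example the 974.0 MB file) need not fit in memory.
    """
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 digest matches `expected_md5`."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Hypothetical usage: replace the placeholder path with the downloaded file
# and the checksum with the value listed for that file.
# verify_download("path/to/downloaded_file", "99609df7e0fdda95ce42be4909b6b39c")
```

A result of `False` suggests a truncated or corrupted download, in which case fetching the file again is the usual remedy.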