Open Speech and Language Resources



Puebla-Nahuatl

Identifier: SLR92

Summary: Puebla Nahuatl Speech with Transcription

Category: Speech

License: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)

Downloads (use a mirror closer to you):
Puebla-Nahuatl-Manifest.tgz [874M]   (Manifest File of Puebla Nahuatl (e.g., Transcription, Metadata, and Field Recordings) )   Mirrors: [US]   [EU]   [CN]  
Sound-Files-Puebla-Nahuatl.tgz.part00 [8.4G]   (Part 00 of the speech archive )   Mirrors: [US]   [EU]   [CN]  
Sound-Files-Puebla-Nahuatl.tgz.part01 [8.4G]   (Part 01 of the speech archive )   Mirrors: [US]   [EU]   [CN]  
Sound-Files-Puebla-Nahuatl.tgz.part02 [8.4G]   (Part 02 of the speech archive )   Mirrors: [US]   [EU]   [CN]  
Sound-Files-Puebla-Nahuatl.tgz.part03 [8.4G]   (Part 03 of the speech archive )   Mirrors: [US]   [EU]   [CN]  
Sound-Files-Puebla-Nahuatl.tgz.part04 [8.4G]   (Part 04 of the speech archive )   Mirrors: [US]   [EU]   [CN]  
Sound-Files-Puebla-Nahuatl.tgz.part05 [8.4G]   (Part 05 of the speech archive )   Mirrors: [US]   [EU]   [CN]  
Sound-Files-Puebla-Nahuatl.tgz.part06 [8.4G]   (Part 06 of the speech archive )   Mirrors: [US]   [EU]   [CN]  
Sound-Files-Puebla-Nahuatl.tgz.part07 [8.4G]   (Part 07 of the speech archive )   Mirrors: [US]   [EU]   [CN]  
Sound-Files-Puebla-Nahuatl.tgz.part08 [8.4G]   (Part 08 of the speech archive )   Mirrors: [US]   [EU]   [CN]  
Sound-Files-Puebla-Nahuatl.tgz.part09 [8.4G]   (Part 09 of the speech archive )   Mirrors: [US]   [EU]   [CN]  
SpeechTranslation_Nahuatl_Manifest.tgz [4.2M]   (Transcription and corresponding Spanish translation )   Mirrors: [US]   [EU]   [CN]  
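
The speech archive above is distributed as ten files, part00 through part09. The page does not say how they were produced. The sketch below assumes they are plain byte-wise slices of a single Sound-Files-Puebla-Nahuatl.tgz (as a tool like `split` would create) and that joining them in numeric order restores the original archive. The function name `join_parts` and its parameters are hypothetical, chosen only for illustration.

```python
import shutil
from pathlib import Path


def join_parts(part_dir, prefix="Sound-Files-Puebla-Nahuatl.tgz", out_path=None):
    """Concatenate <prefix>.partNN files in order into a single archive.

    Assumption (not stated on the page): the parts are raw byte slices of
    one .tgz file, so joining them in sorted order rebuilds that file.
    """
    part_dir = Path(part_dir)
    # Zero-padded suffixes (part00 ... part09) sort correctly as strings.
    parts = sorted(part_dir.glob(prefix + ".part*"))
    if not parts:
        raise FileNotFoundError(f"no {prefix}.part* files found in {part_dir}")

    out_path = Path(out_path) if out_path else part_dir / prefix
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream in chunks so the 8.4G parts never sit fully in memory.
                shutil.copyfileobj(src, out)
    return out_path


# Hypothetical usage, after downloading all ten parts into ./downloads:
#   archive = join_parts("downloads")
#   import tarfile
#   with tarfile.open(archive, "r:gz") as tar:
#       tar.extractall("Sound-Files-Puebla-Nahuatl")
```

If the assumption holds, the rebuilt archive will be roughly 84G, and the extracted audio needs additional space on top of that, so check free disk space before joining.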

About this resource:

The substantive material of this deposit was gathered over ten years by Jonathan D. Amith (PI) and a team of native-speaker colleagues who have participated in the project for many years, one of them since its inception in 2009.

The following grants supported research that produced the primary material deposited here:
NSF, Documenting Endangered Languages (Award #BCS-1401178), A Biological Approach to Documenting Traditional Ecological Knowledge in Synchronic and Diachronic Perspectives
NEH, Preservation and Access (Award #PD-50031-14), A Biological Approach to Documenting Traditional Ecological Knowledge in Synchronic and Diachronic Perspectives
Comisión Nacional para el Conocimiento y Uso de la Biodiversidad (Award ME010), Floristics, Biodiversity, and Traditional Ecological Knowledge in the Sierra Nororiental of Puebla, Mexico
Endangered Language Documentation Programme, School of Oriental and African Studies (Award MDP0272), Documentation of Nahuat Knowledge of Natural History, Material Culture, and Ecology in the Municipality of Cuetzalan, Puebla.
NSF, Documenting Endangered Languages (Award #0756536, $291,798), Nahuatl Language Documentation Project: Sierra Norte de Puebla

All material is made available under the Creative Commons license CC BY-SA (Attribution-ShareAlike). Please cite or use any material as follows (corresponding author is Jonathan D. Amith (jonamith@gmail.com)).

Amith, Jonathan D., Amelia Domínguez Alcántara, Hermelindo Salazar Osollo, Ceferino Salgado Castañeda, and Eleuterio Gorostiza Salazar, n.d., Audio corpus of Sierra Nororiental and Sierra Norte de Puebla Nahuat(l) with accompanying time-code transcriptions in ELAN.

For baseline results and the corresponding speech recognition/translation corpora, please cite as follows (corresponding authors are Jonathan D. Amith (jonamith@gmail.com) and Jiatong Shi (jiatongs@andrew.cmu.edu)):

@inproceedings{shi2021highland,
  title={Highland Puebla Nahuatl speech translation corpus for endangered language documentation},
  author={Shi, Jiatong and Amith, Jonathan D and Chang, Xuankai and Dalmia, Siddharth and Yan, Brian and Watanabe, Shinji},
  booktitle={Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas},
  pages={53--63},
  year={2021}
}