This dataset contains 10,083 utterances in French, Maninka, Pular and Susu, recorded on a variety of devices by 49 speakers (16 female and 33 male) aged 5 to 76.
Please see our paper for more details on this dataset. Additional resources can be found in the following git repository: https://github.com/mdoumbouya/nicolingua
You can cite our work using the following BibTeX entry.
@inproceedings{doumbouya2021usingradio,
  title={Using Radio Archives for Low-Resource Speech Recognition: Towards an Intelligent Virtual Assistant for Illiterate Users},
  author={Doumbouya, Moussa and Einstein, Lisa and Piech, Chris},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={35},
  year={2021}
}