The Emotional Voices Database: Towards Controlling the Emotional Expressiveness in Voice Generation Systems
Landing page: https://github.com/numediart/EmoV-DB
A description of the database is available here: https://arxiv.org/pdf/1806.09514.pdf
This dataset was built for the purpose of emotional speech synthesis. The transcripts are based on the CMU Arctic database: http://www.festvox.org/cmu_arctic/cmuarctic.data.
It includes recordings from four speakers: two male and two female.
The five emotional styles are Neutral, Amused, Angry, Sleepy, and Disgust.
Each audio file is recorded in 16-bit .wav format.
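As a quick sanity check, a recording can be inspected with Python's standard wave module; a 16-bit file reports a sample width of 2 bytes. The path below is a hypothetical example and should be replaced with the location of your local copy.

```python
import wave

# Hypothetical path to one EmoV-DB recording; adjust to your local copy.
path = "EmoV-DB/bea/amused_1-15_0001.wav"

with wave.open(path, "rb") as wav:
    print("channels:   ", wav.getnchannels())
    print("sample rate:", wav.getframerate())
    # 16-bit audio corresponds to a sample width of 2 bytes.
    print("bit depth:  ", 8 * wav.getsampwidth())
```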
Spk-Je (Female, English: Neutral(417 files), Amused(222 files), Angry(523 files), Sleepy(466 files), Disgust(189 files))
Spk-Bea (Female, English: Neutral(373 files), Amused(309 files), Angry(317 files), Sleepy(520 files), Disgust(347 files))
Spk-Sa (Male, English: Neutral(493 files), Amused(501 files), Angry(468 files), Sleepy(495 files), Disgust(497 files))
Spk-Jsh (Male, English: Neutral(302 files), Amused(298 files), Sleepy(263 files))
File naming (audio_folder): e.g. anger_1-28_0011.wav. The first word is the emotional style, 1-28 is the annotation document file range, and the last four digits are the sentence number.
File naming (annotation_folder): e.g. anger_1-28.TextGrid. The first word is the emotional style and 1-28 is the annotation document range.
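To illustrate this convention, the sketch below parses an audio file name into its three fields with a regular expression. The helper name and the exact character classes are illustrative assumptions, not part of the dataset's tooling.

```python
import re

# Pattern for audio file names: <emotion>_<doc range>_<sentence number>.wav
AUDIO_NAME = re.compile(r"^(?P<emotion>[a-z]+)_(?P<doc_range>\d+-\d+)_(?P<sentence>\d{4})\.wav$")

def parse_audio_name(name):
    """Split an EmoV-DB audio file name into emotion, annotation doc range, and sentence number."""
    match = AUDIO_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name}")
    return match.group("emotion"), match.group("doc_range"), int(match.group("sentence"))

# Example taken from the naming convention above.
print(parse_audio_name("anger_1-28_0011.wav"))  # ('anger', '1-28', 11)
```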
You can cite the data using the following BibTeX entry:
@article{adigwe2018emotional,
  title={The emotional voices database: Towards controlling the emotion dimension in voice generation systems},
  author={Adigwe, Adaeze and Tits, No{\'e} and Haddad, Kevin El and Ostadabbas, Sarah and Dutoit, Thierry},
  journal={arXiv preprint arXiv:1806.09514},
  year={2018}
}