This is a list of words in Spanish with frequency counts.
This data was derived from the LDC Spanish Gigaword Corpus (LDC2011T12). The list is used as part of the Kaldi Spanish Fisher recipe to augment the pronunciation lexicon with additional words. The pronunciations for these words are generated using the Spanish rule-based lexicon (LDC96L16).
NOTE: No components of the LDC datasets LDC2011T12 and LDC96L16 are included with this dataset.
Details of how this word list is used can be found in this paper: http://cs.jhu.edu/~gkumar/papers/kumar2014some.pdf
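
As a rough illustration of the lexicon augmentation described above, the sketch below picks the most frequent words from a frequency list that are not already in an existing lexicon. It is not the Kaldi recipe's actual code: the function name `select_new_words`, the `top_n` parameter, and the toy data are all hypothetical, and the on-disk format of the word list is deliberately left out, so the counts are passed in as a plain dictionary.

```python
def select_new_words(word_counts, lexicon_words, top_n=3):
    """Return the top_n most frequent words not already in the lexicon.

    word_counts: dict mapping word -> frequency count (assumed input shape)
    lexicon_words: set of words that already have pronunciations
    """
    # Keep only words that the existing lexicon does not cover
    candidates = {w: c for w, c in word_counts.items() if w not in lexicon_words}
    # Sort by descending count; break ties alphabetically so output is deterministic
    ranked = sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:top_n]]


# Toy data invented for illustration; not taken from the actual word list
word_counts = {"de": 1000, "la": 900, "casa": 50, "perro": 40, "gato": 40}
lexicon_words = {"de", "la"}

new_words = select_new_words(word_counts, lexicon_words, top_n=2)
print(new_words)  # ['casa', 'gato']
```

The selected words would then need pronunciations from a separate source, such as a rule-based lexicon, before being added to the pronunciation dictionary.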