This is a mirror of the Santa Barbara Corpus of Spoken American English (SBCSAE), a corpus compiled by researchers in the Linguistics Department of the University of California, Santa Barbara.

SBCSAE consists of 60 stereo recordings, each roughly 20 minutes long, of naturally occurring spoken interactions recorded across the United States. The recordings reflect a wide variety of speaker demographics and conversational dynamics and are released at a sampling rate of 22.05 kHz. The corpus includes transcriptions with time marks at the level of individual intonation units, as well as additional metadata about the participants and recording context.
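
Because the time marks are aligned to the audio, one intonation unit can be cut out of a recording by converting its start and end times to frame offsets. The following is a minimal sketch, assuming the time marks are given in seconds; the function name `time_span_to_frames` is hypothetical and not part of the corpus or any library.

```python
SAMPLE_RATE_HZ = 22_050  # sampling rate stated in the corpus description


def time_span_to_frames(start_s: float, end_s: float,
                        sample_rate: int = SAMPLE_RATE_HZ) -> tuple[int, int]:
    """Convert a (start, end) time span in seconds to audio frame offsets.

    Assumption: time marks are expressed in seconds. For stereo audio,
    one frame holds one sample per channel, so the offsets are the same
    for both channels.
    """
    if start_s < 0 or end_s < start_s:
        raise ValueError("expected 0 <= start_s <= end_s")
    return round(start_s * sample_rate), round(end_s * sample_rate)


# Hypothetical intonation unit spanning 1.5 s to 2.0 s of a recording.
first_frame, end_frame = time_span_to_frames(1.5, 2.0)
```

The returned pair can be used as a half-open slice `[first_frame:end_frame]` on an array of decoded audio frames.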

When using this corpus, please cite the following:

@misc{dubois_2005,
  author={Du Bois, John W. and Chafe, Wallace L. and Meyer, Charles and Thompson, Sandra A. and Englebretson, Robert and Martey, Nii},
  year={2000--2005},
  title={{S}anta {B}arbara corpus of spoken {A}merican {E}nglish, {P}arts 1--4},
  address={Philadelphia},
  organization={Linguistic Data Consortium},
}