  Download the corpus for offline use

The SOAP corpus contains 100 million words from 22,000 transcripts of American soap operas from the early 2000s, and it serves as a great resource for studying very informal language.

The corpus is related to many other corpora of English that we have created. These corpora were formerly known as the "BYU Corpora", and they offer unparalleled insight into variation in English.

Click on any of the links in the search form on the search page for context-sensitive help, and to see the range of queries that the corpus offers. You might pay special attention to virtual corpora, which allow you to create personalized collections of texts related to a particular area of interest.