The corpora at this site were created by Mark Davies, retired Professor of Linguistics. They were formerly known as the BYU Corpora, and they are probably the most widely used corpora currently available.

The corpora have many different uses, including:

  • finding out how native speakers actually speak and write

  • finding the frequency of words, phrases, and collocates

  • looking at language variation and change, e.g. across historical periods, dialects, and genres

  • gaining insight into culture, for example what is said about different concepts over time and in different countries

  • designing authentic language teaching materials and resources

In addition to the 17 corpora and the Google Books (Advanced) interface, there are also many corpus-based resources. These allow you to: