The corpora at this site were created by Mark Davies, retired Professor of Linguistics. The corpora were formerly known as the BYU Corpora, and they are probably the most widely used corpora currently available.
The corpora have many different uses, including:
finding out how native speakers actually speak and write
finding the frequency of words, phrases, and collocates
looking at language variation and change, e.g. across historical periods, dialects, and genres
gaining insight into culture, for example what is said about different concepts over time and in different countries
designing authentic language teaching materials and resources.
In addition to the sixteen corpora (and the Google Books (Advanced) interface), there are also many corpus-based resources. These allow you to:
See detailed entries for the top 60,000 words in English (definitions, genre variation, collocates, concordance lines, synonyms) -- all on one page
Enter and analyze your own text, find keywords from your text, compare phrases to COCA, and see detailed information (see above) for each word
Get detailed information from the Academic Vocabulary List (including detailed information on each word and the ability to analyze your own academic texts)
Download large amounts of corpus-based data, including word frequency, collocates, and n-grams
Download the entire corpus for offline use (iWeb, COCA, COHA, GloWbE, NOW, SOAP, TV, Movies, Wikipedia, Spanish)
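To give a rough sense of what word frequency, collocate, and n-gram data look like, here is a minimal Python sketch that computes them over a short sample string. It does not reflect how the corpora or their downloadable data are actually built or formatted; the sample sentence and the function names (tokenize, ngram_counts, collocates) are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed scheme)."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(zip(*(tokens[i:] for i in range(n))))


def collocates(tokens, node, window=4):
    """Count words occurring within `window` tokens on either side of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            lo = max(0, i - window)
            hi = i + window + 1
            # Skip position i itself so the node word is not its own collocate.
            counts.update(
                t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
            )
    return counts


# Invented sample text, standing in for text you supply yourself.
sample = "The strong tea was strong enough. Strong tea keeps me awake."
tokens = tokenize(sample)

word_freq = Counter(tokens)                           # word frequency
bigrams = ngram_counts(tokens, 2)                     # 2-grams
tea_collocates = collocates(tokens, "tea", window=1)  # neighbors of "tea"

print(word_freq.most_common(3))
print(bigrams.most_common(3))
print(tea_collocates.most_common(3))
```

Real corpus work would normally add lemmatization, part-of-speech tagging, and an association measure such as mutual information; this sketch uses raw counts only.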