If you're not a (corpus) linguist, you might wonder what a "corpus" is, and how it compares to other types of resources. A corpus (plural: corpora) is just a collection of texts that can be used for any type of analysis -- linguistic, sociological, cultural, economic, etc. The most widely used corpora of English are the corpora from English-Corpora.org, of which the Coronavirus Corpus is a part. Corpora are similar to "textual databases" like Lexis-Nexis, but they differ in that corpora typically allow a much larger range of queries. For example, well-designed corpora allow you to do the following (see the main page of the corpus for many examples):
When you think about it, the Web is a kind of corpus as well (being a large
"collection of texts"). But it doesn't really allow many of the types of
searches listed above. With Google or Bing or another search engine, you search
for a word or phrase and it simply links to web pages. A corpus allows much more
than this. And for something like research on the coronavirus (COVID-19), it is
much more useful than a simple web search.