
The Coronavirus Corpus is designed to be the definitive record of the social, cultural, and economic impact of the coronavirus (COVID-19) in 2020 and beyond.

Unlike resources such as Google Trends (which show only what people are searching for), the corpus shows what people are actually saying in online newspapers and magazines in 20 different English-speaking countries.

The corpus (which was first released in May 2020) currently contains about 1,559 million words, and it continues to grow by 3-4 million words each day.

The Coronavirus Corpus allows you to see the frequency of words and phrases in 10-day increments since Jan 2020, such as social distancing, flatten the curve, WORK * home, Zoom, Wuhan, hoard*, toilet paper, curbside, pandemic, reopen, defy, anti-mask*.
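To make the idea of 10-day increments concrete, here is a minimal sketch of how a frequency-per-period count could be computed from dated texts. It is not the corpus's own implementation: the function names, the (date, text) input format, the 1 Jan 2020 start date, and the simple whitespace tokenization are all assumptions made for illustration.

```python
from collections import Counter
from datetime import date


def period_index(day, start=date(2020, 1, 1), span_days=10):
    """Return the 0-based index of the 10-day period containing `day`.

    The start date is an assumption; the source only says "since Jan 2020".
    """
    return (day - start).days // span_days


def frequency_by_period(docs, term):
    """Occurrences of `term` per million words in each 10-day period.

    `docs` is an iterable of (date, text) pairs. Matching is case-insensitive
    on whitespace-separated tokens, which is far cruder than a real corpus.
    """
    hits, totals = Counter(), Counter()
    for day, text in docs:
        tokens = text.lower().split()
        idx = period_index(day)
        totals[idx] += len(tokens)
        hits[idx] += tokens.count(term.lower())
    # Normalize to a per-million rate so periods of different size compare fairly.
    return {
        idx: hits[idx] * 1_000_000 / totals[idx]
        for idx in sorted(totals)
        if totals[idx]
    }


if __name__ == "__main__":
    sample = [
        (date(2020, 1, 5), "pandemic fears grow"),
        (date(2020, 1, 15), "the pandemic and the pandemic response"),
    ]
    print(frequency_by_period(sample, "pandemic"))
```

Normalizing by the number of words in each period matters because the corpus grows every day, so raw counts alone would overstate the frequency of a term in larger periods.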

You can also look at "collocates" (nearby words) to see what is being said about a certain topic, such as (verbs near) virus, or any word near ban (v), stockpile, disinfect*, or remotely. And you can even see the collocates of a word in each 10-day period since Jan 2020 (e.g. stockpile).
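As a rough illustration of what "collocates" means, the sketch below counts the words that occur within a fixed window around each occurrence of a node word. The function name and the window size are assumptions for illustration only; the corpus interface also supports lemmas, wildcards, and part-of-speech restrictions such as "verbs near", none of which this sketch attempts.

```python
from collections import Counter


def collocates(tokens, node, window=4):
    """Count words occurring within `window` tokens of each occurrence of `node`.

    `tokens` is a list of already-tokenized, lowercased words. The default
    window of 4 is an arbitrary choice for this sketch.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the node word itself
                counts[tokens[j]] += 1
    return counts


if __name__ == "__main__":
    text = "people stockpile toilet paper and stockpile food"
    for word, n in collocates(text.split(), "stockpile", window=1).most_common():
        print(word, n)
```

Running the same count separately on the texts from each 10-day period would give a simple version of the per-period collocate view described above.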

The corpus also allows you to see the patterns in which a word occurs, as with stay-at-home, social, economic, or hoard*.

You can also compare different time periods, to see how our view of things has changed over time. (And you can even compare the 20 countries in the corpus.) Interesting comparisons over time might include phrases like social * or economic * that were more common in Jan-Feb than in Apr-May, or words near BAN or OBEY that were more common in Apr-May than in Jan-Feb.

Click on any of the links in the search form on the search page for context-sensitive help, and to see the range of queries that the corpus offers (LIST discusses the search syntax). You might pay special attention to the virtual corpora, which allow you to create personalized collections of texts related to a particular area of interest.

Finally, the corpus is related to many other corpora of English that we have created. These corpora (formerly known as the "BYU Corpora") offer unparalleled insight into variation in English.