The Coronavirus Corpus was created by Mark Davies. It contains about 1.5 billion words of data in approximately 1.9 million texts from Jan 2020 - Dec 2022, and it is designed to be the definitive record of the social, cultural, and economic impact of the coronavirus (COVID-19) pandemic during this time.

The corpus allows you to see the frequency of words and phrases month by month and even day by day since January 2020, such as social distancing, flatten the curve, WORK * home, Zoom, Wuhan, hoard*, toilet paper, curbside, pandemic, reopen, defy, anti-mask*.
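
As a rough illustration of what a month-by-month frequency means, the sketch below counts matches for a simple trailing-wildcard pattern such as hoard* and normalizes them per million words. It is not the corpus's own implementation: the function names, the (date, text) input format, and the per-million normalization are all assumptions made for this example, and it does not handle lemma searches (WORK) or multi-word patterns (WORK * home).

```python
import re
from collections import Counter


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Turn a simple pattern such as 'hoard*' into a whole-word regex.

    Hypothetical helper: only '*' (any run of word characters) is supported.
    """
    escaped = re.escape(pattern).replace(r"\*", r"\w*")
    return re.compile(rf"\b{escaped}\b", re.IGNORECASE)


def monthly_frequency(texts, pattern):
    """Return matches per million words for each 'YYYY-MM' month.

    `texts` is an assumed input format: an iterable of (ISO date, text) pairs.
    """
    regex = wildcard_to_regex(pattern)
    hits, words = Counter(), Counter()
    for date, text in texts:
        month = date[:7]  # 'YYYY-MM-DD' -> 'YYYY-MM'
        hits[month] += len(regex.findall(text))
        words[month] += len(text.split())
    # Normalize so that months with different amounts of text are comparable.
    return {m: hits[m] / words[m] * 1_000_000 for m in sorted(words) if words[m]}
```

Normalizing by the number of words in each month matters because the amount of text collected varies from month to month, so raw counts alone would be misleading.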

You can also look at "collocates" (nearby words) to see what is being said about a certain topic, such as (verbs near) virus, or any word near ban (v), stockpile, disinfect*, or remotely. And you can even see the collocates of a word in each month since Jan 2020 (e.g. stockpile).

The corpus also allows you to see the patterns in which a word occurs, as with stay-at-home, social, economic, or hoard*.

You can also compare different time periods to see how our view of things has changed over time, and you can even compare the 20 countries in the corpus. Interesting comparisons over time might include phrases like social * or economic *, or words near BAN or OBEY, which were more common after the pandemic started (e.g. mid-2020) than in Jan/Feb 2020.

Click on any of the links in the search form on the search page for context-sensitive help, and to see the range of queries that the corpus offers (LIST discusses the search syntax). You might pay special attention to the virtual corpora, which allow you to create personalized collections of texts related to a particular area of interest.

Finally, the corpus is related to other corpora from English-Corpora.org, which are the most widely used corpora of English and which offer unparalleled insight into variation in English.