  Download the corpus (and corpus-based frequency data) for offline use

See randomly-selected words from the top 60,000 words

"Words of the Day": 20 words from 10 different frequency levels

PDF overview

Five-minute tour

The iWeb corpus was created by Mark Davies, and it contains 14 billion words in 22 million web pages. It is related to other corpora from English-Corpora.org, which are the most widely used corpora of English, and which offer unparalleled insight into variation in English.

Unlike other large corpora from the web, the nearly 95,000 websites in iWeb were chosen in a systematic way, and the websites have an average of 240 web pages and 145,000 words each. You can quickly focus on specific websites to create "virtual corpora" for any topic, such as Buddhism, chocolate, basketball, or nuclear energy.
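As a rough sanity check, the per-website averages above can be multiplied out and compared with the overall totals (14 billion words, 22 million web pages). The short Python sketch below treats the rounded figures as exact, which is an assumption, so the results are only approximate.

```python
# Rounded figures quoted in the text; treated as exact here (an assumption).
websites = 95_000          # "nearly 95,000 websites"
pages_per_site = 240       # average web pages per website
words_per_site = 145_000   # average words per website

total_pages = websites * pages_per_site
total_words = websites * words_per_site

print(f"{total_pages:,} pages")   # compare with the stated 22 million
print(f"{total_words:,} words")   # compare with the stated 14 billion
```

The products come out close to the stated totals, which is what you would expect from rounded averages.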

There are four main ways to search the corpus.

First, you can browse a frequency list of the top 60,000 words in the corpus, including searches by word form, part of speech, ranges in the 60,000-word list, and even pronunciation. This should be particularly useful for language learners and teachers.

Second, you can search by individual word, and see collocates, topics, clusters, websites, concordance lines, and related words for each word. Note that some of these searches are unique to iWeb and COCA.

Third, you can search for phrases and strings. Because the corpus is optimized for speed, searches for substrings (*ism, un*able) and phrases are very fast, for example: got VERB-ed, BUY * ADJ NOUN, "gorgeous" NOUN. Even high-frequency phrases are fast, such as from ADJ to ADJ, phrasal verbs, or NOUN NOUN.
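To illustrate what a substring pattern such as *ism or un*able selects, here is a minimal Python sketch. It is not how iWeb itself runs searches: the word list is invented for illustration, match_wildcard is a hypothetical helper, and the sketch assumes that * stands for any run of characters (including none), using the standard library's fnmatch module.

```python
from fnmatch import fnmatchcase

# Toy word list invented for illustration; not taken from the iWeb corpus.
WORDS = ["realism", "tourism", "prism", "unable",
         "unbelievable", "unstable", "table", "stable"]

def match_wildcard(pattern, words):
    """Return the words that match a pattern, where * matches any run of characters.

    Hypothetical helper; assumes shell-style wildcard semantics.
    """
    return [w for w in words if fnmatchcase(w, pattern)]

print(match_wildcard("*ism", WORDS))     # words ending in "ism"
print(match_wildcard("un*able", WORDS))  # words starting with "un" and ending in "able"
```

On this toy list, *ism picks out realism, tourism, and prism, while un*able picks out unable, unbelievable, and unstable but not table or stable.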

Finally, you can find random words and also browse through randomly-selected "Words of the Day", and then save new words and come back and review them later.

Click on any of the links in the search form on the search page (such as List or Chart) for context-sensitive help, and to see the range of queries that the corpus offers. You may also want to look at the new (Sep 2024) expanded help files.