Why the name iWeb? Web, of course, refers to the fact that the corpus is based on about 14 billion words in 22 million web pages from about 95,000 websites.
 

The i stands for several things:

  • Immense: 14 billion words is huge. This is one of only three structured corpora of English that are larger than about 10 billion words.

  • Nearly instantaneous: iWeb is much faster than these other large corpora. Even searches like BUY * ADJ NOUN, "gorgeous" NOUN, VERB + reflexive, or NOUN NOUN take just 2-3 seconds, and searches for topics, collocates, clusters, websites, and concordance lines (all of these for the sample word bread) take one second or less.

  • Insightful: other large corpora from the web are just a huge "blob of data" -- whatever has been scooped up from blind web scraping. iWeb was designed from the ground up to allow you to target specific topics and websites, such as websites dealing with Buddhism, chocolate, basketball, or nuclear energy.

  • Informative: iWeb allows you to browse through the 60,000 words (lemmas) in the corpus, and to see a wealth of information for each word.

  • Integrated: with just one click you can move from word to word, and between collocates, related topics, clusters, websites, and concordance lines. In addition, the "word" pages are integrated with other resources for images, videos, pronunciations, and translations.

No other corpus offers the combination of size, speed, and range of queries that you will find in iWeb.