Keywords allow you to see which words occur much more frequently in your virtual corpus than in the overall corpus. Note that the entries below are from a [BIOLOGY] virtual corpus created in the Wikipedia corpus, but the same principles apply to your own virtual corpora.

BIOLOGY [135,240 WORDS, 100 TEXTS] NOUN VERB ADJ ADV N+N ADJ+N A

[ALL CORPORA] SAVE LIST B

Note: you can click on the [FREQ], [# TEXTS], or [SPECIFIC] columns to re-sort the entries in the table.

[A] Click to see keywords for another part of speech (noun, verb, adjective, adverb, noun+noun, or adjective+noun)

[B] You can create a link to share this list of keywords with others. They can then click on that link (in a web page, an email, etc.) to see this exact list.

[C] Click on the word to see it in context. (Note that if you click on a word in the list in the frame above, you'll need to go back one page in your browser to see this help page.)

[D] The number of times the word was used in your virtual corpus.

[E] The number of articles in which the word was used. This may be better than [FREQ] when you want to make sure that the word isn't just occurring many times in a small number of articles.

[F-J] Sort and limit the entries to words that are more specific to this corpus. For example, if you sort by [FREQ], then words like type or body might be the most frequent. But if you click on [SPECIFIC], then more corpus-specific words will be placed higher in the list. You can make the list more [F] or less [H] specific to the corpus by clicking on these buttons (make sure you click on [SPECIFIC] again after clicking [-] or [+]). This will change the minimum frequency for the words [I] and the minimum number of texts in which the word must occur [J]; you can also change those values manually. The numbers in this column show how much more frequent the word is in the corpus than expected, based on 1) the total frequency of that word in all of Wikipedia and 2) the size of the corpus that you've created.

[K] The number of times the word was used in all 4.4 million articles in Wikipedia. For example, species was used a total of 585,820 times in all articles.

[L] Based on the total number of words in your corpus, the number of times we would expect the word to occur (if your articles used it at the same rate as all other articles).

	WORD (CLICK TO SEE) C	FREQ D	# TEXTS E	F SPECIFIC G H I FREQ TEXTS J	ENTIRE CORPUS K	EXPECTED L
1	CELL	834	53	100.0	114,074	8.3
2	SPECIES	497	47	11.6	585,820	42.8
3	BIOLOGY	495	56	136.0	49,803	3.6
4	PROTEIN	428	38	82.0	71,371	5.2
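
The relationship between the [FREQ], [EXPECTED], and [SPECIFIC] columns above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the site's actual code: it assumes that [SPECIFIC] is roughly [FREQ] divided by [EXPECTED], and that [EXPECTED] scales the Wikipedia-wide frequency by the share of all words that fall in your corpus. The names Keyword, expected_frequency, specificity, and keywords are hypothetical, and reference_size (the total number of words in Wikipedia) is a parameter because the value is not given on this page.

```python
from dataclasses import dataclass


@dataclass
class Keyword:
    word: str
    freq: int            # [FREQ]: occurrences in the virtual corpus
    texts: int           # [# TEXTS]: articles that contain the word
    entire_corpus: int   # [ENTIRE CORPUS]: occurrences in all of Wikipedia
    expected: float      # [EXPECTED]: value as displayed in the table


# The four rows shown in the table above.
ROWS = [
    Keyword("CELL", 834, 53, 114_074, 8.3),
    Keyword("SPECIES", 497, 47, 585_820, 42.8),
    Keyword("BIOLOGY", 495, 56, 49_803, 3.6),
    Keyword("PROTEIN", 428, 38, 71_371, 5.2),
]


def expected_frequency(entire_corpus_freq, corpus_size, reference_size):
    """Assumed formula: overall frequency scaled by corpus_size / reference_size."""
    return entire_corpus_freq * corpus_size / reference_size


def specificity(freq, expected):
    """Assumed formula: how many times more often the word occurs than expected."""
    return freq / expected


def keywords(rows, min_freq=1, min_texts=1, sort_by="specific"):
    """Apply the minimum frequency and text thresholds, then sort descending."""
    kept = [r for r in rows if r.freq >= min_freq and r.texts >= min_texts]
    if sort_by == "specific":
        return sorted(kept, key=lambda r: specificity(r.freq, r.expected), reverse=True)
    return sorted(kept, key=lambda r: r.freq, reverse=True)


if __name__ == "__main__":
    for row in keywords(ROWS, sort_by="specific"):
        print(row.word, round(specificity(row.freq, row.expected), 1))
```

Because the displayed [EXPECTED] values are rounded to one decimal place, the ratios recomputed this way agree only approximately with the [SPECIFIC] column. Sorting by frequency keeps CELL and SPECIES at the top, while sorting by the ratio moves BIOLOGY and CELL ahead of SPECIES, which is the effect described for [F-J].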