Collocates (nearby words) often provide useful insight into
word meaning and usage. Note that collocates in the Google Books
corpus work somewhat differently than in the
other corpora; see
notes #2-5 below.
See an explanation of what happens if you don't enter anything in the [COLLOCATES] field
Finds [2] within [3] words to the left / [4] words to the right of [1]. Click on any of the links below to run that search.
|1 (word)||2 (collocates)||3 / 4 (words to the left / right)||Explanation||Examples|
|thick||[nn*]||0/3||thick followed within 3 words by a noun||glasses, smoke|
|efforts||[vv*]||3/0||a verb within three words before efforts||making, redouble|
|flashed||[nn*]||3/0||a noun within three words before flashed||eyes, lightning|
|lips||[j*]||2/0||an adjective within two words before lips||rosy, soft|
|wildly||*||0/2||any word within 2 words after wildly||enthusiastic, cheered|
|quickly||*||3/0||any word within 3 words before quickly||very, turned|
|find||time||0/3||find followed within 3 words by time||time|
|mood||love||0/3||mood followed within 3 words by love||love|
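The left/right span in the table can be sketched in a few lines of Python. This is only an illustration of what a setting such as 0/2 or 2/0 means; it runs over a short invented sentence, whereas the Google Books corpus works from n-grams (see the notes below). The function name `collocates` and its parameters are assumptions made for this sketch, not part of the site.

```python
from collections import Counter

def collocates(tokens, node, left=0, right=0):
    """Count words up to `left` tokens before and `right` tokens after each `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        start = max(0, i - left)                 # window start, clipped at text start
        end = min(len(tokens), i + right + 1)    # window end (exclusive), clipped at text end
        for j in range(start, end):
            if j != i:                           # skip the node word itself
                counts[tokens[j]] += 1
    return counts

# Invented example sentence, for illustration only.
tokens = "he cheered wildly and she was wildly enthusiastic about it".split()

after = collocates(tokens, "wildly", left=0, right=2)   # like span 0/2
before = collocates(tokens, "wildly", left=2, right=0)  # like span 2/0
```

With the span set to 0/2, only words that follow wildly are counted, so cheered is found by the 2/0 search but not by the 0/2 search.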
2. Remember that the regular Google Books site can only search for exact words or phrases; you can't ask it to find Word X near Word Y. Collocates searches from this site are therefore a two-step process. In Step 1, the site finds the words (Y) that occur near Word X. In Step 2, you click on any collocate in the results list to see the different strings in which Word Y occurs near Word X, and you can then click on any of these exact strings to see it in the Google Books extracts.
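One way to picture the two steps is as two passes over a table of n-gram counts. The sketch below is not the site's actual implementation: the counts in `NGRAMS` are invented, and the function names `step1_collocates` and `step2_strings` are hypothetical. It assumes every n-gram starts with the node word and that any later word in the n-gram counts as a collocate.

```python
from collections import Counter

# Hypothetical 3-gram counts (node word first), invented for illustration.
NGRAMS = {
    ("thick", "black", "smoke"): 80,
    ("thick", "smoke", "rose"): 60,
    ("thick", "glasses", "and"): 120,
    ("thick", "cloud", "of"): 70,
}

def step1_collocates(ngrams, node):
    """Step 1: total the counts of every word that follows `node` in an n-gram."""
    totals = Counter()
    for gram, count in ngrams.items():
        if gram[0] != node:
            continue
        for word in set(gram[1:]):   # set() avoids counting a repeated word twice
            totals[word] += count
    return totals

def step2_strings(ngrams, node, collocate):
    """Step 2: list the exact strings in which `collocate` follows `node`."""
    hits = [
        (" ".join(gram), count)
        for gram, count in ngrams.items()
        if gram[0] == node and collocate in gram[1:]
    ]
    return sorted(hits, key=lambda item: item[1], reverse=True)

ranked = step1_collocates(NGRAMS, "thick")
strings = step2_strings(NGRAMS, "thick", "smoke")
```

Each string returned by the second function is an exact phrase, which is the only kind of query the regular Google Books site accepts.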
3. Remember also that long strings (4-grams and especially 5-grams) often don't work as well as shorter strings, because of the 40-token threshold. This is especially true for collocates. For example, you will get more (and better) collocates of ardent when the "span" is set to two words after ardent than when it is set to four words, and you will get more collocates of chips when the span is set to two words before than when it is set to four words.
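The effect of the threshold can be illustrated with a toy example. The sketch assumes that the threshold means an n-gram must occur at least 40 times to be included, and that a search with a longer span draws on longer n-grams. All counts below are invented, and `collocates_after` is a hypothetical helper, not part of the site.

```python
THRESHOLD = 40  # assumed minimum count for an n-gram to be included

# Hypothetical raw counts, invented for illustration only.
RAW_3GRAMS = {
    ("ardent", "love", "of"): 90,
    ("ardent", "desire", "to"): 75,
    ("ardent", "zeal", "for"): 60,
    ("ardent", "admirer", "of"): 50,
}
RAW_5GRAMS = {
    ("ardent", "love", "of", "his", "country"): 42,
    ("ardent", "love", "of", "liberty", "and"): 30,
    ("ardent", "desire", "to", "see", "him"): 25,
    ("ardent", "admirer", "of", "the", "great"): 18,
}

def collocates_after(raw_ngrams, node, threshold=THRESHOLD):
    """Collect words following `node` in n-grams that meet the threshold."""
    found = set()
    for gram, count in raw_ngrams.items():
        if count >= threshold and gram[0] == node:
            found.update(gram[1:])
    return found

short_span = collocates_after(RAW_3GRAMS, "ardent")  # two words after
long_span = collocates_after(RAW_5GRAMS, "ardent")   # four words after
```

In this toy data the same phrases are split across many distinct 5-grams, most of which fall below the threshold, so desire, zeal and admirer appear as collocates with the shorter span but are lost with the longer one.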
4. Collocates searches for the Google Books corpus currently do not allow the wide range of searches available with the other corpora (such as lemma, alternates, synonyms, multiple words as the node word, etc.). Right now the "node" word must be one exact word, and the collocate must be a part of speech, one exact word, or [*]. More options will be added over the next few months, until the Google Books corpus has the same range of collocates searches as the other corpora.
5. Remember that the n-grams are not tagged for part of speech. This presents problems when you want to find the collocates of a word with a certain part of speech, such as chip as a noun or beat as a verb. Related to #4, soon it will be possible to have multiple words as the node word -- e.g. [a*] chip (the chip, a chip) to specify noun, or to beat or [vm*] beat (can beat, will beat) to specify verb.