English-Corpora.org

English-Corpora.org



  KWIC (concordance) searches   (search form, corpora used, corrections, +/- sections compare to Word / phrase and collocates searches)

Note: click on any link on this page to see the corpus data, and then click on the "BACK" image (see left) at the top of the page to come back to this page. Or right click on the link and then "Open link in new tab" (in Chrome; similar in other browsers), and then close that tab after viewing the corpus data.

Keyword in Context (KWIC, and also known as concordance) searches allow you to see the patterns in which a word occurs. Word/phrase searches allow you to search for specific strings of words. Collocates searches allow you to see the nearby words in a "cloud of words" around a given word. But KWIC searches allow you to see the patterns in which a word occurs, in ways that can't be done with the other two types of searches.

Well, this is mostly true. Let's take the example of fathom, which means "to understand". Suppose we want to know the patterns in which this word occurs. We could search for * * fathom to see the two word strings that occur to the left of fathom, and we would see that they are negative, or that they at least mention that it is hard or difficult to fathom:

 

But should we search one word to the left, or two words, or three? Should we sort by words to the left or  right of the "node word" (in this case fathom) to find interesting patterns? And how could we see phrases that occur in the context to the right or the left? This is exactly what KWIC searches do the best.

For example, look at the following partial entries from the KWIC lines for fathom. (Note that the saved search linked to above will replicate the search, but not the exact KWIC lines. You will see similar KWIC lines, but probably not the exact same ones as in these images.)

We can easily scan through these lines and see that the preceding words -- whether one, two, three words (or more) -- all express the "negative / questioning" sense accompanies the word fathom. (This is a good example of semantic prosody.)

And look at these KWIC lines for naked eye. We can quickly scan through the entries to find repeated phrases, like visible/invisible to the naked eye or see X with the naked eye. It would be much more difficult to find patterns like this by just looking for exact four or five words strings, or with collocates (which just find single words somewhere nearby another word).

Consider these lines with gone the way of the, which refers to things that have gone extinct or disappeared (and this is just a small portion of the results). We can quickly scan through the entries and see phrases like dinosaurs, dodo (birds), rotary phones, side rule (calculator), and typewriters. Again, it would be difficult to find phrases like this with just phrase searches like gone the way of the * * or by using collocates (which will just find single words, not phrases).

Finally, consider the phrase END up.  Because the words (three words to the left and three words to the right of our search word) are color coded, we can quickly scan through the entries to see what parts of speech occur near our search word (in corpus linguistics, this is referred to as colligation). We see that with end up, there are many -ing forms of verbs (in pink), a few adjectives (in green), and prepositional phrases with the word with (in yellow). Patterns of co-occurring words and phrases and parts of speech almost "jump out" at us with these KWIC searches.


 


How to set up the search

Select the words that you want to sort with. Select L for 1, 2, and 3 words to the left (as with fathom and naked eye above). In other words, it first sorts the words by 1 "slot" to the left, and then (in cases of a "tie") by the second word to the left, and then the third word to the left. Select R for 1, 2, and 3 words to the right (as with go the way of the and END up above). You could also, for example, sort by one word to the left, then one and two words to the right, such as with matter (n) or point (n), or any other combination. Click * to clear the entries and start over.

You can also select the number of entries -- 100, 200, 500, or 1000 entries

After you have done the search, you can also re-sort the entries from within the KWIC entries page, by clicking on the table at the top right (shown here in red). As before, you can sort by L (left), R (right), or manually select any combination of these. And as before, click on * to clear the entries and start over. You can also see a key to the color codes for part of speech, copied here:

As with word/phrase and collocates searches, you can also limit the KWIC search by section in the corpus. For example, the following are entries for END up in COHA (the Corpus of Historical American English) in the 1820s-1890s and then in the 1980s-2010s. Notice how there are no cases of VERB-ing in the 1800s (i.e. no words in pink after END up), but those are the majority of the entries in the 1980s-2010s.