Note: click RETURN in the upper right-hand corner to return to this page after clicking on any of the links below.

Note: these help files are for the older interface, which was used before May 2016. If you are using the newer interface, the layout will be slightly different, but the functionality is the same.
 
The Wikipedia corpus from English-Corpora.org, which was released in early 2015, contains 1.9 billion words in 4.4 million web pages, and you can search the entire corpus with the same type of queries as the other corpora from English-Corpora.org.
More importantly, though, you can also quickly and easily create "virtual" corpora "on the fly" for any topic that you want, such as:
biology,
investments,
Buddhism,
psychology,
cars,
basketball. 
The topics can be as narrow as you want, including maybe just 5-10 different
Wikipedia pages. 
Once you have created these corpora via the web interface, you can then quickly and easily search in the corpora.
First, you can find keywords, such as nouns in:
    biology, investments, Buddhism, psychology, cars, or basketball (overall frequency)
    biology, investments, Buddhism, psychology, cars, or basketball (more specific words for these corpora)
Of course, you can search for other words too, such as verbs in Buddhism, adjectives in biology, or noun+noun in investments.
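The difference between an overall frequency list and a list of "more specific" words can be illustrated with a small sketch. The scoring below (a smoothed ratio of relative frequencies) is an assumption made for illustration; the page does not say how the corpus site actually ranks specific words, and the function name `specific_words` and the sample sentences are hypothetical.

```python
from collections import Counter

def specific_words(virtual_tokens, reference_tokens, min_count=2, top_n=5):
    """Rank words by how much more frequent they are in the virtual corpus
    than in a reference corpus (illustrative scoring, not the site's method)."""
    virt = Counter(virtual_tokens)
    ref = Counter(reference_tokens)
    virt_total = sum(virt.values())
    ref_total = sum(ref.values())
    scored = []
    for word, count in virt.items():
        if count < min_count:
            continue  # skip rare words, whose ratios are unreliable
        virt_rate = count / virt_total
        # Add-one smoothing so words absent from the reference do not divide by zero.
        ref_rate = (ref[word] + 1) / (ref_total + 1)
        scored.append((word, virt_rate / ref_rate))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]

# Hypothetical toy data standing in for a virtual corpus and the full corpus.
virtual = "the bond market and the stock market move with the bond yield".split()
reference = ("the cat sat on the mat and the dog sat on the rug "
             "while the market stayed open").split()
ranked = specific_words(virtual, reference)
```

In this toy data, "the" is the most frequent word in the virtual corpus overall, but it ranks last by specificity because it is just as common in the reference text.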
In addition to finding keywords, you can also search within your virtual corpora for matching words (e.g. financ*), strings of words (e.g. market + NOUN), collocates (e.g. of market), and concordance lines (e.g. for market). (All of these examples are from the investments corpus, but you can run the same searches on any corpus you create.)
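For readers who want to see what these search types compute, here is a minimal sketch over a plain list of tokens. It is not the corpus site's implementation; the function names, window sizes, and sample sentence are assumptions chosen for illustration. The string search with a part-of-speech slot (market + NOUN) is left out because it would need tagged text.

```python
import fnmatch
from collections import Counter

def match_words(tokens, pattern):
    """Frequency of tokens matching a wildcard pattern such as 'financ*'."""
    return Counter(t for t in tokens if fnmatch.fnmatchcase(t, pattern))

def collocates(tokens, node, span=2):
    """Count words within `span` tokens on either side of each `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts

def concordance(tokens, node, span=3):
    """Return keyword-in-context lines as (left, node, right) tuples."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - span):i])
            right = " ".join(tokens[i + 1:i + 1 + span])
            lines.append((left, tok, right))
    return lines

# Hypothetical sample text standing in for pages of an investments corpus.
tokens = ("financial analysts said the stock market rallied after "
          "the bond market absorbed new financing").split()

wildcard_hits = match_words(tokens, "financ*")
nearby = collocates(tokens, "market")
kwic = concordance(tokens, "market")
```

Sorting the tuples returned by `concordance` on the left or right context gives the kind of re-sortable view that makes recurring patterns around a word easy to spot.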
There are a number of tutorials for the corpus on YouTube (* = alternate site, if YouTube is not accessible in your country):
	
| General topic | Length | Individual topics |
| Overview * | 8:59 | Creating virtual corpora; Finding keywords in your corpora; Basic searches in your corpus (frequency, strings, collocates, concordances); Editing and managing your corpora |
| Creating virtual corpora: basic * | 2:57 | Creating corpora by word or phrase in the Wikipedia article; Creating corpora by the title of the Wikipedia article |
| Finding keywords in your corpus * | 4:55 | Frequency listing of corpus by part of speech (noun, verb, adjective, adverb); Frequency listing by multi-word expression (Noun+Noun, Adj+Noun); Finding words that are more specific to your corpus |
| Searching within your corpus * | 5:58 | Frequency listings (substrings); String search, e.g. market + NOUN; Collocates (nearby words), a useful insight into the meaning and usage of a word; Concordance lines (re-sortable), to see the patterns in which a word occurs |
| Comparing across corpora * | 4:33 | Finding the frequency in the different corpora that you've created; Example: the frequency of words for "obedience" in different religions; Example: the frequency of the word gods in different religions; Comparing concordance lines, e.g. stress in engineering and psychology |
| Managing your corpora * | 3:04 | Deleting your virtual corpora; "Hiding" or "ignoring" corpora (without completely deleting them); Renaming corpora; Grouping virtual corpora by topic (e.g. science or finance) |
| Editing your corpora * | 7:24 | Deleting individual pages from a corpus; Deleting pages from your corpus from concordance lines; Moving pages from one corpus to another; Adding pages from one corpus to another; Searching for words and then adding multiple pages to an existing corpus |
| Creating virtual corpora: advanced * | 6:53 | Comparison of searching by words in text and searching by title; When searching by title is better than searching by words in text; When searching by title (alone) may not be enough; By title: adding words that are not in the title; By title: adding words that are or are not in the text of the page |