Texts
The Excel spreadsheet shown above contains complete information on the
texts used in the corpus. The [textID] column refers to the [t] value in
the URL in the [title] column of the Keyword in Context display. For
example, if the URL is http://www.english-corpora.org/bnc/x4.asp?t=APK&ID=92380534,
then the [textID] is [APK], which corresponds to [Quest for a babe]
in the spreadsheet. For more details on the composition of the BNC, see
http://www.natcorp.ox.ac.uk/corpus/index.xml.
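The mapping from URL to [textID] described above can be sketched in a few lines of Python. This is only an illustration using the standard library: the function name `text_id_from_url` is hypothetical, and it assumes the text ID is always carried in the `t` query parameter, as in the example URL.

```python
from urllib.parse import urlparse, parse_qs


def text_id_from_url(url: str) -> str:
    """Return the value of the `t` query parameter, i.e. the [textID]."""
    # parse_qs maps each query parameter name to a list of its values.
    query = parse_qs(urlparse(url).query)
    values = query.get("t")
    if not values:
        raise ValueError(f"no 't' parameter in URL: {url}")
    return values[0]


example = "http://www.english-corpora.org/bnc/x4.asp?t=APK&ID=92380534"
print(text_id_from_url(example))  # APK
```

The returned value can then be looked up in the [textID] column of the spreadsheet.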
The tables below give the number of words in each section of the BNC.
The sections correspond to the column headers in [CHART] view searches
and to the [GENRE] column of the spreadsheet.
Overview
Genre | # words | # texts |
Spoken | 10,334,947 | 909 |
Fiction | 16,194,885 | 464 |
Magazine (W_pop_lore) | 7,376,391 | 211 |
Newspaper | 10,638,034 | 518 |
Non-academic | 16,634,076 | 534 |
Academic | 15,429,582 | 501 |
Miscellaneous | 20,835,159 | 917 |
TOTAL | 97,626,093 | 4,054 |
Details
Genre | # words |
Spoken | 9,963,663 |
S_brdcst_disc | 736,229 |
S_brdcst_doc | 40,554 |
S_brdcst_news | 254,370 |
S_classroom | 412,372 |
S_consult | 131,354 |
S_conv | 4,012,457 |
S_courtroom | 125,438 |
S_demonstratn | 30,500 |
S_interv_oral | 798,978 |
S_interview | 119,117 |
S_lect_arts | 49,759 |
S_lect_com | 14,757 |
S_lect_law | 49,774 |
S_lect_natsci | 22,168 |
S_lect_socsci | 154,718 |
S_meeting | 1,334,382 |
S_parliament | 95,025 |
S_pub_debate | 278,458 |
S_sermon | 80,135 |
S_spch+script | 196,615 |
S_spch-script | 448,810 |
S_sportslive | 32,103 |
S_tutorial | 138,888 |
S_unclass | 406,702 |
Fiction | 16,194,885 |
W_fict_drama | 44,975 |
W_fict_poetry | 219,409 |
W_fict_prose | 15,644,928 |
Magazine | 7,376,391 |
W_pop_lore | 7,261,990 |
Newspaper | 10,638,034 |
W_new_arts1 | 345,860 |
W_news_arts2 | 235,525 |
W_news_com | 416,345 |
W_news_edit | 100,659 |
W_news_misc | 1,019,839 |
W_news_o_com | 407,277 |
W_news_o_rep | 2,681,576 |
W_news_o_sci | 54,327 |
W_news_o_soc | 1,125,324 |
W_news_o_sprt | 1,009,878 |
W_news_rprt | 655,508 |
W_news_sci | 64,634 |
W_news_script | 1,262,351 |
W_news_soc | 80,963 |
W_news_sprt | 292,832 |
W_news_tabld | 713,524 |
Non-academic | 16,634,076 |
W_non_ac_arts | 3,722,655 |
W_non_ac_engin | 1,186,625 |
W_non_ac_law | 4,450,696 |
W_non_ac_med | 495,734 |
W_non_ac_nat | 2,491,219 |
W_non_ac_soc | 4,148,256 |
Academic | 15,429,582 |
W_ac_engin | 678,621 |
W_ac_hum_arts | 3,296,072 |
W_ac_law_edu | 4,615,173 |
W_ac_medicine | 1,412,808 |
W_ac_nat_sci | 1,104,527 |
W_ac_soc_sci | 4,224,467 |
Miscellaneous | 20,835,159 |
W_admin | 218,595 |
W_advert | 549,856 |
W_biography | 3,494,374 |
W_commerce | 3,729,662 |
W_email | 209,815 |
W_essay_schl | 145,041 |
W_essay_univ | 55,477 |
W_hansard | 1,149,732 |
W_inst_doc | 542,553 |
W_instruction | 433,932 |
W_let_pers | 51,840 |
W_let_prof | 65,511 |
W_misc | 9,074,079 |
W_religion | 1,114,692 |
TOTAL | 96,263,399 |
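As a small worked example of how the overview figures can be used, the sketch below computes the average number of words per text in each genre. The word and text counts are copied from the overview table; the names `OVERVIEW` and `words_per_text` are hypothetical and exist only for this illustration.

```python
# Figures copied from the overview table: genre -> (# words, # texts).
OVERVIEW = {
    "Spoken": (10_334_947, 909),
    "Fiction": (16_194_885, 464),
    "Magazine (W_pop_lore)": (7_376_391, 211),
    "Newspaper": (10_638_034, 518),
    "Non-academic": (16_634_076, 534),
    "Academic": (15_429_582, 501),
    "Miscellaneous": (20_835_159, 917),
}


def words_per_text(overview):
    """Return the mean number of words per text for each genre."""
    return {genre: words / texts for genre, (words, texts) in overview.items()}


averages = words_per_text(OVERVIEW)

# Print the genres from longest to shortest average text.
for genre, avg in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{genre:<24}{avg:>10,.0f}")
```

On these figures, spoken texts are the shortest on average (roughly 11,400 words each), while magazine and fiction texts are the longest.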