COMPLETE TOKEN COUNT -- EACH DECADE, 1500s-2000s

The following shows the number of words (in billions) for the different Google Books datasets. After doing a search in one dataset, you can quickly and easily re-do the same search in another dataset by clicking one of the links after COMPARE in the header above. This allows you, for example, to quickly compare a number of phenomena in British and American English. Note also that although all of these datasets are available via this interface, nearly all of the examples in the help files come from the American English dataset alone.

Dataset            Total  1500-  1800-  1900-  1910-  1920-  1930-  1940-  1950-  1960-  1970-  1980-  1990-  2000-
                          1799   1899   1909   1919   1929   1939   1949   1959   1969   1979   1989   1999   2009
American           157.0   0.04   22.8    7.5   10.1    7.1    5.8    6.2    8.1   13.2   14.0   15.5   19.8   26.9
British             34.0   0.77   11.4    2.1    0.9    1.3    1.1    0.8    1.5    1.8    1.8    2.1    2.9    5.4
One Million Books   89.0   0.64   32.2    5.3    4.8    4.9    5.0    5.1    5.4    5.2    4.9    5.3    5.5    4.8
Fiction             90.7   0.32   12.3    3.4    3.2    2.9    2.4    2.4    3.5    4.7    5.7    8.2   12.3   29.4
Spanish             45.1   0.32    3.8    0.9    0.8    0.9    1.3    2.1    2.6    3.9    5.4    6.3    7.7    8.9
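As a quick sanity check on the figures above, the per-period counts for each dataset should add up to (roughly) its stated total, give or take rounding to one decimal place. The short Python sketch below illustrates this; the dictionary is simply a transcription of the table and is not part of the interface itself.

```python
# Illustrative check: per-period word counts (in billions) transcribed from
# the table above, as (stated_total, [counts for each of the 13 periods]).
datasets = {
    "American":          (157.0, [0.04, 22.8, 7.5, 10.1, 7.1, 5.8, 6.2, 8.1, 13.2, 14.0, 15.5, 19.8, 26.9]),
    "British":           (34.0,  [0.77, 11.4, 2.1, 0.9, 1.3, 1.1, 0.8, 1.5, 1.8, 1.8, 2.1, 2.9, 5.4]),
    "One Million Books": (89.0,  [0.64, 32.2, 5.3, 4.8, 4.9, 5.0, 5.1, 5.4, 5.2, 4.9, 5.3, 5.5, 4.8]),
    "Fiction":           (90.7,  [0.32, 12.3, 3.4, 3.2, 2.9, 2.4, 2.4, 3.5, 4.7, 5.7, 8.2, 12.3, 29.4]),
    "Spanish":           (45.1,  [0.32, 3.8, 0.9, 0.8, 0.9, 1.3, 2.1, 2.6, 3.9, 5.4, 6.3, 7.7, 8.9]),
}

for name, (total, periods) in datasets.items():
    # Each sum should match the stated total to within rounding error.
    print(f"{name}: stated {total}, summed {sum(periods):.2f}")
```

The small discrepancies (e.g. Spanish sums to 44.92 rather than 45.1) come from the per-period figures being rounded to one decimal place.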

As for the content of each dataset: the American and British datasets should be self-explanatory. "Fiction" includes both American and British fiction. The "One Million Books" dataset is a subset of the entire English set; it contains just those books whose OCR quality is the best, and it is also more balanced by subject for the last 100 years or so.

Please remember that the frequency listings from the n-grams are for the particular dataset that you are searching. When you then click to see the extracts from Google Books, however, you are seeing extracts from ALL datasets. There is unfortunately no way around this. (More information...)