COMPLETE TOKEN COUNT -- EACH DECADE, 1500s-2000s

The following shows the number of words (in billions) for the different Google Books datasets. After doing a search in one dataset, you can quickly and easily re-do the same search in another dataset by clicking one of the links after COMPARE in the header above. This allows you, for example, to quickly compare a number of phenomena in British and American English. Note also that although all of these datasets are available via this interface, nearly all of the examples in the help files come from the American English dataset alone.

Dataset            Total  1500-  1800-  1900-  1910-  1920-  1930-  1940-  1950-  1960-  1970-  1980-  1990-  2000-
                          1799   1899   1909   1919   1929   1939   1949   1959   1969   1979   1989   1999   2009
American           157.0   0.04   22.8    7.5   10.1    7.1    5.8    6.2    8.1   13.2   14.0   15.5   19.8   26.9
British             34.0   0.77   11.4    2.1    0.9    1.3    1.1    0.8    1.5    1.8    1.8    2.1    2.9    5.4
One Million Books   89.0   0.64   32.2    5.3    4.8    4.9    5.0    5.1    5.4    5.2    4.9    5.3    5.5    4.8
Fiction             90.7   0.32   12.3    3.4    3.2    2.9    2.4    2.4    3.5    4.7    5.7    8.2   12.3   29.4
Spanish             45.1   0.32    3.8    0.9    0.8    0.9    1.3    2.1    2.6    3.9    5.4    6.3    7.7    8.9
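As a quick sanity check on the figures above, the per-period counts for each dataset should add up to (roughly) its stated total, give or take rounding to one decimal place. The short Python sketch below illustrates this; the dictionary is simply a transcription of the table and is not part of the interface itself.

```python
# Illustrative check: per-period word counts (in billions) transcribed from
# the table above, as (stated_total, [counts for each of the 13 periods]).
datasets = {
    "American":          (157.0, [0.04, 22.8, 7.5, 10.1, 7.1, 5.8, 6.2, 8.1, 13.2, 14.0, 15.5, 19.8, 26.9]),
    "British":           (34.0,  [0.77, 11.4, 2.1, 0.9, 1.3, 1.1, 0.8, 1.5, 1.8, 1.8, 2.1, 2.9, 5.4]),
    "One Million Books": (89.0,  [0.64, 32.2, 5.3, 4.8, 4.9, 5.0, 5.1, 5.4, 5.2, 4.9, 5.3, 5.5, 4.8]),
    "Fiction":           (90.7,  [0.32, 12.3, 3.4, 3.2, 2.9, 2.4, 2.4, 3.5, 4.7, 5.7, 8.2, 12.3, 29.4]),
    "Spanish":           (45.1,  [0.32, 3.8, 0.9, 0.8, 0.9, 1.3, 2.1, 2.6, 3.9, 5.4, 6.3, 7.7, 8.9]),
}

for name, (total, periods) in datasets.items():
    # Each sum should match the stated total to within rounding error.
    print(f"{name}: stated {total}, summed {sum(periods):.2f}")
```

The small discrepancies (e.g. Spanish sums to 44.92 rather than 45.1) come from the per-period figures being rounded to one decimal place.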

As for the content of each dataset: the American and British datasets should be self-explanatory. "Fiction" includes both American and British fiction. The "One Million Books" dataset is a subset of the entire English set; it contains just those books whose OCR quality is the best, and it is also more balanced by subject for the last 100 years or so.

Please remember that the frequency listings from the n-grams are for the particular dataset that you are searching. When you then click to see the extracts from Google Books, however, you are seeing extracts from ALL datasets. There is unfortunately no way around this. (More information...)