Click on any of the links below for more information and samples of this data.
95% of the 14 billion words of text, including a listing of all 22+ million web pages used in the corpus