English-Corpora.org


CORPORA AND AI / LLMs: Citing the white papers

Mark Davies / March 2025

Please use the following as guidelines for citing the "white papers" from English-Corpora.org/ai-llms/. Feel free to adapt them to your particular reference style (APA, ACM, MLA, etc.).

General (all papers)

Davies, Mark (2025). Comparing the predictions of Large Language Models to actual corpus data. (White papers). English-Corpora.org. https://www.english-corpora.org/ai-llms/

Summary/overview paper

Davies, Mark (2025). Corpora and LLMs: introduction and overview. (White paper). English-Corpora.org. https://www.english-corpora.org/ai-llms/overview.html


1. Word frequency

Davies, Mark (2025). Corpora and LLMs: comparing data on word frequency. (White paper). English-Corpora.org. https://www.english-corpora.org/ai-llms/word-frequency.pdf

2. Phrase frequency

Davies, Mark (2025). Corpora and LLMs: comparing data on phrase frequency. (White paper). English-Corpora.org. https://www.english-corpora.org/ai-llms/phrase-frequency.pdf

3. Collocates

Davies, Mark (2025). Corpora and LLMs: comparing collocates data. (White paper). English-Corpora.org. https://www.english-corpora.org/ai-llms/collocates.pdf

4. Comparing words via collocates

Davies, Mark (2025). Corpora and LLMs: comparing words via collocates. (White paper). English-Corpora.org. https://www.english-corpora.org/ai-llms/compare-words.pdf


5. Genre-based variation

Davies, Mark (2025). Corpora and LLMs: genre-based variation. (White paper). English-Corpora.org. https://www.english-corpora.org/ai-llms/genres.pdf

6. Historical variation

Davies, Mark (2025). Corpora and LLMs: historical variation. (White paper). English-Corpora.org. https://www.english-corpora.org/ai-llms/historical.pdf

7. Dialectal variation

Davies, Mark (2025). Corpora and LLMs: dialectal variation. (White paper). English-Corpora.org. https://www.english-corpora.org/ai-llms/dialects.pdf