Disclaimer on AI-Generated Analyses

Some of the analyses and classifications on this site are generated in part by large language models (LLMs), such as OpenAI/GPT, Google/Gemini, or Anthropic/Claude. These models have been integrated to provide helpful approximations based on patterns in the data, offering insights that would be difficult or time-consuming to obtain manually. However, users should be aware that these LLM-based analyses will not always be insightful, and that they may occasionally contain errors, oversimplifications, or what are sometimes called "hallucinations": outputs that appear plausible but are not supported by the underlying data.

We encourage users to approach these results critically, as they would any tool or methodology. At the same time, it is important to maintain a balanced perspective. Focusing exclusively on a small set of problematic cases, while ignoring the vast majority of accurate and useful results, does not represent sound scholarly practice. As with all tools, LLM-based output should be judged by its overall utility and reliability in context, rather than by isolated exceptions.

For example, it would not be sound scholarly practice to use LLMs to group collocates for 50 words and then include in a publication only the 4 or 5 instances where the output is subpar. Similarly, using LLMs to generate explanations for 30 chart displays, but then highlighting only the few that are unhelpful or off-target, presents a skewed view of the tool's performance. A more rigorous approach would be to select a representative or randomly sampled set of cases (ones that other researchers could independently replicate) and then evaluate the overall performance. From such a sample, one could meaningfully report the percentage of outputs that are insightful, uninformative, or inaccurate, and assess the tool's usefulness based on its general reliability rather than on exceptional failures.
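As a purely illustrative sketch of this kind of evaluation (not an actual procedure used by this site), the short Python snippet below shows how a fixed random seed yields a sample that other researchers could reproduce, and how the resulting human judgments might be summarized as percentages. The item IDs and ratings are hypothetical placeholders.

def sample_items(all_item_ids, sample_size, seed=42):
    """Draw a reproducible random sample so others can replicate the selection."""
    import random
    rng = random.Random(seed)
    return rng.sample(all_item_ids, sample_size)

def summarize_judgments(judgments):
    """Report the percentage of outputs falling into each quality category."""
    total = len(judgments)
    categories = ("insightful", "uninformative", "inaccurate")
    return {cat: 100 * sum(1 for j in judgments if j == cat) / total
            for cat in categories}

if __name__ == "__main__":
    # e.g. 50 collocate groupings, identified by hypothetical IDs
    item_ids = [f"word_{i:02d}" for i in range(50)]
    sample = sample_items(item_ids, sample_size=30)

    # Placeholder ratings a human evaluator might assign to the sampled outputs
    judgments = ["insightful"] * 24 + ["uninformative"] * 4 + ["inaccurate"] * 2
    print(summarize_judgments(judgments))
    # -> roughly {'insightful': 80.0, 'uninformative': 13.3, 'inaccurate': 6.7}

The point of such a sketch is simply that reporting category percentages over a replicable sample gives a far more meaningful picture of the tool's reliability than showcasing a handful of hand-picked failures.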

We welcome constructive feedback and empirical evaluation of these analyses, especially when such critiques are grounded in representative data and informed by a clear understanding of both corpus linguistics and the limitations of AI.