4.2 Article

Binomial Confidence Intervals and Contingency Tests: Mathematical Fundamentals and the Evaluation of Alternative Methods

Journal

JOURNAL OF QUANTITATIVE LINGUISTICS
Volume 20, Issue 3, Pages 178-208

Publisher

ROUTLEDGE JOURNALS, TAYLOR & FRANCIS LTD
DOI: 10.1080/09296174.2013.799918

Keywords

-

Funding

  1. Economic and Social Research Council [RES-000-23-1286] Funding Source: researchfish
  2. ESRC [RES-000-23-1286] Funding Source: UKRI

Ask authors/readers for more resources

Many statistical methods rely on an underlying mathematical model of probability based on a simple approximation, one that is simultaneously well-known and yet frequently misunderstood. The Normal approximation to the Binomial distribution underpins a range of statistical tests and methods, including the calculation of accurate confidence intervals, performing goodness of fit and contingency tests, line- and model-fitting, and computational methods based upon these. A common mistake is in assuming that, since the probable distribution of error about the true value in the population is approximately Normally distributed, the same can be said for the error about an observation. This paper is divided into two parts: fundamentals and evaluation. First, we examine the estimation of confidence intervals using three initial approaches: the Wald (Normal) interval, the Wilson score interval and the exact Clopper-Pearson Binomial interval. Whereas the first two can be calculated directly from formulae, the Binomial interval must be approximated towards by computational search, and is computationally expensive. However this interval provides the most precise significance test, and therefore will form the baseline for our later evaluations. We also consider two further refinements: employing log-likelihood in intervals (also requiring search) and the effect of adding a continuity correction. Second, we evaluate each approach in three test paradigms. These are the single proportion interval or 2 x 1 goodness of fit test, and two variations on the common 2 x 2 contingency test. We evaluate the performance of each approach by a practitioner strategy. Since standard advice is to fall back to exact Binomial tests in conditions when approximations are expected to fail, we report the proportion of instances where one test obtains a significant result when the equivalent exact test does not, and vice versa, across an exhaustive set of possible values. We demonstrate that optimal methods are based on continuity-corrected versions of the Wilson interval or Yates' test, and that commonly-held beliefs about weaknesses of tests are misleading. Log-likelihood, often proposed as an improvement on , performs disappointingly. Finally we note that at this level of precision we may distinguish two types of 2 2 test according to whether the independent variable partitions data into independent populations, and we make practical recommendations for their use.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.2
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available