Oscillation Phenomenon of Binomial Confidence Intervals
J Reed III
Keywords
confidence intervals, coverage probability.
Citation
J Reed III. Oscillation Phenomenon of Binomial Confidence Intervals. The Internet Journal of Epidemiology. 2009 Volume 8 Number 1.
Abstract
One of the most basic and important problems in statistical practice is constructing an interval estimate of the probability of success. The textbook binomial confidence interval that has near universal acceptance is the familiar Wald binomial confidence interval (Wald-z). In recognition that the actual coverage probability of Wald-z is poor for
Introduction
The textbook binomial confidence interval that has near universal acceptance is the familiar Wald binomial method (Wald-z). This interval is easy to compute and just as easily interpreted. It is typically presented along with a justification based on the central limit theorem. Most users believe that the larger the sample size, n, the better the normal approximation, so that the actual coverage approaches the nominal level 1 - α. Textbook authors recognize that the actual coverage probability of Wald-z is poor for
Inadequate coverage of the Wald-z interval can be erratic, even when
Figure 1
Are there better alternatives to Wald-z for constructing binomial confidence intervals? Alternative methods include the “gold standard” Clopper-Pearson method (Clopper and Pearson, 1934), Wilson's Score, Wilson's Score with a continuity correction (Wilson, 1927), and adjusted Wald methods (Santner, 1998; Agresti and Coull, 1998; Borkowf, 2005). All were initially evaluated in terms of their overall coverage probability for [0, 1] for a fixed
Numerical Methods
When constructing a confidence interval we would like the actual coverage probability to be close to the nominal confidence level. However, because of the discrete nature of the binomial distribution, this is not always possible (Agresti 1998, Newcombe 1998). We use the coverage probability, CP (|
CP (|
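As an illustration of how a coverage probability can be evaluated exactly, the following minimal sketch sums the binomial probabilities of every outcome x whose interval contains the true p. The function names `wald_z` and `coverage_probability` are hypothetical, and the Wald-z interval is used only as an example of an interval method; this is not necessarily the computation used for the results reported here.

```python
from math import comb, sqrt
from statistics import NormalDist


def wald_z(x, n, conf=0.95):
    """Wald-z interval, used here only as an example interval method."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p_hat = x / n
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def coverage_probability(p, n, interval=wald_z, conf=0.95):
    """Exact coverage: sum the binomial pmf over all x whose interval covers p."""
    total = 0.0
    for x in range(n + 1):
        lb, ub = interval(x, n, conf)
        if lb <= p <= ub:
            total += comb(n, x) * p**x * (1 - p) ** (n - x)
    return total
```

Because only a finite set of outcomes x = 0, ..., n is possible, the coverage is a step function of p and n, which is why it cannot be held exactly at the nominal level.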
The Wald-z binomial confidence interval computational formula, with or without a continuity correction, is found in virtually every statistics text. These two methods are labeled by Santner as the z- and c-intervals (Santner, 1998). Santner also identifies a t-interval method that replaces the α-quantile of the standard normal with the $t_{n-1,\alpha}$ quantile of the standard t-distribution with
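Wald (z-Interval) LB = $\hat{p} - z_{1-\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$
UB = $\hat{p} + z_{1-\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$, with $\hat{p} = x/n$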
Wald (c-Interval) LB =
UB =
Wald (t-Interval) LB =
UB =
Wald (q-Interval) LB = {
UB = {
Where:
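As a minimal sketch, the z- and c-intervals could be computed as below. It assumes the usual textbook forms: the z-interval is p̂ ± z·sqrt(p̂(1 - p̂)/n) and the c-interval widens the half-width by the continuity correction 1/(2n). The function name `wald_interval` and the truncation of the bounds to [0, 1] are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(x, n, conf=0.95, continuity=False):
    """Wald z-interval, or c-interval when continuity=True.

    Assumes the standard textbook forms; bounds are truncated to [0, 1],
    which is an illustrative choice rather than part of the definition.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p_hat = x / n
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    if continuity:
        half += 1 / (2 * n)  # continuity correction
    return max(0.0, p_hat - half), min(1.0, p_hat + half)
```

Note that for x = 0 or x = n the z-interval collapses to a single point, one reason its coverage is poor near the boundaries.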
The Clopper-Pearson (CLP) binomial confidence interval is the best-known “exact” method and is considered by most to be the “gold standard” (Clopper and Pearson, 1934). The CLP lower and upper limits are defined by:
Clopper-Pearson (CLP) LB = 0 if x = 0, $(\alpha/2)^{1/n}$ if x = n
LB=[1+(
UB = $1 - (\alpha/2)^{1/n}$ if x = 0, 1 if x = n
UB=[1 + (
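The Clopper-Pearson limits can also be obtained by inverting the binomial tail probabilities directly, which avoids any quantile tables. The sketch below is one way to do that with bisection, using only the standard library; the helper names are hypothetical, and it solves P(X ≥ x | p) = α/2 for the lower limit and P(X ≤ x | p) = α/2 for the upper limit rather than using a closed-form quantile expression.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_increasing(f, iters=60):
    """Bisection on [0, 1] for a function that increases from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, conf=0.95):
    """Clopper-Pearson limits by numerically inverting the binomial tails."""
    a2 = (1 - conf) / 2
    # Lower limit: p with P(X >= x | p) = alpha/2 (0 when x = 0).
    lb = 0.0 if x == 0 else _root_increasing(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - a2)
    # Upper limit: p with P(X <= x | p) = alpha/2 (1 when x = n).
    ub = 1.0 if x == n else _root_increasing(
        lambda p: a2 - binom_cdf(x, n, p))
    return lb, ub
```

For x = 0 and x = n the numerical roots agree with the closed-form special cases 1 - (α/2)^(1/n) and (α/2)^(1/n).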
Two methods attributed to Wilson (Wilson, 1927) are the Score (S) and Score with continuity correction (SC). The LB and UB are defined by:
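Score LB = $\left(2n\hat{p} + z^2 - z\sqrt{z^2 + 4n\hat{p}(1-\hat{p})}\right) / \left(2(n + z^2)\right)$
UB = $\left(2n\hat{p} + z^2 + z\sqrt{z^2 + 4n\hat{p}(1-\hat{p})}\right) / \left(2(n + z^2)\right)$
Score - C LB = $\left[2n\hat{p} + z^2 - 1 - z\sqrt{z^2 - 2 - 1/n + 4\hat{p}\,(n(1-\hat{p}) + 1)}\right] / \left[2(n + z^2)\right]$
UB = $\left[2n\hat{p} + z^2 + 1 + z\sqrt{z^2 + 2 - 1/n + 4\hat{p}\,(n(1-\hat{p}) - 1)}\right] / \left[2(n + z^2)\right]$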
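A minimal sketch of the two score intervals follows. It assumes the commonly published closed forms, with the continuity-corrected variant written as in Newcombe (1998), and sets LB = 0 when x = 0 and UB = 1 when x = n for that variant. The function name `wilson_score` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_score(x, n, conf=0.95, continuity=False):
    """Wilson score interval, optionally with continuity correction.

    Assumes the commonly published closed forms; not taken verbatim
    from this article.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p_hat = x / n
    q_hat = 1 - p_hat
    denom = 2 * (n + z**2)
    centre = 2 * n * p_hat + z**2
    if not continuity:
        half = z * sqrt(z**2 + 4 * n * p_hat * q_hat)
        return (centre - half) / denom, (centre + half) / denom
    if x == 0:
        lb = 0.0
    else:
        root = sqrt(z**2 - 2 - 1 / n + 4 * p_hat * (n * q_hat + 1))
        lb = (centre - 1 - z * root) / denom
    if x == n:
        ub = 1.0
    else:
        root = sqrt(z**2 + 2 - 1 / n + 4 * p_hat * (n * q_hat - 1))
        ub = (centre + 1 + z * root) / denom
    return max(0.0, lb), min(1.0, ub)
```

For example, with x = 81 and n = 263 the uncorrected interval is roughly (0.255, 0.366), and the continuity-corrected interval is slightly wider on both sides.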
Agresti and Coull proposed a straightforward adjustment, the “add 4 to Wald” method, which simply adds two successes and two failures and then uses the Wald-z formula (Agresti and Coull, 1998). Alternatively, one could add $z^2/2$ successes and $z^2/2$ failures before computing the Wald-z confidence interval. The Agresti-Coull (AC) lower and upper bounds are:
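Agresti - Coull (AC) LB = $\tilde{p} - z\sqrt{\tilde{p}(1-\tilde{p})/\tilde{n}}$
UB = $\tilde{p} + z\sqrt{\tilde{p}(1-\tilde{p})/\tilde{n}}$, with $\tilde{n} = n + z^2$ and $\tilde{p} = (x + z^2/2)/\tilde{n}$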
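The adjustment described above can be sketched in a few lines: add z²/2 successes and z²/2 failures, then apply the Wald-z formula to the adjusted counts. The function name `agresti_coull` is hypothetical, and the bounds are deliberately left untruncated, so the lower bound can fall below 0 for small x.

```python
from math import sqrt
from statistics import NormalDist


def agresti_coull(x, n, conf=0.95):
    """Wald-z applied after adding z^2/2 successes and z^2/2 failures."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    n_adj = n + z**2                  # adjusted sample size
    p_adj = (x + z**2 / 2) / n_adj    # adjusted proportion
    half = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - half, p_adj + half
```

At the 95% level z² is about 3.84, so this is close to the "add 4" rule of two extra successes and two extra failures.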
Borkowf augments the original data with a single imaginary failure to compute the lower confidence bound and a single imaginary success to compute the upper confidence bound - a single augmentation with an imaginary failure or success (SAIFS) method (Borkowf, 2005). The lower and upper confidence bounds for this method are:
Borkowf - z LB =
UB =
$z_{1-\alpha/2}$ = $(1-\alpha/2)$-quantile of the standard normal distribution
Results
Figure 2 shows the oscillation of the coverage probability of the nominal 95% Wald-z, Wald-c, Wald-t, Wald-q, CP, Score, Score-c, Agresti-Coull, and Borkowf-z methods for p = 0.1 with
Figure 2
The average coverage probability for this set of methods with the minimum and maximum CP (|
Figure 11
The proportion of times when CP (|
Figure 12
Figure 3 shows the coverage probability of the nominal 95% Wald-z, Wald-c, Wald-t, Wald-q, CP, Score, Score-c, Agresti-Coull, and Borkowf-z methods for p = 0.3 with
The oscillation phenomenon continues when p = 0.3 (Figure 3). There is an increase in the number of
Figure 13
The proportion of times when CP (|
The average coverage probability, minimum and maximum CP (|
Table 2, Column 5 lists the proportion of times that these methods achieve a nominal 95% coverage. The Wald-z and Wald-t continue to demonstrate their inadequate coverage at 30.5% and 43.3%, respectively. Clopper-Pearson and Score-c methods are again conservative followed closely by the Borkowf-z method (92.2%). The Agresti-Coull and Score methods perform slightly better than chance (53.2% and 52.5%, respectively).
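Summary statistics of this kind (average, minimum, and maximum coverage, and the proportion of sample sizes at which the nominal level is reached) can be sketched as follows. The range of n used below is a hypothetical choice for illustration, not the range used for the tables, and only the Wald-z and Score intervals are included to keep the example short.

```python
from math import comb, sqrt
from statistics import NormalDist, mean

Z = NormalDist().inv_cdf(0.975)  # 95% two-sided critical value


def wald_z(x, n):
    p_hat = x / n
    half = Z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def score(x, n):
    p_hat = x / n
    half = Z * sqrt(Z**2 + 4 * n * p_hat * (1 - p_hat))
    centre = 2 * n * p_hat + Z**2
    return (centre - half) / (2 * (n + Z**2)), (centre + half) / (2 * (n + Z**2))


def coverage(p, n, interval):
    """Exact coverage probability of an interval method at (p, n)."""
    return sum(
        comb(n, x) * p**x * (1 - p) ** (n - x)
        for x in range(n + 1)
        if interval(x, n)[0] <= p <= interval(x, n)[1]
    )


def summarize(p, n_values, interval, nominal=0.95):
    """Mean, min, max coverage and the share of n reaching the nominal level."""
    cps = [coverage(p, n, interval) for n in n_values]
    return {
        "mean": mean(cps),
        "min": min(cps),
        "max": max(cps),
        "prop_at_nominal": sum(cp >= nominal for cp in cps) / len(cps),
    }


# Hypothetical range of sample sizes, chosen only for illustration.
for name, method in [("Wald-z", wald_z), ("Score", score)]:
    print(name, summarize(0.3, range(10, 201), method))
```

Plotting the individual coverage values against n, rather than summarizing them, shows the oscillation directly.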
Conclusion
The Score-c and Clopper-Pearson confidence interval methods have coverage probabilities that are bounded below by the nominal confidence level. These two methods may be considered to be too conservative, but both guarantee the actual coverage probability to meet or exceed the nominal level. Agresti, Coull, and Borkowf have proposed methods that have reasonable coverage probability properties and lessen the reliance on “luck.”
In forming a 95% confidence interval, it would seem better to use a method that guarantees that the actual coverage probability is at least 0.95. The three Wald-type confidence interval methods (Wald-z, Wald-c, and Wald-t) should be put on the shelf. Even when the general warning or 'rule of thumb' is heeded, these methods do not reach the nominal coverage probability often enough to warrant their use.