
This was something I thought about:

  1. If you have data from a coin with a very low probability of success, is it possible that the Wald confidence interval contains negative numbers and is therefore illogical? But if you bootstrap, the CI should never be illogical; i.e., in the worst case the lower bound will be exactly 0.
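
(For reference, the Wald interval here is $\hat{p} \pm z_{0.975}\sqrt{\hat{p}(1-\hat{p})/n}$, and its lower endpoint can dip below 0 whenever $\hat{p}$ is close to 0.)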

I verified this in R:

set.seed(123)
p_true <- 0.001   # true success probability
n      <- 1000    # sample size per simulated dataset
n_sims <- 100     # number of simulated datasets
n_boot <- 100     # bootstrap resamples per dataset

wald_negative      <- 0
bootstrap_negative <- 0

for (i in 1:n_sims) {
  # simulate n Bernoulli trials
  x <- rbinom(n, 1, p_true)
  successes <- sum(x)

  p_hat <- successes / n

  # Wald 95% interval: p_hat +/- 1.96 * SE
  se <- sqrt(p_hat * (1 - p_hat) / n)
  wald_lower <- p_hat - 1.96 * se
  wald_upper <- p_hat + 1.96 * se

  if (wald_lower < 0) {
    wald_negative <- wald_negative + 1
  }

  # percentile bootstrap: resample the data, recompute p_hat
  boot_estimates <- numeric(n_boot)
  for (j in 1:n_boot) {
    boot_x <- sample(x, n, replace = TRUE)
    boot_estimates[j] <- sum(boot_x) / n
  }

  boot_lower <- quantile(boot_estimates, 0.025, names = FALSE)
  boot_upper <- quantile(boot_estimates, 0.975, names = FALSE)

  if (boot_lower < 0) {
    bootstrap_negative <- bootstrap_negative + 1
  }
}

cat("Results from", n_sims, "simulations:\n")
cat("Wald CI negative:", wald_negative,
    "times (", round(100 * wald_negative / n_sims, 1), "%)\n")
cat("Bootstrap CI negative:", bootstrap_negative,
    "times (", round(100 * bootstrap_negative / n_sims, 1), "%)\n")
  2. If you have data from an exponential random variable with a rate parameter very close to 0, is it possible that the Wald confidence interval contains negative numbers and is therefore illogical? But if you bootstrap, the CI should never be illogical; i.e., in the worst case the lower bound will be exactly 0.
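
(Here the MLE is $\hat{\lambda} = 1/\bar{x}$ with asymptotic standard error $\hat{\lambda}/\sqrt{n}$, so the Wald interval is $\hat{\lambda}\,(1 \pm 1.96/\sqrt{n})$; note that its lower endpoint is negative only when $\sqrt{n} < 1.96$, i.e. for $n \le 3$.)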

I also verified this in R:

set.seed(123)
lambda_true <- 0.001   # true rate parameter
n      <- 100          # sample size per simulated dataset
n_sims <- 100          # number of simulated datasets
n_boot <- 100          # bootstrap resamples per dataset

wald_negative      <- 0
bootstrap_negative <- 0

for (i in 1:n_sims) {
  x <- rexp(n, rate = lambda_true)

  # MLE of the rate parameter
  lambda_hat <- 1 / mean(x)

  # Wald 95% interval: lambda_hat +/- 1.96 * lambda_hat / sqrt(n)
  se <- lambda_hat / sqrt(n)
  wald_lower <- lambda_hat - 1.96 * se
  wald_upper <- lambda_hat + 1.96 * se

  if (wald_lower < 0) {
    wald_negative <- wald_negative + 1
  }

  # percentile bootstrap: resample the data, recompute the MLE
  boot_estimates <- numeric(n_boot)
  for (j in 1:n_boot) {
    boot_x <- sample(x, n, replace = TRUE)
    boot_estimates[j] <- 1 / mean(boot_x)
  }

  boot_lower <- quantile(boot_estimates, 0.025, names = FALSE)
  boot_upper <- quantile(boot_estimates, 0.975, names = FALSE)

  if (boot_lower < 0) {
    bootstrap_negative <- bootstrap_negative + 1
  }
}

cat("Results from", n_sims, "simulations:\n")
cat("True lambda:", lambda_true, "\n")
cat("Wald CI negative:", wald_negative,
    "times (", round(100 * wald_negative / n_sims, 1), "%)\n")
cat("Bootstrap CI negative:", bootstrap_negative,
    "times (", round(100 * bootstrap_negative / n_sims, 1), "%)\n")

If this is true, is the bootstrap CI always more advantageous than the Wald CI? Now with modern computers, where simulation is not a problem, won't a bootstrap CI almost always be better (i.e., avoid the illogical-interval problem), if not the same as Wald?

  • As bootstrapping resamples the original data, a bootstrap sample can never include impossible values (unless they occur through measurement error). Turn and turn about, consider bootstrapping the maximum or minimum: a bootstrap sample can never include values beyond the observed maximum or minimum, and confidence intervals will not be helpful. This is, I believe, an utterly standard example, but the reminder may help. – Nick Cox
  • I wonder how bad a problem the "illogical problem" in your sense actually is. If you know the minimum is 0, you can just cut the values lower than 0 from your CI; problem solved.
  • Note, by the way, that Wald (and anything that computes a symmetric interval based on a standard error) implicitly assumes a normal distribution, under which negative values are entirely possible. Under this assumption the result isn't "illogical". Application to other settings relies on the central limit theorem, i.e., a large enough sample, and it is well known that in a binomial problem with a very low success probability the required sample size is quite large. Note also that the bootstrap for small samples can be quite imprecise, "illogical" or not.
  • This thread is very much related regarding the Wald CI.
  • My reaction to reading the title was "but of course!" This can happen whenever you employ a statistic whose values (on at least one resample of the data) can be "illogical," such as lying in an impossible range of the estimand. One example of such a statistic would be when estimating a component of variance by subtracting one estimated variance from another. You might elect to keep negative results in the spirit of yielding an (approximately) unbiased estimator. – whuber

2 Answers


There are a few kinds of bootstrap confidence intervals, and it appears you're using the percentile method. Yes, percentile bootstrap confidence intervals will never cover infeasible parameter space, assuming the statistic you're using can't itself be infeasible (for example, the sample mean can't be negative if all the data are non-negative). This is because no bootstrapped point estimate can be infeasible, and the percentile method calculates the confidence bounds directly from the bootstrapped point estimates.
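
For reference, here is a sketch of the same percentile interval computed with the boot package instead of a hand-rolled loop; the statistic function, the success probability, and R = 2000 are illustrative choices, not anything from the question:

library(boot)

set.seed(123)
x <- rbinom(1000, 1, 0.01)  # illustrative rare-event Bernoulli data

# the statistic must accept the data and a vector of resample indices
b <- boot(data = x, statistic = function(d, i) mean(d[i]), R = 2000)

# percentile interval: its endpoints are quantiles of the bootstrapped
# estimates, so they can never leave the statistic's feasible range
boot.ci(b, type = "perc")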

Just as there are a few bootstrap confidence intervals, so too are there several binomial confidence intervals. If you are studying a rare outcome, it may be advantageous to use something like a Wilson interval or a Clopper–Pearson interval.
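
Both are available in base R, for what it's worth (the counts here, 1 success in 1000 trials, are just an illustrative rare-event example):

successes <- 1
n <- 1000

# Wilson score interval (prop.test without continuity correction)
prop.test(successes, n, correct = FALSE)$conf.int

# Clopper-Pearson "exact" interval
binom.test(successes, n)$conf.int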

Similarly, for a random variable supported on the positive reals -- such as an exponential random variable -- it may make sense to calculate the confidence interval in log space and then transform the interval via the exponential.
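
As a sketch of that idea for the exponential-rate example (using the standard delta-method result that the standard error of $\log\hat{\lambda}$ is approximately $1/\sqrt{n}$):

set.seed(123)
x <- rexp(100, rate = 0.001)
n <- length(x)

lambda_hat <- 1 / mean(x)  # MLE of the rate

# Wald interval on the log scale, then back-transformed;
# se(log(lambda_hat)) is approximately 1/sqrt(n) by the delta method
log_ci <- log(lambda_hat) + c(-1, 1) * 1.96 / sqrt(n)
exp(log_ci)  # both endpoints are guaranteed positive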

Bootstrapping is a great invention, but analytic results extend beyond the Wald interval, especially for bounded cases like the ones you provide. These analytic results can have good coverage in some circumstances, and the percentile confidence interval is by no means the best of the bootstrap confidence intervals.
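
For instance, the BCa (bias-corrected and accelerated) interval adjusts the percentile endpoints for bias and skewness; a minimal sketch, reusing the exponential-rate setup from the question:

library(boot)

set.seed(123)
x <- rexp(100, rate = 0.001)

# bootstrap the exponential-rate MLE
b <- boot(x, statistic = function(d, i) 1 / mean(d[i]), R = 2000)

# compare the plain percentile interval with the BCa interval
boot.ci(b, type = c("perc", "bca"))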

  • (+1) Even though "percentile bootstrap confidence intervals will never cover infeasible parameter space," they might well be "illogical" in a more limited sense: the confidence intervals might not even cover the true value of a statistic if the estimator is biased. See this 10-year-old question for the case of Shannon entropy. – EdM
  • @EdM In your opinion, is the fact that the confidence intervals might not even cover the true value of a statistic a problem with the bootstrap, or with the estimator? Could I use an unbiased estimator and still have the bootstrap CI fail to cover the estimand systematically (not just as an occasional false positive)?
  • As far as I understand (which unfortunately isn't really that far), the problem is with a biased estimator. If there's no bias or skewness in the quantity being estimated and the quantity is pivotal, then I understand that most any bootstrapping should be OK. The BCa bootstrap should help if there is bias/skewness. – EdM

I will just add a few comments, on the margins, to @DemetriPananos' very good answer.

There are many (really, many) methods for computing a binomial proportion CI; the Wald method is about as poor as it gets (the first section of Wikipedia's article on binomial CIs is titled "Problems with using a normal approximation or 'Wald interval'"). A normal distribution is a very poor approximation for a binomial when p is close to 0 or 1. This is well known, and is a reason why many other methods came about (Wilson, Agresti–Coull, Jeffreys, Clopper–Pearson, Blaker, etc.). Wald had a place in the days before computers; I am not sure why it is still being taught today. Clopper–Pearson, or Blaker, should be the default (they used to be painful to compute by hand, in part because of the factorials).
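
To make the difference concrete, here is a quick sketch comparing lower bounds for an illustrative rare-event dataset of 1 success in 1000 trials:

successes <- 1
n <- 1000
p_hat <- successes / n

# Wald lower bound: goes negative for counts this small
p_hat - 1.96 * sqrt(p_hat * (1 - p_hat) / n)

# Clopper-Pearson interval from base R: always within [0, 1]
binom.test(successes, n)$conf.int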

I also want to address your final sentence: "Now with modern computers, where simulation is not a problem, won't a bootstrap CI almost always be better (i.e., avoid the illogical-interval problem), if not the same as Wald?" Bootstrapping is certainly a major advance for statistics, but it is not a "silver bullet". The major assumption for any of the many bootstrapping methods to provide "reasonable" answers is that the sample be representative of the population. This is fundamentally untestable (since we do not know the population's distribution), so in practice it means that the sample size should be large, and typically larger than what would be required for parametric methods.

  • For what it's worth, Frank Harrell prefers the Wilson CI for binomial proportions over the alternatives (see ?Hmisc::binconf in R), and this is also the only preferred method within the UK civil service (or at least some parts of it I'm familiar with; it's a big organisation!). – Silverfish
  • People who prefer the Wilson CI give the following references in support of their case: Agresti, A., and Coull, B. (1998). Approximate is better than 'exact' for interval estimation of binomial proportions. Am Stat 52: 119–126; Newcombe, R. (1998). Two-sided confidence intervals for the single proportion: comparison of seven methods. Stat Med 17: 857–872; Newcombe, R., and Altman, D. (2000). Proportions and their differences, in Statistics with Confidence (BMJ Books). – Silverfish
  • As for why the Wald interval is still taught (and I'm afraid I've taught many different syllabuses where it was compulsory and no other method was to be seen!), the ease of computation is still a selling point, particularly when students take pen-and-paper exams with a calculator rather than a computer. It was a topic in A-Level Mathematics / Further Mathematics (final year of high school), for example, but many undergraduate and even graduate exams follow a similar format. It's also well suited to approximate mental computation, explaining how the margin of error in polls depends on $n$ (and $p$), etc. – Silverfish
  • @Silverfish, thanks for the additional information; very informative. I will definitely take a look at the "Approximate is better than 'exact'" paper; provocative title for sure :-). I know many find the "exact" binomial "too conservative", but I never got why one would want more than $\alpha$% Type I errors when the null is true. It is not as if Type I errors disappear the moment the null is "barely false"; they just become what some have called Type III errors (or what A. Gelman has called Type S and Type M errors). Thanks again. – jginestet
  • I believe that's the paper where the Agresti–Coull interval was introduced, which is somewhat curious, given that it's frequently cited in support of the Wilson interval! – Silverfish
