bootstrap-stat

Bootstrap-stat implements bootstrap methods for statistical inference, following Efron & Tibshirani’s An Introduction to the Bootstrap [ET93]. The bootstrap provides distribution-free estimates of standard errors, confidence intervals, and significance levels by resampling the observed data rather than assuming a parametric model.
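
To make the resampling idea concrete, here is a minimal NumPy-only sketch of a bootstrap standard error. It does not use bootstrap-stat's API: the helper name `bootstrap_se`, the synthetic sample, and the choice of 2000 replicates are all illustrative assumptions.

```python
import numpy as np


def bootstrap_se(data, statistic, n_boot=2000, seed=0):
    """Estimate the standard error of `statistic` by resampling `data`.

    Hypothetical helper for illustration only; not part of bootstrap-stat.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = len(data)
    replicates = np.empty(n_boot)
    for b in range(n_boot):
        # Draw n observations with replacement from the observed sample.
        idx = rng.integers(0, n, size=n)
        replicates[b] = statistic(data[idx])
    # The spread of the replicates estimates the standard error.
    return replicates.std(ddof=1)


# Synthetic data (an assumption, used only to have something to resample).
rng = np.random.default_rng(42)
sample = rng.normal(loc=10.0, scale=2.0, size=50)

se_boot = bootstrap_se(sample, np.mean)
# For the mean, a textbook formula exists and should roughly agree.
se_formula = sample.std(ddof=1) / np.sqrt(len(sample))
```

For the sample mean the two estimates should be close; the value of the bootstrap is that the same loop works unchanged for statistics with no convenient formula, such as a median or a correlation.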

The library provides:

Standard errors: bootstrap, jackknife, and infinitesimal jackknife
Confidence intervals: percentile, BCa (bias-corrected and accelerated), ABC, bootstrap-t, and calibrated
Bias estimation and correction
Significance testing: achieved significance levels and power analysis
Prediction error: optimism method, .632 and .632+ bootstrap

Quickstart

The core workflow: wrap observed data in an EmpiricalDistribution, define a statistic function, then call whichever inference function you need. bcanon_interval() is the recommended default for confidence intervals.

import numpy as np
import bootstrap_stat as bp

# Law school data: LSAT/GPA for n=15 schools [ET93], Table 3.2
df = bp.law_data()

def correlation(df):
    return np.corrcoef(df["LSAT"], df["GPA"])[0, 1]

dist = bp.EmpiricalDistribution(df)
theta_hat = correlation(df)
lo, hi = bp.bcanon_interval(dist, correlation, theta_hat)
# 90% BCa confidence interval for the correlation
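
For intuition about what an interval function computes, the following is a rough NumPy-only sketch of the simpler percentile interval for a correlation. It is not the BCa method and not bootstrap-stat's implementation: the name `percentile_interval`, the synthetic paired data, and the convention that `alpha` is the mass cut from each tail (so `alpha=0.05` gives a 90% interval) are assumptions made for illustration.

```python
import numpy as np


def percentile_interval(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the correlation of paired data.

    Hypothetical helper for illustration only; not part of bootstrap-stat.
    `alpha` is the probability cut from EACH tail (assumed convention).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    y = np.asarray(y)
    n = len(x)
    replicates = np.empty(n_boot)
    for b in range(n_boot):
        # Resample whole (x, y) pairs so the dependence is preserved.
        idx = rng.integers(0, n, size=n)
        replicates[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    # The interval endpoints are quantiles of the bootstrap replicates.
    lo, hi = np.quantile(replicates, [alpha, 1 - alpha])
    return lo, hi


# Synthetic correlated pairs (an assumption; not the law school data).
rng = np.random.default_rng(1)
x = rng.normal(size=30)
y = 0.8 * x + 0.6 * rng.normal(size=30)

lo, hi = percentile_interval(x, y)
```

BCa refines this idea by adjusting which quantiles are used, correcting for bias and skewness in the bootstrap distribution.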

The methods in this library are described in detail in [ET93]. The .632+ bootstrap is from [ET97]; [Koh95] and [Arl10] discuss its limitations relative to cross-validation. Full citations are on the References page.
