Variance Reduction¶
Lower variance means smaller required sample sizes and faster experiments. splita provides 14 variance reduction methods.
CUPED¶
Controlled-experiment Using Pre-Experiment Data. The most widely used variance reduction technique at companies like Microsoft, Booking.com, and Netflix.
from splita.variance import CUPED
from splita import Experiment
import numpy as np
rng = np.random.default_rng(42)
# Pre-experiment data (e.g., last week's page views)
pre_ctrl = rng.normal(10, 2, size=1000)
pre_trt = rng.normal(10, 2, size=1000)
# Experiment data (correlated with pre-experiment)
ctrl = pre_ctrl + rng.normal(0, 1, 1000)
trt = pre_trt + 0.5 + rng.normal(0, 1, 1000)
# Apply CUPED
cuped = CUPED()
ctrl_adj, trt_adj = cuped.fit_transform(ctrl, trt, pre_ctrl, pre_trt)
print(f"Variance reduction: {cuped.variance_reduction_:.0%}") # ~75%
# Run the test on adjusted data
result = Experiment(ctrl_adj, trt_adj).run()
Note
CUPED works best when the pre-experiment covariate is highly correlated with the outcome. Typical reductions range from 30-80%.
CUPAC¶
ML-predicted covariate adjustment. Uses cross-validated ML models to predict the outcome from covariates, then adjusts using the predictions. Requires pip install splita[ml].
from splita.variance import CUPAC
import numpy as np
rng = np.random.default_rng(42)
n = 2000
# Multiple covariates as a feature matrix
X_ctrl = rng.normal(0, 1, (n, 5))
X_trt = rng.normal(0, 1, (n, 5))
ctrl = X_ctrl @ rng.normal(1, 0.5, 5) + rng.normal(0, 1, n)
trt = X_trt @ rng.normal(1, 0.5, 5) + 0.5 + rng.normal(0, 1, n)
cupac = CUPAC()
ctrl_adj, trt_adj = cupac.fit_transform(ctrl, trt, X_ctrl, X_trt)
print(f"Variance reduction: {cupac.variance_reduction_:.0%}")
OutlierHandler¶
Outliers inflate variance. Handle them before analysis.
from splita.variance import OutlierHandler
handler = OutlierHandler(method='winsorize')
ctrl_clean, trt_clean = handler.fit_transform(ctrl, trt)
Available methods:
| Method | Description |
|---|---|
'winsorize' |
Cap extreme values at percentile thresholds |
'trim' |
Remove extreme values entirely |
'iqr' |
Cap based on IQR fences |
Warning
Always fit on pooled data (both groups together) to avoid introducing bias. fit_transform() handles this automatically.
MultivariateCUPED¶
Extension of CUPED for multiple covariates without requiring ML:
from splita.variance import MultivariateCUPED
import numpy as np
rng = np.random.default_rng(42)
n = 1000
ctrl = rng.normal(25, 8, n)
trt = rng.normal(26, 8, n)
# Multiple pre-experiment covariates
pre_ctrl = rng.normal(0, 1, (n, 3))
pre_trt = rng.normal(0, 1, (n, 3))
mcuped = MultivariateCUPED()
ctrl_adj, trt_adj = mcuped.fit_transform(ctrl, trt, pre_ctrl, pre_trt)
AdaptiveWinsorizer¶
Automatically finds the optimal capping thresholds via grid search:
from splita.variance import AdaptiveWinsorizer
winsorizer = AdaptiveWinsorizer()
ctrl_clean, trt_clean = winsorizer.fit_transform(ctrl, trt)
print(f"Optimal lower: {winsorizer.lower_percentile_}")
print(f"Optimal upper: {winsorizer.upper_percentile_}")
RegressionAdjustment¶
Lin's regression adjustment with HC2 robust standard errors:
from splita.variance import RegressionAdjustment
import numpy as np
rng = np.random.default_rng(42)
n = 500
ctrl = rng.normal(25, 8, n)
trt = rng.normal(26, 8, n)
x_ctrl = rng.normal(0, 1, n)
x_trt = rng.normal(0, 1, n)
ra = RegressionAdjustment()
result = ra.fit(ctrl, trt, x_ctrl, x_trt)
print(result.ate) # adjusted treatment effect
print(result.ci) # confidence interval
print(result.pvalue)
DoubleML¶
Double/debiased machine learning for treatment effect estimation:
from splita.variance import DoubleML
import numpy as np
rng = np.random.default_rng(42)
n = 1000
X_ctrl = rng.normal(0, 1, (n, 5))
X_trt = rng.normal(0, 1, (n, 5))
ctrl = X_ctrl @ rng.normal(1, 0.5, 5) + rng.normal(0, 1, n)
trt = X_trt @ rng.normal(1, 0.5, 5) + 0.5 + rng.normal(0, 1, n)
dml = DoubleML()
result = dml.fit(ctrl, trt, X_ctrl, X_trt)
print(result.ate)
print(result.ci)
The auto() function¶
The top-level auto() function automatically selects and applies the best variance reduction strategy:
from splita import auto
import numpy as np
rng = np.random.default_rng(42)
ctrl = rng.normal(25, 8, 1000)
trt = rng.normal(26, 8, 1000)
pre_ctrl = rng.normal(10, 2, 1000)
pre_trt = rng.normal(10, 2, 1000)
result = auto(ctrl, trt, pre_control=pre_ctrl, pre_treatment=pre_trt)
print(result)
All 14 methods¶
| Class | Description | When to use |
|---|---|---|
CUPED |
Pre-experiment covariate adjustment | Single covariate, most common |
CUPAC |
ML-predicted adjustment | Multiple covariates, non-linear relationships |
OutlierHandler |
Winsorize/trim/IQR outliers | Heavy-tailed metrics (revenue) |
MultivariateCUPED |
Multi-covariate CUPED | Multiple covariates, no ML needed |
RegressionAdjustment |
Lin's OLS with HC2 SEs | Linear covariate relationships |
AdaptiveWinsorizer |
Auto-tuned capping | When you don't know the right percentiles |
DoubleML |
Double/debiased ML | High-dimensional confounders |
ClusterBootstrap |
Cluster-level bootstrap | Within-cluster correlation |
InExperimentVR |
In-experiment control covariates | No pre-experiment data available |
NonstationaryAdjustment |
Time-series decomposition | Non-stationary treatment effects |
PostStratification |
Post-experiment stratification | Known population strata |
PredictionPoweredInference |
ML predictions + small labels | Limited labeled data |
RobustMeanEstimator |
Huber/Catoni/MoM estimators | Extremely heavy tails |
TrimmedMeanEstimator |
Symmetric tail trimming | Symmetric outlier distributions |