Inference Module

The causalkit.inference package provides statistical inference tools for causal analysis across several estimands:

ATT: Average Treatment effect on the Treated
ATE: Average Treatment Effect
CATE: Conditional Average Treatment Effect (per-observation signals)
GATE: Grouped Average Treatment Effects

Overview

At a glance:

Simple tests for A/B outcomes (t-test, two-proportion z-test)
DoubleML-based estimators for ATE and ATT
DoubleML-based CATE signals and GATE grouping/intervals

API Reference (Package)

`ttest`	Perform a t-test on a CausalData object to compare the outcome variable between treated (T=1) and control (T=0) groups.
`conversion_z_test`	Perform a two-proportion z-test on a CausalData object with a binary outcome (conversion).
`bootstrap_diff_means`	Bootstrap inference for difference in means between treated (T=1) and control (T=0).
`dml`	Estimate average treatment effects using DoubleML's interactive regression model (IRM).
`causalforestdml`	Estimate average treatment effects using EconML's CausalForestDML.
`dml_att`	Estimate average treatment effects on the treated using DoubleML's interactive regression model (IRM).
`cate_esimand`	Estimate per-observation CATEs using DoubleML IRM and return a DataFrame with a new 'cate' column.
`gate_esimand`	Estimate Group Average Treatment Effects (GATEs) by grouping observations using CATE-based quantiles unless custom groups are provided.

ATT Utilities

T-test inference for CausalData objects (ATT context).

causalkit.inference.att.ttest.ttest(data, confidence_level=0.95)

Perform a t-test on a CausalData object to compare the outcome variable between treated (T=1) and control (T=0) groups. Returns differences and confidence intervals.

Parameters:

data (CausalData) – The CausalData object containing treatment and outcome variables.
confidence_level (float, default 0.95) – The confidence level for calculating confidence intervals (between 0 and 1).

Returns:

A dictionary containing:

p_value: The p-value from the t-test
absolute_difference: The absolute difference between treatment and control means
absolute_ci: Tuple of (lower, upper) bounds for the absolute difference confidence interval
relative_difference: The relative difference (percentage change) between treatment and control means
relative_ci: Tuple of (lower, upper) bounds for the relative difference confidence interval

Return type:

Dict[str, Any]

Raises:

ValueError – If the CausalData object doesn’t have both treatment and outcome variables defined, or if the treatment variable is not binary.
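The returned quantities can be illustrated with a minimal sketch, assuming NumPy and SciPy are available. This is not causalkit's implementation: the helper name welch_ttest_summary is hypothetical, the choice of Welch's unequal-variance test is an assumption, and the relative interval simply rescales the absolute one by the control mean (so it assumes a positive control mean and ignores the sampling noise in that mean).

```python
import numpy as np
from scipy import stats


def welch_ttest_summary(treated, control, confidence_level=0.95):
    """Hypothetical helper: Welch's t-test plus difference intervals."""
    treated = np.asarray(treated, dtype=float)
    control = np.asarray(control, dtype=float)

    mean_t, mean_c = treated.mean(), control.mean()
    # Variance of each sample mean (unbiased sample variance divided by n).
    var_t = treated.var(ddof=1) / treated.size
    var_c = control.var(ddof=1) / control.size
    se = np.sqrt(var_t + var_c)

    # Welch-Satterthwaite approximation to the degrees of freedom.
    dof = (var_t + var_c) ** 2 / (
        var_t ** 2 / (treated.size - 1) + var_c ** 2 / (control.size - 1)
    )

    p_value = stats.ttest_ind(treated, control, equal_var=False).pvalue
    t_crit = stats.t.ppf(1 - (1 - confidence_level) / 2, dof)

    diff = mean_t - mean_c
    absolute_ci = (diff - t_crit * se, diff + t_crit * se)
    # Assumption: relative values are the absolute ones divided by the
    # control mean, expressed in percent.
    relative_ci = tuple(float(100 * bound / mean_c) for bound in absolute_ci)

    return {
        "p_value": float(p_value),
        "absolute_difference": float(diff),
        "absolute_ci": tuple(float(b) for b in absolute_ci),
        "relative_difference": float(100 * diff / mean_c),
        "relative_ci": relative_ci,
    }
```

Calling welch_ttest_summary(treated_values, control_values) returns a dictionary with the same keys as listed under Returns, which makes it easy to compare against the output of ttest on the same data.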

Two-proportion z-test for conversion data in CausalData (ATT context).

Compares conversion rates between treated (T=1) and control (T=0) groups. Returns p-value, absolute/relative differences, and their confidence intervals (similar structure to inference.att.ttest).

causalkit.inference.att.conversion_z_test.conversion_z_test(data, confidence_level=0.95)

Perform a two-proportion z-test on a CausalData object with a binary outcome (conversion).

Parameters:

data (CausalData) – The CausalData object containing treatment and outcome variables.
confidence_level (float, default 0.95) – The confidence level for calculating confidence intervals (between 0 and 1).

Returns:

A dictionary containing:

p_value: Two-sided p-value from the z-test
absolute_difference: Difference in conversion rates (treated - control)
absolute_ci: Tuple (lower, upper) for the absolute difference CI
relative_difference: Percentage change relative to control rate
relative_ci: Tuple (lower, upper) for the relative difference CI

Return type:

Dict[str, Any]

Raises:

ValueError – If treatment/outcome are missing, treatment is not binary, outcome is not binary, groups are empty, or confidence_level is outside (0, 1).
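The arithmetic behind a two-proportion z-test can be sketched with the standard library alone. The function below is a hypothetical illustration, not causalkit's code: it takes raw counts instead of a CausalData object, uses the pooled standard error for the test statistic and the unpooled one for the interval (a common textbook convention, assumed here), and derives the relative interval by dividing the absolute bounds by the control rate.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_t, n_t, conv_c, n_c, confidence_level=0.95):
    """Hypothetical helper: z-test on conversion counts per group."""
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be in (0, 1)")
    if n_t <= 0 or n_c <= 0:
        raise ValueError("both groups must be non-empty")

    p_t, p_c = conv_t / n_t, conv_c / n_c
    if p_c == 0:
        raise ValueError("relative difference is undefined for a zero control rate")

    # Pooled rate under H0 (equal conversion rates) drives the test statistic.
    pooled = (conv_t + conv_c) / (n_t + n_c)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n_t + 1 / n_c))
    if se_pooled == 0:
        raise ValueError("no variation in the outcome")
    z = (p_t - p_c) / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error for the interval around the difference.
    se_unpooled = sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    diff = p_t - p_c
    absolute_ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

    return {
        "p_value": p_value,
        "absolute_difference": diff,
        "absolute_ci": absolute_ci,
        # Assumption: percentages relative to the control rate, treating
        # that rate as fixed.
        "relative_difference": 100 * diff / p_c,
        "relative_ci": tuple(100 * bound / p_c for bound in absolute_ci),
    }
```

For example, two_proportion_z_test(120, 1000, 100, 1000) compares a 12% treated conversion rate with a 10% control rate.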

DoubleML implementation for estimating average treatment effects on the treated.

This module provides a function to estimate average treatment effects on the treated using DoubleML.

causalkit.inference.att.dml_att.dml_att(data, ml_g=None, ml_m=None, n_folds=5, n_rep=1, confidence_level=0.95)

Estimate average treatment effects on the treated using DoubleML’s interactive regression model (IRM).

Parameters:

data (CausalData) – The CausalData object containing the treatment, outcome, and confounder variables.
ml_g (estimator, optional) – A machine learner implementing fit() and predict() methods for the nuisance function g_0(D,X) = E[Y|X,D]. If None, a CatBoostRegressor configured to use all CPU cores is used.
ml_m (classifier, optional) – A machine learner implementing fit() and predict_proba() methods for the nuisance function m_0(X) = E[D|X]. If None, a CatBoostClassifier configured to use all CPU cores is used.
n_folds (int, default 5) – Number of folds for cross-fitting.
n_rep (int, default 1) – Number of repetitions for the sample splitting.
confidence_level (float, default 0.95) – The confidence level for calculating confidence intervals (between 0 and 1).

Returns:

A dictionary containing:

coefficient: The estimated average treatment effect on the treated
std_error: The standard error of the estimate
p_value: The p-value for the null hypothesis that the effect is zero
confidence_interval: Tuple of (lower, upper) bounds for the confidence interval
model: The fitted DoubleMLIRM object

Return type:

Dict[str, Any]

Raises:

ValueError – If the CausalData object doesn’t have treatment, outcome, and confounder variables defined, or if the treatment variable is not binary.

Examples

>>> from causalkit.data import generate_rct_data
>>> from causalkit.data import CausalData
>>> from causalkit.inference.att import dml_att
>>>
>>> # Generate data
>>> df = generate_rct_data()
>>>
>>> # Create causaldata object
>>> ck = CausalData(
...     df=df,
...     outcome='outcome',
...     treatment='treatment',
...     confounders=['age', 'invited_friend']
... )
>>>
>>> # Estimate ATT using DoubleML
>>> results = dml_att(ck)
>>> print(f"ATT: {results['coefficient']:.4f}")
>>> print(f"Standard Error: {results['std_error']:.4f}")
>>> print(f"P-value: {results['p_value']:.4f}")
>>> print(f"Confidence Interval: {results['confidence_interval']}")

ATE Utilities

DoubleML implementation for estimating average treatment effects.

This module provides a function to estimate average treatment effects using DoubleML.

causalkit.inference.ate.dml_ate.dml_ate(data, ml_g=None, ml_m=None, n_folds=5, n_rep=1, score='ATE', confidence_level=0.95)

Estimate average treatment effects using DoubleML’s interactive regression model (IRM).

Parameters:

data (CausalData) – The CausalData object containing the treatment, outcome, and confounder variables.
ml_g (estimator, optional) – A machine learner implementing fit() and predict() methods for the nuisance function g_0(D,X) = E[Y|X,D]. If None, a CatBoostRegressor configured to use all CPU cores is used.
ml_m (classifier, optional) – A machine learner implementing fit() and predict_proba() methods for the nuisance function m_0(X) = E[D|X]. If None, a CatBoostClassifier configured to use all CPU cores is used.
n_folds (int, default 5) – Number of folds for cross-fitting.
n_rep (int, default 1) – Number of repetitions for the sample splitting.
score (str, default "ATE") – The score function to use: “ATE” or “ATTE”.
confidence_level (float, default 0.95) – The confidence level for calculating confidence intervals (between 0 and 1).

Returns:

A dictionary containing:

coefficient: The estimated average treatment effect
std_error: The standard error of the estimate
p_value: The p-value for the null hypothesis that the effect is zero
confidence_interval: Tuple of (lower, upper) bounds for the confidence interval
model: The fitted DoubleMLIRM object

Return type:

Dict[str, Any]

Raises:

ValueError – If the CausalData object doesn’t have treatment, outcome, and confounder variables defined, or if the treatment variable is not binary.

Examples

>>> from causalkit.data import generate_rct_data
>>> from causalkit.data import CausalData
>>> from causalkit.inference.ate import dml_ate
>>>
>>> # Generate data
>>> df = generate_rct_data()
>>>
>>> # Create causaldata object
>>> ck = CausalData(
...     df=df,
...     outcome='outcome',
...     treatment='treatment',
...     confounders=['age', 'invited_friend']
... )
>>>
>>> # Estimate ATE using DoubleML
>>> results = dml_ate(ck)
>>> print(f"ATE: {results['coefficient']:.4f}")
>>> print(f"Standard Error: {results['std_error']:.4f}")
>>> print(f"P-value: {results['p_value']:.4f}")
>>> print(f"Confidence Interval: {results['confidence_interval']}")

CATE Utilities

DoubleML implementation for estimating CATE (per-observation orthogonal signals).

This module provides a function that, given a CausalData object, fits a DoubleML IRM model and augments the data with a new column ‘cate’ that contains the orthogonal signals (an estimate of the conditional average treatment effect for each unit).

causalkit.inference.cate.cate_esimand.cate_esimand(data, ml_g=None, ml_m=None, n_folds=5, n_rep=1, use_blp=False, X_new=None)

Estimate per-observation CATEs using DoubleML IRM and return a DataFrame with a new ‘cate’ column.

Parameters:

data (CausalData) – A CausalData object with defined outcome (outcome), treatment (binary 0/1), and confounders.
ml_g (estimator, optional) – ML learner for outcome regression g(D, X) = E[Y | D, X] supporting fit/predict. Defaults to CatBoostRegressor if None.
ml_m (classifier, optional) – ML learner for propensity m(X) = P[D=1 | X] supporting fit/predict_proba. Defaults to CatBoostClassifier if None.
n_folds (int, default 5) – Number of folds for cross-fitting.
n_rep (int, default 1) – Number of repetitions for sample splitting.
use_blp (bool, default False) – If True and X_new is provided, returns CATEs from obj.blp_predict(X_new), aligned to the rows of X_new. If False (default), uses obj._orthogonal_signals (in-sample estimates) and appends them to a copy of data.df.
X_new (pd.DataFrame, optional) – New covariate matrix for out-of-sample CATE prediction via best linear predictor. Must contain the same feature columns as the confounders in data.

Returns:

If use_blp is False: returns a copy of data.df with a new column ‘cate’. If use_blp is True and X_new is provided: returns a DataFrame with ‘cate’ column for X_new rows.

Return type:

pd.DataFrame

Raises:

ValueError – If treatment is not binary 0/1 or required metadata is missing.
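To make "orthogonal signals" concrete, here is a minimal sketch of the doubly robust (AIPW) signal that corresponds to the ATE score of an interactive regression model. The function name and its inputs are hypothetical: in practice the nuisance predictions g0_hat, g1_hat, and m_hat would come from cross-fitted learners, and whether these values match obj._orthogonal_signals exactly in a given DoubleML version is an assumption.

```python
def irm_orthogonal_signals(y, d, g0_hat, g1_hat, m_hat):
    """Hypothetical helper: per-observation doubly robust signals.

    y      : observed outcomes
    d      : binary treatment indicators (0 or 1)
    g0_hat : predicted outcomes under control, E[Y | D=0, X]
    g1_hat : predicted outcomes under treatment, E[Y | D=1, X]
    m_hat  : predicted propensities, P[D=1 | X], strictly between 0 and 1
    """
    signals = []
    for yi, di, g0, g1, m in zip(y, d, g0_hat, g1_hat, m_hat):
        # Inverse-propensity-weighted residual corrects the plug-in
        # difference g1 - g0 for errors in the outcome model.
        correction = di * (yi - g1) / m - (1 - di) * (yi - g0) / (1 - m)
        signals.append(g1 - g0 + correction)
    return signals


# Toy inputs: one treated and one control unit with made-up predictions.
signals = irm_orthogonal_signals(
    y=[3.0, 1.0],
    d=[1, 0],
    g0_hat=[1.0, 1.0],
    g1_hat=[2.0, 2.0],
    m_hat=[0.5, 0.5],
)
average_effect = sum(signals) / len(signals)
```

Averaging the signals over all observations gives an ATE-style estimate, while averaging them within groups gives group-level effects.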

GATE Utilities

Group Average Treatment Effect (GATE) estimation using DoubleML orthogonal signals.

This module provides a function that, given a (possibly filtered) CausalData object, fits a DoubleML IRM model, computes per-observation CATEs (orthogonal signals), forms groups (by default CATE quintiles), and returns group-level estimates (theta), standard errors, p-values, and confidence intervals.

It prefers DoubleML’s native gate() and confint() methods when they are available; otherwise it falls back to a normal approximation based on each group’s mean orthogonal signal and the standard error of that mean.
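The normal-approximation fallback described above might look roughly like the following standard-library sketch. The function name, the list-of-dicts output, and the use of the sample standard deviation (n - 1 denominator) are assumptions for illustration; the library's own fallback may differ in these details.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def gate_normal_approx(signals, groups, confidence_level=0.95):
    """Hypothetical helper: group means of signals with normal-theory CIs."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)

    # Collect the per-observation signals for each group label.
    by_group = {}
    for signal, label in zip(signals, groups):
        by_group.setdefault(label, []).append(signal)

    rows = []
    for label in sorted(by_group):
        values = by_group[label]
        if len(values) < 2:
            raise ValueError(f"group {label!r} needs at least two observations")
        theta = mean(values)
        # Standard error of the group mean.
        std_error = stdev(values) / sqrt(len(values))
        if std_error == 0:
            raise ValueError(f"group {label!r} has no variation")
        # Two-sided p-value for H0: theta = 0.
        p_value = 2 * (1 - NormalDist().cdf(abs(theta / std_error)))
        rows.append(
            {
                "group": label,
                "n": len(values),
                "theta": theta,
                "std_error": std_error,
                "p_value": p_value,
                "ci_lower": theta - z_crit * std_error,
                "ci_upper": theta + z_crit * std_error,
            }
        )
    return rows
```

Each returned row mirrors the columns listed under Returns for gate_esimand, so the sketch can serve as a sanity check on small inputs.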

causalkit.inference.gate.gate_esimand.gate_esimand(data, groups=None, n_groups=5, ml_g=None, ml_m=None, n_folds=5, n_rep=1, confidence_level=0.95)

Estimate Group Average Treatment Effects (GATEs) by grouping observations using CATE-based quantiles unless custom groups are provided.

Parameters:

data (CausalData) – The (possibly filtered) CausalData object. Filtering should be done by subsetting data.df before constructing CausalData, or by preparing a filtered CausalData instance.
groups (pd.Series or pd.DataFrame, optional) – Group assignments per observation. If a Series is passed, it will be used as a single column named ‘q’. If a DataFrame, it should contain a single column specifying groups. If None, groups are formed by pd.qcut over the in-sample CATEs into n_groups quantiles labeled 0..n_groups-1.
n_groups (int, default 5) – Number of quantile groups if groups is None.
ml_g (estimator, optional) – Learner for the outcome regression (as in ATE/ATT).
ml_m (classifier, optional) – Learner for the propensity score (as in ATE/ATT).
n_folds (int, default 5) – Number of folds for cross-fitting (as in ATE/ATT).
n_rep (int, default 1) – Number of repetitions for sample splitting (as in ATE/ATT).
confidence_level (float, default 0.95) – Confidence level for two-sided normal-approximation intervals.

Returns:

A DataFrame with columns:

group: group label
n: group size
theta: estimated group average treatment effect
std_error: standard error (normal approx if fallback path)
p_value: two-sided p-value for H0: theta=0
ci_lower, ci_upper: confidence interval bounds

Return type:

pd.DataFrame
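The default grouping step (quantile bins over in-sample CATEs, labeled 0..n_groups-1 and stored in a column named 'q') can be sketched with pandas as follows. The CATE values here are made up for illustration, and this is only an approximation of what gate_esimand does internally.

```python
import pandas as pd

# Made-up per-observation CATE estimates, for illustration only.
cate = pd.Series(
    [0.1, 0.4, -0.2, 0.9, 0.3, 0.7, 0.0, 0.5, 0.2, 0.8], name="cate"
)
n_groups = 5

# labels=False makes qcut return integer bin codes 0..n_groups-1
# instead of interval labels.
groups = pd.qcut(cate, q=n_groups, labels=False).rename("q")

# Group sizes and mean CATE per quantile bin.
summary = cate.groupby(groups).agg(["size", "mean"])
print(summary)
```

A Series built this way has the shape the groups parameter expects, so custom groupings (for example by a business segment) can be prepared in the same form.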