Inference Module

The causalkit.inference package provides statistical inference tools for causal analysis across several estimands:

  • ATT: Average Treatment effect on the Treated

  • ATE: Average Treatment Effect

  • CATE: Conditional Average Treatment Effect (per-observation signals)

  • GATE: Grouped Average Treatment Effects

Overview

At a glance:

  • Simple tests for A/B outcomes (t-test, two-proportion z-test)

  • DoubleML-based estimators for ATE and ATT

  • DoubleML-based CATE signals and GATE grouping/intervals

API Reference (Package)

ttest

Perform a t-test on a CausalData object to compare the outcome variable between treated (T=1) and control (T=0) groups.

conversion_z_test

Perform a two-proportion z-test on a CausalData object with a binary outcome (conversion).

bootstrap_diff_means

Bootstrap inference for difference in means between treated (T=1) and control (T=0).

dml

Estimate average treatment effects using DoubleML's interactive regression model (IRM).

causalforestdml

Estimate average treatment effects using EconML's CausalForestDML.

dml_att

Estimate average treatment effects on the treated using DoubleML's interactive regression model (IRM).

cate_esimand

Estimate per-observation CATEs using DoubleML IRM and return a DataFrame with a new 'cate' column.

gate_esimand

Estimate Group Average Treatment Effects (GATEs) by grouping observations using CATE-based quantiles unless custom groups are provided.
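The bootstrap_diff_means entry above has no detailed reference on this page, so the following is only a minimal sketch of the general technique it names: a percentile bootstrap for the difference in means. The helper name percentile_bootstrap_diff, its parameters (n_boot, seed), and the toy data are assumptions for illustration and are not part of causalkit.

```python
import random
from statistics import mean


def percentile_bootstrap_diff(treated, control, n_boot=2000,
                              confidence_level=0.95, seed=0):
    """Percentile-bootstrap CI for mean(treated) - mean(control).

    Hypothetical helper, not the causalkit implementation.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping group sizes fixed.
        t_sample = rng.choices(treated, k=len(treated))
        c_sample = rng.choices(control, k=len(control))
        diffs.append(mean(t_sample) - mean(c_sample))
    diffs.sort()
    alpha = 1.0 - confidence_level
    lower = diffs[int((alpha / 2) * (n_boot - 1))]
    upper = diffs[int((1 - alpha / 2) * (n_boot - 1))]
    return {
        "absolute_difference": mean(treated) - mean(control),
        "absolute_ci": (lower, upper),
    }


# Toy outcomes (made up for illustration).
treated = [11.2, 12.5, 11.9, 12.1, 13.0, 11.7, 12.4, 12.8]
control = [10.1, 9.8, 10.5, 10.0, 9.6, 10.3, 10.2, 9.9]
boot_result = percentile_bootstrap_diff(treated, control)
print(boot_result)
```

Resampling each group separately keeps the treated and control sample sizes fixed, which matches a design where group membership is not itself random in the resampling step.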

ATT Utilities

T-test inference for CausalData objects (ATT context).

causalkit.inference.att.ttest.ttest(data, confidence_level=0.95)

Perform a t-test on a CausalData object to compare the outcome variable between treated (T=1) and control (T=0) groups. Returns differences and confidence intervals.

Parameters:
  • data (CausalData) – The CausalData object containing treatment and outcome variables.

  • confidence_level (float, default 0.95) – The confidence level for calculating confidence intervals (between 0 and 1).

Returns:

A dictionary containing:
  • p_value: The p-value from the t-test

  • absolute_difference: The absolute difference between treatment and control means

  • absolute_ci: Tuple of (lower, upper) bounds for the absolute difference confidence interval

  • relative_difference: The relative difference (percentage change) between treatment and control means

  • relative_ci: Tuple of (lower, upper) bounds for the relative difference confidence interval

Return type:

Dict[str, Any]

Raises:

ValueError – If the CausalData object doesn’t have both treatment and outcome variables defined, or if the treatment variable is not binary.
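To make the returned quantities concrete, here is a rough standard-library sketch of a difference-in-means comparison. It is not the causalkit implementation: it uses a normal quantile in place of the t quantile (the two nearly coincide for large samples), uses the unequal-variance (Welch) standard error, and assumes relative_difference means a percentage of the control mean. The helper name diff_in_means and the toy data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def diff_in_means(treated, control, confidence_level=0.95):
    """Difference in means with a large-sample (normal) approximation.

    Hypothetical helper; the library's ttest may differ in details.
    """
    diff = mean(treated) - mean(control)
    # Welch-style standard error: variances are not pooled.
    se = sqrt(variance(treated) / len(treated)
              + variance(control) / len(control))
    z_crit = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(diff / se)))
    return {
        "p_value": p_value,
        "absolute_difference": diff,
        "absolute_ci": (diff - z_crit * se, diff + z_crit * se),
        # Assumed definition: percentage change relative to the control mean.
        "relative_difference": 100 * diff / mean(control),
    }


# Toy outcomes (made up for illustration).
treated = [11.2, 12.5, 11.9, 12.1, 13.0, 11.7, 12.4, 12.8]
control = [10.1, 9.8, 10.5, 10.0, 9.6, 10.3, 10.2, 9.9]
ttest_sketch = diff_in_means(treated, control)
print(ttest_sketch)
```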

Two-proportion z-test for conversion data in CausalData (ATT context).

Compares conversion rates between treated (T=1) and control (T=0) groups. Returns p-value, absolute/relative differences, and their confidence intervals (similar structure to inference.att.ttest).

causalkit.inference.att.conversion_z_test.conversion_z_test(data, confidence_level=0.95)

Perform a two-proportion z-test on a CausalData object with a binary outcome (conversion).

Parameters:
  • data (CausalData) – The CausalData object containing treatment and outcome variables.

  • confidence_level (float, default 0.95) – The confidence level for calculating confidence intervals (between 0 and 1).

Returns:

A dictionary containing:
  • p_value: Two-sided p-value from the z-test

  • absolute_difference: Difference in conversion rates (treated - control)

  • absolute_ci: Tuple (lower, upper) for the absolute difference CI

  • relative_difference: Percentage change relative to control rate

  • relative_ci: Tuple (lower, upper) for the relative difference CI

Return type:

Dict[str, Any]

Raises:

ValueError – If treatment/outcome are missing, treatment is not binary, outcome is not binary, groups are empty, or confidence_level is outside (0, 1).
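The test described above can be sketched from raw counts with the standard library. This is an illustration under stated assumptions, not the library's code: the p-value uses the pooled standard error, the absolute CI uses the unpooled standard error, and the relative CI simply rescales the absolute CI by the control rate (treating that rate as fixed). The function name two_proportion_z_test, its count-based arguments, and the example counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_t, n_t, conv_c, n_c, confidence_level=0.95):
    """Two-proportion z-test from conversion counts (hypothetical helper)."""
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be in (0, 1)")
    if n_t == 0 or n_c == 0:
        raise ValueError("both groups must be non-empty")

    p_t, p_c = conv_t / n_t, conv_c / n_c
    diff = p_t - p_c

    # Pooled standard error under H0: equal conversion rates.
    pooled = (conv_t + conv_c) / (n_t + n_c)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n_t + 1 / n_c))
    p_value = 2 * (1 - NormalDist().cdf(abs(diff / se_pooled)))

    # Unpooled standard error for the confidence interval.
    se = sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    z_crit = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    abs_ci = (diff - z_crit * se, diff + z_crit * se)

    return {
        "p_value": p_value,
        "absolute_difference": diff,
        "absolute_ci": abs_ci,
        "relative_difference": 100 * diff / p_c,
        # Assumption: control rate treated as fixed when rescaling the CI.
        "relative_ci": (100 * abs_ci[0] / p_c, 100 * abs_ci[1] / p_c),
    }


# Made-up counts: 120/1000 conversions in treatment, 100/1000 in control.
z_result = two_proportion_z_test(120, 1000, 100, 1000)
print(z_result)
```

The sketch does not guard against a zero control rate or a degenerate pooled rate (0 or 1), both of which would cause a division by zero.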

DoubleML implementation for estimating average treatment effects on the treated.

This module provides a function to estimate average treatment effects on the treated using DoubleML.

causalkit.inference.att.dml_att.dml_att(data, ml_g=None, ml_m=None, n_folds=5, n_rep=1, confidence_level=0.95)

Estimate average treatment effects on the treated using DoubleML’s interactive regression model (IRM).

Parameters:
  • data (CausalData) – The CausalData object containing the treatment, outcome, and confounder variables.

  • ml_g (estimator, optional) – A machine learner implementing fit() and predict() methods for the nuisance function g_0(D,X) = E[Y|X,D]. If None, a CatBoostRegressor configured to use all CPU cores is used.

  • ml_m (classifier, optional) – A machine learner implementing fit() and predict_proba() methods for the nuisance function m_0(X) = E[D|X]. If None, a CatBoostClassifier configured to use all CPU cores is used.

  • n_folds (int, default 5) – Number of folds for cross-fitting.

  • n_rep (int, default 1) – Number of repetitions for the sample splitting.

  • confidence_level (float, default 0.95) – The confidence level for calculating confidence intervals (between 0 and 1).

Returns:

A dictionary containing:
  • coefficient: The estimated average treatment effect on the treated

  • std_error: The standard error of the estimate

  • p_value: The p-value for the null hypothesis that the effect is zero

  • confidence_interval: Tuple of (lower, upper) bounds for the confidence interval

  • model: The fitted DoubleMLIRM object

Return type:

Dict[str, Any]

Raises:

ValueError – If the CausalData object does not have treatment, outcome, and confounder variables defined, or if the treatment variable is not binary.

Examples

>>> from causalkit.data import generate_rct_data
>>> from causalkit.data import CausalData
>>> from causalkit.inference.att import dml_att
>>>
>>> # Generate data
>>> df = generate_rct_data()
>>>
>>> # Create causaldata object
>>> ck = CausalData(
...     df=df,
...     outcome='outcome',
...     treatment='treatment',
...     confounders=['age', 'invited_friend']
... )
>>>
>>> # Estimate ATT using DoubleML
>>> results = dml_att(ck)
>>> print(f"ATT: {results['coefficient']:.4f}")
>>> print(f"Standard Error: {results['std_error']:.4f}")
>>> print(f"P-value: {results['p_value']:.4f}")
>>> print(f"Confidence Interval: {results['confidence_interval']}")

ATE Utilities

DoubleML implementation for estimating average treatment effects.

This module provides a function to estimate average treatment effects using DoubleML.

causalkit.inference.ate.dml_ate.dml_ate(data, ml_g=None, ml_m=None, n_folds=5, n_rep=1, score='ATE', confidence_level=0.95)

Estimate average treatment effects using DoubleML’s interactive regression model (IRM).

Parameters:
  • data (CausalData) – The CausalData object containing the treatment, outcome, and confounder variables.

  • ml_g (estimator, optional) – A machine learner implementing fit() and predict() methods for the nuisance function g_0(D,X) = E[Y|X,D]. If None, a CatBoostRegressor configured to use all CPU cores is used.

  • ml_m (classifier, optional) – A machine learner implementing fit() and predict_proba() methods for the nuisance function m_0(X) = E[D|X]. If None, a CatBoostClassifier configured to use all CPU cores is used.

  • n_folds (int, default 5) – Number of folds for cross-fitting.

  • n_rep (int, default 1) – Number of repetitions for the sample splitting.

  • score (str, default "ATE") – The score function to use: “ATE” for the average treatment effect or “ATTE” for the average treatment effect on the treated.

  • confidence_level (float, default 0.95) – The confidence level for calculating confidence intervals (between 0 and 1).

Returns:

A dictionary containing:
  • coefficient: The estimated average treatment effect

  • std_error: The standard error of the estimate

  • p_value: The p-value for the null hypothesis that the effect is zero

  • confidence_interval: Tuple of (lower, upper) bounds for the confidence interval

  • model: The fitted DoubleMLIRM object

Return type:

Dict[str, Any]

Raises:

ValueError – If the CausalData object does not have treatment, outcome, and confounder variables defined, or if the treatment variable is not binary.

Examples

>>> from causalkit.data import generate_rct_data
>>> from causalkit.data import CausalData
>>> from causalkit.inference.ate import dml_ate
>>>
>>> # Generate data
>>> df = generate_rct_data()
>>>
>>> # Create causaldata object
>>> ck = CausalData(
...     df=df,
...     outcome='outcome',
...     treatment='treatment',
...     confounders=['age', 'invited_friend']
... )
>>>
>>> # Estimate ATE using DoubleML
>>> results = dml_ate(ck)
>>> print(f"ATE: {results['coefficient']:.4f}")
>>> print(f"Standard Error: {results['std_error']:.4f}")
>>> print(f"P-value: {results['p_value']:.4f}")
>>> print(f"Confidence Interval: {results['confidence_interval']}")
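The coefficient, std_error, p_value, and confidence_interval entries can be understood through the doubly robust score that an interactive regression model averages when score is "ATE". The sketch below assumes the nuisance predictions g(0, X), g(1, X), and m(X) are already available; in DoubleML they come from cross-fitting, which is omitted here. The helper names aipw_signals and summarize_ate and all numbers are hypothetical, and the standard error uses the sample standard deviation, which may differ slightly from the library's variance estimator.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def aipw_signals(y, d, g0, g1, m):
    """Per-observation doubly robust (AIPW) signals for the ATE score."""
    return [
        g1_i - g0_i
        + d_i * (y_i - g1_i) / m_i
        - (1 - d_i) * (y_i - g0_i) / (1 - m_i)
        for y_i, d_i, g0_i, g1_i, m_i in zip(y, d, g0, g1, m)
    ]


def summarize_ate(signals, confidence_level=0.95):
    """Mean of the signals with a normal-approximation interval."""
    theta = mean(signals)
    se = stdev(signals) / sqrt(len(signals))
    z_crit = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    return {
        "coefficient": theta,
        "std_error": se,
        "p_value": 2 * (1 - NormalDist().cdf(abs(theta / se))),
        "confidence_interval": (theta - z_crit * se, theta + z_crit * se),
    }


# Made-up outcomes, treatments, and nuisance predictions.
y = [3.0, 1.0, 4.0, 2.0, 5.0, 1.5]
d = [1, 0, 1, 0, 1, 0]
g0 = [1.0, 1.2, 2.0, 1.8, 2.5, 1.4]   # predicted outcome under control
g1 = [3.2, 3.0, 4.1, 3.9, 4.6, 3.5]   # predicted outcome under treatment
m = [0.5] * 6                          # predicted propensity score

signals = aipw_signals(y, d, g0, g1, m)
ate_sketch = summarize_ate(signals)
print(ate_sketch)
```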

CATE Utilities

DoubleML implementation for estimating CATE (per-observation orthogonal signals).

This module provides a function that, given a CausalData object, fits a DoubleML IRM model and augments the data with a new column ‘cate’ that contains the orthogonal signals (an estimate of the conditional average treatment effect for each unit).

causalkit.inference.cate.cate_esimand.cate_esimand(data, ml_g=None, ml_m=None, n_folds=5, n_rep=1, use_blp=False, X_new=None)

Estimate per-observation CATEs using DoubleML IRM and return a DataFrame with a new ‘cate’ column.

Parameters:
  • data (CausalData) – A CausalData object with defined outcome (outcome), treatment (binary 0/1), and confounders.

  • ml_g (estimator, optional) – ML learner for outcome regression g(D, X) = E[Y | D, X] supporting fit/predict. Defaults to CatBoostRegressor if None.

  • ml_m (classifier, optional) – ML learner for propensity m(X) = P[D=1 | X] supporting fit/predict_proba. Defaults to CatBoostClassifier if None.

  • n_folds (int, default 5) – Number of folds for cross-fitting.

  • n_rep (int, default 1) – Number of repetitions for sample splitting.

  • use_blp (bool, default False) – If True, and X_new is provided, returns cate from obj.blp_predict(X_new) aligned to X_new. If False (default), uses obj._orthogonal_signals (in-sample estimates) and appends to data.

  • X_new (pd.DataFrame, optional) – New covariate matrix for out-of-sample CATE prediction via best linear predictor. Must contain the same feature columns as the confounders in data.

Returns:

A DataFrame whose contents depend on use_blp:
  • use_blp is False: a copy of data.df with a new column ‘cate’

  • use_blp is True and X_new is provided: a DataFrame with a ‘cate’ column for the rows of X_new

Return type:

pd.DataFrame

Raises:

ValueError – If treatment is not binary 0/1 or required metadata is missing.
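For reference, the orthogonal signal of an interactive regression model with the ATE score is usually written as below. This assumes the ‘cate’ column holds that standard doubly robust signal; the page does not spell out the formula, so treat it as a sketch in the notation of the parameter descriptions (g for the outcome regression, m for the propensity).

```latex
\psi_i \;=\; \hat g(1, X_i) - \hat g(0, X_i)
\;+\; \frac{D_i \,\bigl(Y_i - \hat g(1, X_i)\bigr)}{\hat m(X_i)}
\;-\; \frac{(1 - D_i)\,\bigl(Y_i - \hat g(0, X_i)\bigr)}{1 - \hat m(X_i)}
```

Averaging \psi_i over all observations gives the ATE estimate, and averaging within a group gives that group's effect.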

GATE Utilities

Group Average Treatment Effect (GATE) estimation using DoubleML orthogonal signals.

This module provides a function that, given a (possibly filtered) CausalData object, fits a DoubleML IRM model, computes per-observation CATEs (orthogonal signals), forms groups (by default CATE quintiles), and returns group-level estimates (theta), standard errors, p-values, and confidence intervals.

It prefers DoubleML’s native gate() and confint() methods if available; otherwise it falls back to a simple normal approximation using the group mean of the orthogonal signals and its standard error.

causalkit.inference.gate.gate_esimand.gate_esimand(data, groups=None, n_groups=5, ml_g=None, ml_m=None, n_folds=5, n_rep=1, confidence_level=0.95)

Estimate Group Average Treatment Effects (GATEs) by grouping observations using CATE-based quantiles unless custom groups are provided.

Parameters:
  • data (CausalData) – The (possibly filtered) CausalData object. Filtering should be done by subsetting data.df before constructing CausalData, or by preparing a filtered CausalData instance.

  • groups (pd.Series or pd.DataFrame, optional) – Group assignments per observation. If a Series is passed, it will be used as a single column named ‘q’. If a DataFrame, it should contain a single column specifying groups. If None, groups are formed by pd.qcut over the in-sample CATEs into n_groups quantiles labeled 0..n_groups-1.

  • n_groups (int, default 5) – Number of quantile groups if groups is None.

  • ml_g (Optional[Any], default None) – Learner for the outcome regression, as in the ATE/ATT estimators.

  • ml_m (Optional[Any], default None) – Learner for the propensity score, as in the ATE/ATT estimators.

  • n_folds (int, default 5) – Number of folds for cross-fitting.

  • n_rep (int, default 1) – Number of repetitions for sample splitting.

  • confidence_level (float, default 0.95) – Confidence level for two-sided normal-approximation intervals.

Returns:

A DataFrame with columns:
  • group: group label

  • n: group size

  • theta: estimated group average treatment effect

  • std_error: standard error (normal approx if fallback path)

  • p_value: two-sided p-value for H0: theta=0

  • ci_lower, ci_upper: confidence interval bounds

Return type:

pd.DataFrame
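The fallback path described above (group mean of the orthogonal signals with a normal approximation) can be sketched as follows. This is an illustration, not the library's code: the signals are taken as given, quantile_groups is a simple rank-based stand-in for pd.qcut that handles ties differently, and both helper names and the toy signals are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def quantile_groups(values, n_groups=5):
    """Rank-based bins labeled 0..n_groups-1 (rough stand-in for pd.qcut)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * n_groups // len(values)
    return labels


def gate_table(signals, groups, confidence_level=0.95):
    """Group means of signals with normal-approximation inference.

    Each group needs at least two observations with non-zero spread.
    """
    z_crit = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    rows = []
    for g in sorted(set(groups)):
        vals = [s for s, label in zip(signals, groups) if label == g]
        theta = mean(vals)
        se = stdev(vals) / sqrt(len(vals))
        rows.append({
            "group": g,
            "n": len(vals),
            "theta": theta,
            "std_error": se,
            "p_value": 2 * (1 - NormalDist().cdf(abs(theta / se))),
            "ci_lower": theta - z_crit * se,
            "ci_upper": theta + z_crit * se,
        })
    return rows


# Made-up per-observation signals, split into two quantile groups.
toy_signals = [0.2, 1.5, 0.9, 2.1, 0.4, 1.8, 0.1, 2.4, 1.1, 0.7]
toy_groups = quantile_groups(toy_signals, n_groups=2)
gate_rows = gate_table(toy_signals, toy_groups)
for row in gate_rows:
    print(row)
```

Each row mirrors the columns listed in the Returns section, so the output can be compared field by field with the DataFrame the function returns.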