Analysis Module#
The causalkit.analysis module provides statistical analysis tools for causal inference.
Overview#
This module includes functions for:
Performing t-tests on CausalData objects to compare target variables between treatment groups
Calculating p-values, absolute differences, and relative differences with confidence intervals
T-Test Analysis#
The ttest function performs a t-test on a CausalData object to compare the target variable between treatment groups. This is particularly useful for analyzing the results of A/B tests or randomized controlled trials (RCTs).
Key Features#
Compares means between treatment and control groups
Calculates p-values to determine statistical significance
Provides absolute difference between group means with confidence intervals
Calculates relative difference (percentage change) with confidence intervals
Supports customizable confidence levels
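As a rough illustration of the quantities listed above, the following standalone sketch computes them from two plain lists of numbers. It is not causalkit's implementation: the function name compare_means is hypothetical, and the sketch swaps the exact t distribution for a normal approximation with an unpooled (Welch-style) standard error, which is only reasonable for large samples. The relative interval is obtained by rescaling the absolute interval by the control mean, a simplifying assumption that ignores uncertainty in that mean and presumes it is positive. The dictionary keys mirror those used in the usage example purely for readability.

```python
import math
import random
from statistics import NormalDist, mean, variance


def compare_means(control, treatment, confidence_level=0.95):
    """Large-sample comparison of two group means (normal approximation).

    Hypothetical helper for illustration only; not part of causalkit.
    """
    n_c, n_t = len(control), len(treatment)
    mean_c, mean_t = mean(control), mean(treatment)

    # Standard error of the difference, without assuming equal variances.
    se = math.sqrt(variance(control) / n_c + variance(treatment) / n_t)

    diff = mean_t - mean_c
    z = diff / se

    # Two-sided p-value from the standard normal distribution.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))

    # Critical value for the requested two-sided confidence level.
    z_crit = NormalDist().inv_cdf(1.0 - (1.0 - confidence_level) / 2.0)
    abs_ci = (diff - z_crit * se, diff + z_crit * se)

    # Relative quantities in percent of the control mean (assumed positive).
    rel_diff = diff / mean_c * 100.0
    rel_ci = tuple(bound / mean_c * 100.0 for bound in abs_ci)

    return {
        "p_value": p_value,
        "absolute_difference": diff,
        "absolute_ci": abs_ci,
        "relative_difference": rel_diff,
        "relative_ci": rel_ci,
    }


# Simulated data loosely mirroring the usage example (means 10.0 and 10.5, std 2.0).
random.seed(42)
control = [random.gauss(10.0, 2.0) for _ in range(5000)]
treatment = [random.gauss(10.5, 2.0) for _ in range(5000)]

result = compare_means(control, treatment, confidence_level=0.95)
print(f"P-value: {result['p_value']:.4f}")
print(f"Absolute difference: {result['absolute_difference']:.4f}")
print(f"Relative difference: {result['relative_difference']:.2f}%")
```

For small samples the normal approximation understates the uncertainty, so a proper t-based routine should be preferred there.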
When to Use T-Tests#
T-tests are appropriate when:
You have a binary treatment variable (e.g., control vs. treatment)
Your target variable is continuous or binary
You want to determine if there’s a statistically significant difference between groups
You need to quantify the magnitude of the effect with confidence intervals
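The conditions above can be checked before running a test. The sketch below is one possible pre-flight check on plain Python sequences; check_ttest_inputs is a hypothetical helper, not a causalkit function, and the specific checks (two groups, numeric non-missing outcomes, at least two observations per group) are assumptions about what a reasonable minimum looks like.

```python
import math
from collections import Counter


def check_ttest_inputs(treatment, outcome):
    """Return a list of problems; an empty list means the basic checks pass.

    Hypothetical helper for illustration only; not part of causalkit.
    """
    problems = []

    if len(treatment) != len(outcome):
        problems.append("treatment and outcome have different lengths")

    # A two-sample comparison needs exactly two treatment groups.
    group_sizes = Counter(treatment)
    if len(group_sizes) != 2:
        problems.append(f"expected 2 treatment groups, found {len(group_sizes)}")

    # Each group needs at least two observations to estimate a variance.
    for group, size in group_sizes.items():
        if size < 2:
            problems.append(f"group {group!r} has fewer than 2 observations")

    # Outcomes must be numeric (continuous or 0/1) and not missing.
    for value in outcome:
        if not isinstance(value, (int, float)) or math.isnan(value):
            problems.append("outcome contains non-numeric or missing values")
            break

    return problems


print(check_ttest_inputs(["A", "B", "A", "B"], [10.1, 10.7, 9.8, 11.2]))
```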
Example Usage#
from causalkit.data import generate_rct_data, CausalData
from causalkit.inference import ttest

# Generate sample RCT data
df = generate_rct_data(
    n_users=10000,
    split=0.5,
    target_type="normal",
    target_params={"mean": {"A": 10.0, "B": 10.5}, "std": 2.0},
    random_state=42,
)

# Create CausalData object
ck = CausalData(
    df=df,
    outcome="outcome",
    treatment="treatment",
)

# Perform t-test with 95% confidence level
results = ttest(ck, confidence_level=0.95)

# Print results
print(f"P-value: {results['p_value']:.4f}")
print(f"Absolute difference: {results['absolute_difference']:.4f}")
print(f"Absolute CI: {results['absolute_ci']}")
print(f"Relative difference: {results['relative_difference']:.2f}%")
print(f"Relative CI: {results['relative_ci']}")
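To give a feel for what changing confidence_level does, the sketch below shows how the interval half-width multiplier grows with the confidence level. It uses the standard normal distribution as a large-sample stand-in for the t distribution, so it is an approximation and not a description of how ttest computes its intervals; critical_value is a hypothetical helper name.

```python
from statistics import NormalDist


def critical_value(confidence_level):
    """Two-sided standard-normal critical value for a confidence level.

    Hypothetical helper for illustration only; not part of causalkit.
    """
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - (1.0 - confidence_level) / 2.0)


# Higher confidence levels widen the interval: half-width = multiplier * standard error.
for level in (0.90, 0.95, 0.99):
    print(f"confidence_level={level:.2f} -> multiplier {critical_value(level):.3f}")
```

Moving from 0.95 to 0.99 raises the multiplier from about 1.96 to about 2.58, so the same data yields an interval roughly 30% wider.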
Interpreting Results#
p_value: The probability of observing a difference at least as extreme as the one measured, assuming there is no true difference between groups. A small p-value (conventionally below 0.05) suggests that the observed difference is statistically significant.
absolute_difference: The raw difference between the treatment and control means.
absolute_ci: Confidence interval for the absolute difference at the chosen confidence level. If this interval does not include zero, the difference is statistically significant at the corresponding significance level (0.05 for a 95% interval).
relative_difference: The percentage change relative to the control group mean.
relative_ci: Confidence interval for the relative difference.
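The reading rules above can be turned into a small reporting helper. The sketch below assumes the result is a dictionary with the keys shown in the usage example and that each confidence interval is a (lower, upper) pair; the source does not confirm the interval's exact type, so treat that as an assumption. summarize_ttest is a hypothetical function, and the example values are hand-written placeholders, not real output from ttest.

```python
def summarize_ttest(results, alpha=0.05):
    """Build a short plain-text reading of a t-test result dictionary.

    Hypothetical helper for illustration only; not part of causalkit.
    Assumes each confidence interval is a (lower, upper) pair.
    """
    abs_lower, abs_upper = results["absolute_ci"]
    rel_lower, rel_upper = results["relative_ci"]

    # Rule 1: compare the p-value against the significance threshold.
    significant = results["p_value"] < alpha
    verdict = "statistically significant" if significant else "not statistically significant"

    # Rule 2: check whether the absolute interval excludes zero.
    excludes_zero = abs_lower > 0 or abs_upper < 0
    zero_note = "excludes zero" if excludes_zero else "includes zero"

    return "\n".join([
        f"Difference is {verdict} at alpha={alpha} (p={results['p_value']:.4f}).",
        f"Absolute difference: {results['absolute_difference']:.4f} "
        f"(CI {abs_lower:.4f} to {abs_upper:.4f}, {zero_note}).",
        f"Relative difference: {results['relative_difference']:.2f}% "
        f"(CI {rel_lower:.2f}% to {rel_upper:.2f}%).",
    ])


# Hand-written placeholder values, not real output from ttest.
example = {
    "p_value": 0.003,
    "absolute_difference": 0.5,
    "absolute_ci": (0.2, 0.8),
    "relative_difference": 5.0,
    "relative_ci": (2.0, 8.0),
}
print(summarize_ttest(example))
```

The two rules agree only when alpha matches the interval's confidence level (for example, alpha=0.05 with confidence_level=0.95).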
API Reference#
T-test inference for causaldata objects (ATT context).