Playing with Bayesian Causal Graphs.
The main references have so far been the following:
References
References for the dowhy package:
Material which helped my introductory reading so far:
- degeneratestate.org: Pt I - potential outcomes (contains definitions of ATE, ATT and ATC, among other things)
- degeneratestate.org: Pt II - causal graphs & the backdoor criterion
- this youtube video on backdoor paths
- Rubin et al. 2005 on causal inference and potential outcomes
- wiki article on the average treatment effect (ATE)
- this wiki article on propensity score matching
Brief list of concepts I found useful:
- at the center of interest is the calculation of potential outcomes / counterfactuals $\rightarrow$ what Y would have been if X had been different (a question asked after all observations have been made) $\rightarrow$ for example, we may want to know how the outcome differs between taking a certain drug and taking a placebo
- potential outcomes and counterfactuals can be seen as the same thing, see Rubin et al. 2005 (really more of a useful thing to know when reading the literature - man was I confused before knowing this)
- potential outcomes can be calculated without a graphical model, but graphical models help by indicating which variables can legitimately be conditioned on
- graphical models themselves require justification
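To make the ATE / ATT / ATC definitions above concrete, here is a minimal simulation sketch. It is not taken from the references; the confounder `z`, the effect size `2 + 0.5 z` and the logistic assignment rule are all made-up assumptions. In a simulation both potential outcomes are known for every unit, so the true effects can be compared with the naive difference of observed group means.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical confounder: influences both treatment uptake and the outcome.
z = rng.normal(size=n)

# Both potential outcomes per unit (only knowable in a simulation).
y0 = z + rng.normal(scale=0.1, size=n)  # outcome without treatment
y1 = y0 + 2.0 + 0.5 * z                 # outcome with treatment

# Units with larger z are more likely to be treated (assumed logistic rule).
t = rng.random(n) < 1.0 / (1.0 + np.exp(-z))

unit_effect = y1 - y0
ate = unit_effect.mean()       # average treatment effect, all units
att = unit_effect[t].mean()    # average effect on the treated
atc = unit_effect[~t].mean()   # average effect on the controls

# In real data only one potential outcome per unit is observed.
y_obs = np.where(t, y1, y0)
naive = y_obs[t].mean() - y_obs[~t].mean()

print(f"ATE={ate:.2f}  ATT={att:.2f}  ATC={atc:.2f}  naive={naive:.2f}")
```

Because `z` drives both the treatment and the outcome, the naive difference mixes the treatment effect with the baseline difference between the groups, which is the kind of bias that conditioning on the right variables is meant to remove.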
Technical note:
In order to run this notebook, you'll need a symbolic link from /notebooks/bcg to ./bcg.
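import pandas as pd
from scipy import stats
import dowhy as dw
# plot_target_vs_rest, plot_var_hists, show_correlations and GraphGenerator are
# helpers not defined in this section (presumably provided by the local bcg package)

n = 1000  # number of observations
a, b, c = 1.5, 1., 0  # parameters of Y = a * X0**b + c + X1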
target = 'Y'
obs = pd.DataFrame(columns=['X0', 'X1', target])
obs['X0'] = stats.uniform.rvs(loc=-1, scale=2, size=n)
# obs['X1'] = stats.norm.rvs(loc=0, scale=.1, size=n)
obs['X1'] = stats.uniform.rvs(loc=-.5, scale=1, size=n)
obs[target] = a * obs.X0 ** b + c + obs.X1
obs.head()
plot_target_vs_rest(obs)
plot_var_hists(obs)
show_correlations(obs)
gg = GraphGenerator(obs)
print(f'target var: {gg.target}, not target vars: {", ".join(gg.not_targets)}')
g = gg.get_only_Xi_to_Y()
gml = gg.get_gml(g)
gg.vis_g(g)
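The GraphGenerator helper is not shown here, so the following is only a sketch of what a GML string for this setup might look like, assuming that get_only_Xi_to_Y yields a graph in which every non-target variable points directly at Y (as its name suggests). The function `edges_to_gml` is hypothetical and not part of dowhy or of the notebook's helpers; the string layout follows the GML style used in DoWhy's documentation examples.

```python
def edges_to_gml(edges):
    """Build a GML string for a directed graph from (source, target) pairs."""
    # Collect node names in order of first appearance.
    nodes = []
    for src, dst in edges:
        for name in (src, dst):
            if name not in nodes:
                nodes.append(name)
    node_part = " ".join(f'node [ id "{name}" label "{name}" ]' for name in nodes)
    edge_part = " ".join(f'edge [ source "{s}" target "{d}" ]' for s, d in edges)
    return f"graph [ directed 1 {node_part} {edge_part} ]"


# Assumed structure: X0 -> Y and X1 -> Y, no edge between X0 and X1.
gml_sketch = edges_to_gml([("X0", "Y"), ("X1", "Y")])
print(gml_sketch)
```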
treatment = ['X0', ] # 'X1'
causal_model = dw.CausalModel(data=obs, treatment=treatment,
                              outcome=target, graph=gml)
Note how CausalModel added an unobserved confounder variable U.
causal_model.view_model()
identified_estimand = causal_model.identify_effect(proceed_when_unidentifiable=True)
print(identified_estimand)
method_name = 'backdoor.linear_regression'
effect_kwargs = dict(
    method_name=method_name,
    control_value=0,
    treatment_value=1,
    target_units='ate',
    test_significance=True,
)
causal_estimate = causal_model.estimate_effect(identified_estimand,
                                               **effect_kwargs)
print(causal_estimate)
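Since b = 1, the data-generating process above is linear, Y = a * X0 + c + X1, so an estimate close to a = 1.5 would be expected when moving X0 from 0 to 1. The sketch below reproduces that reasoning with plain NumPy. It is not DoWhy's internal implementation, and it assumes the backdoor adjustment set is empty (X1 is not a common cause of X0 and Y in this graph), so the regression only contains X0.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000
a, c = 1.5, 0.0

# Same distributions as in the notebook: X0 ~ U(-1, 1), X1 ~ U(-0.5, 0.5).
x0 = rng.uniform(-1.0, 1.0, size=n)
x1 = rng.uniform(-0.5, 0.5, size=n)
y = a * x0 + c + x1

# Ordinary least squares of Y on an intercept and X0.
design = np.column_stack([np.ones(n), x0])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
intercept, slope = coef

# For a linear model the effect of moving the treatment from the control
# value to the treatment value is the slope times that difference.
control_value, treatment_value = 0, 1
effect = slope * (treatment_value - control_value)
print(f"estimated effect: {effect:.3f} (true value: {a})")
```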
method_name = 'placebo_treatment_refuter'
refute_kwargs = dict(
    method_name=method_name,
    placebo_type="permute",  # relevant for placebo refutation
)
refute_res = causal_model.refute_estimate(identified_estimand,
                                          causal_estimate,
                                          **refute_kwargs)
print(refute_res)
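The idea behind a placebo refutation with permutation can be sketched without DoWhy: shuffle the treatment column so that it can no longer have any relation to the outcome, re-estimate, and check that the "effect" collapses towards zero. The code below is a hand-rolled illustration under that assumption, with a hypothetical helper `ols_slope` and an arbitrary choice of 100 permutations; it is not the library's refuter.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1000

# Same data-generating process as in the notebook (a=1.5, b=1, c=0).
x0 = rng.uniform(-1.0, 1.0, size=n)
x1 = rng.uniform(-0.5, 0.5, size=n)
y = 1.5 * x0 + x1


def ols_slope(x, y):
    """Slope of an ordinary least squares fit of y on x with an intercept."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


original = ols_slope(x0, y)

# Placebo treatment: a random permutation of X0 keeps its distribution
# but destroys its link to Y.
placebo_slopes = np.array(
    [ols_slope(rng.permutation(x0), y) for _ in range(100)]
)

print(f"original estimate:     {original:.3f}")
print(f"mean placebo estimate: {placebo_slopes.mean():.3f}")
```

If the placebo estimates were not centred near zero, that would hint that the original estimate picks up something other than the effect of the treatment.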