Gadget–Beeps
Algorithms for scalable Bayesian learning of causal DAGs.
- Gadget (Generating Acyclic DiGraphs Efficiently from Target) for scalable sampling of directed acyclic graphs.
- Beeps (Bayesian Estimation of Effect Posterior by Sampling) for Bayesian estimation of linear causal effects.
Developed at the Sums of Products research group at the University of Helsinki. Originally published at NeurIPS 2020 [1].
News
2021-05-22. The implementation of the Beeps algorithm contained some bugs, which are fixed in release 0.1.2 of Sumu. The bugs did not affect the results in the NeurIPS publication [1], as those relied on a separate R implementation. This page only serves to provide both algorithms as presented in the paper; for an up-to-date development version, see the Sumu repository.
Installation and use
Both algorithms are implemented in Sumu. After installing Sumu version 0.1.2 with the command pip install sumu==0.1.2 (following its installation instructions), you can run the algorithms from the command line with gadget-beeps.py, for which the -h flag prints help:
$ python gadget-beeps.py -h
usage: gadget-beeps.py [-h]
                       [-c {opt,top,pc,mb,ges,greedy,greedy-lite,back-forth}]
                       [-s {bdeu,bge}] [-e ESS] [-m MAX_ID] [-d D]
                       [-b BURN_IN] [-i ITERATIONS] [-n NTH] [-nc N_CHAINS]
                       [-r RANDOMSEED] [-o OUTPUT_PREFIX]
                       datapath K
DESCRIPTION
Input
─────
A path to a space separated file of either discrete or continuous
data. No header rows for variable names or arities (in the discrete
case) are assumed. Discrete data is assumed to be integer encoded;
continuous data uses "." as decimal separator.
The data path argument should be followed by the number K of candidate
parents to use for each node, and additional optional arguments as
explained in this help.
Output
──────
Files for:
• Candidate parents found with the selected algorithm.
• Gadget sampled DAGs.
• Beeps estimated causal effects (if ran on continuous data).
Example run
───────────
$ python gadget-beeps.py cont_data.csv 10 -s bge
References
──────────
[1] Jussi Viinikka, Antti Hyttinen, Johan Pensar, and Mikko
Koivisto. Towards Scalable Bayesian Learning of Causal DAGs. In
NeurIPS 2020, in press.
positional arguments:
  datapath              path to data file
  K                     how many candidate parents to include

optional arguments:
  -h, --help            show this help message and exit
  -c {opt,top,pc,mb,ges,greedy,greedy-lite,back-forth}, --candidate-parent-algorithm {opt,top,pc,mb,ges,greedy,greedy-lite,back-forth}
                        candidate algorithm to use (default: greedy-lite)
  -s {bdeu,bge}, --score {bdeu,bge}
                        score function to use
  -e ESS, --ess ESS     equivalent sample size for BDeu
  -m MAX_ID, --max-id MAX_ID
                        maximum indegree for scores (default: no max-indegree)
  -d D                  maximum indegree for psets which are not subsets of
                        candidates (default: 2)
  -b BURN_IN, --burn-in BURN_IN
                        number of burn-in samples (default: 1000)
  -i ITERATIONS, --iterations ITERATIONS
                        number of iterations after burn-in (default: 1000)
  -n NTH, --nth NTH     sample dag every nth iteration (default: 10)
  -nc N_CHAINS, --n-chains N_CHAINS
                        number of Metropolis coupled MCMC chains (default: 16)
  -r RANDOMSEED, --randomseed RANDOMSEED
                        random seed
  -o OUTPUT_PREFIX, --output-prefix OUTPUT_PREFIX
                        path prefix for output files (default: input file
                        path)
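
The input format described in the help text (space separated, no header row, "." as decimal separator) can be produced with a few lines of Python. The following is a minimal sketch, not part of the repository: the toy data, its dimensions, the choice K = 4, and the seed value are all illustrative assumptions. It writes a continuous data file and builds a command line equivalent to the example run above, using only flags listed in the help output.

```python
import os
import random
import sys
import tempfile

# Hypothetical toy data: 200 rows of 8 independent Gaussian variables.
# Real data would of course come from your own source.
random.seed(0)
n_rows, n_vars = 200, 8
rows = [[random.gauss(0.0, 1.0) for _ in range(n_vars)] for _ in range(n_rows)]

# Write the file in the format the help text describes:
# space separated, no header row, "." as decimal separator.
datapath = os.path.join(tempfile.mkdtemp(), "cont_data.csv")
with open(datapath, "w") as f:
    for row in rows:
        f.write(" ".join(f"{x:.6f}" for x in row) + "\n")

# K is the number of candidate parents per node; 4 is an arbitrary choice
# for this sketch, kept below the number of variables.
K = 4

# Command line using only flags from the help output:
# -s bge selects the BGe score, -r sets the random seed.
cmd = [sys.executable, "gadget-beeps.py", datapath, str(K), "-s", "bge", "-r", "1"]
print(" ".join(cmd))

# To actually run it (assumes gadget-beeps.py is in the current directory
# and Sumu 0.1.2 is installed), uncomment:
# import subprocess
# subprocess.run(cmd, check=True)
```

Since no -o flag is given, the output files would use the input file path as their prefix, as stated in the help text.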