Gadget–Beeps
Algorithms for scalable Bayesian learning of causal DAGs.
- Gadget (Generating Acyclic DiGraphs Efficiently from Target) for scalable sampling of directed acyclic graphs.
- Beeps (Bayesian Estimation of Effect Posterior by Sampling) for Bayesian estimation of linear causal effects.
Developed at the Sums of Products research group at the University of Helsinki. Originally published at NeurIPS 2020 [1].
News
2021-05-22. The implementation of the Beeps algorithm contained some bugs, which are corrected in release version 0.1.2 of Sumu. The bugs did not affect the results in the NeurIPS publication [1], as those relied on a separate R implementation. This page only serves to provide both algorithms as presented in the paper; for an up-to-date development version, see the Sumu repository.
Installation and use
Both algorithms are implemented in Sumu. After installing Sumu version 0.1.2 with the command pip install sumu==0.1.2 (following its installation instructions), you can run the algorithms from the command line with gadget-beeps.py, for which the -h flag prints help:
    $ python gadget-beeps.py -h
    usage: gadget-beeps.py [-h]
                           [-c {opt,top,pc,mb,ges,greedy,greedy-lite,back-forth}]
                           [-s {bdeu,bge}] [-e ESS] [-m MAX_ID] [-d D]
                           [-b BURN_IN] [-i ITERATIONS] [-n NTH] [-nc N_CHAINS]
                           [-r RANDOMSEED] [-o OUTPUT_PREFIX]
                           datapath K

    DESCRIPTION

    Input
    ─────
    A path to a space separated file of either discrete or continuous data.
    No header rows for variable names or arities (in the discrete case) are
    assumed. Discrete data is assumed to be integer encoded; continuous data
    uses "." as decimal separator.

    The data path argument should be followed by the number K of candidate
    parents to use for each node, and additional optional arguments as
    explained in this help.

    Output
    ──────
    Files for:
      • Candidate parents found with the selected algorithm.
      • Gadget sampled DAGs.
      • Beeps estimated causal effects (if ran on continuous data).

    Example run
    ───────────
    $ python gadget-beeps.py cont_data.csv 10 -s bge

    References
    ──────────
    [1] Jussi Viinikka, Antti Hyttinen, Johan Pensar, and Mikko Koivisto.
        Towards Scalable Bayesian Learning of Causal DAGs.
        In NeurIPS 2020, in press.

    positional arguments:
      datapath              path to data file
      K                     how many candidate parents to include

    optional arguments:
      -h, --help            show this help message and exit
      -c {opt,top,pc,mb,ges,greedy,greedy-lite,back-forth}, --candidate-parent-algorithm {opt,top,pc,mb,ges,greedy,greedy-lite,back-forth}
                            candidate algorithm to use (default: greedy-lite)
      -s {bdeu,bge}, --score {bdeu,bge}
                            score function to use
      -e ESS, --ess ESS     equivalent sample size for BDeu
      -m MAX_ID, --max-id MAX_ID
                            maximum indegree for scores (default: no max-indegree)
      -d D                  maximum indegree for psets which are not subsets of
                            candidates (default: 2)
      -b BURN_IN, --burn-in BURN_IN
                            number of burn-in samples (default: 1000)
      -i ITERATIONS, --iterations ITERATIONS
                            number of iterations after burn-in (default: 1000)
      -n NTH, --nth NTH     sample dag every nth iteration (default: 10)
      -nc N_CHAINS, --n-chains N_CHAINS
                            number of Metropolis coupled MCMC chains (default: 16)
      -r RANDOMSEED, --randomseed RANDOMSEED
                            random seed
      -o OUTPUT_PREFIX, --output-prefix OUTPUT_PREFIX
                            path prefix for output files (default: input file path)
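The input format described in the help text can be illustrated with a small Python sketch. It writes synthetic continuous data as a space-separated file with no header row and "." as the decimal separator, then assembles a command line using only flags listed in the help. The helper names write_continuous_data and build_command are hypothetical and not part of Sumu; the Gaussian data is made up purely for illustration; the file name cont_data.csv, K = 10 and the bge score mirror the example run; and gadget-beeps.py is assumed to be in the working directory. The sketch only prints the command and does not run it.

```python
import random
import shlex
import tempfile
from pathlib import Path


def write_continuous_data(path, n_rows, n_cols, seed=0):
    """Write synthetic data: space separated, no header row, '.' decimals."""
    rng = random.Random(seed)
    with open(path, "w") as f:
        for _ in range(n_rows):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
            # The 'f' format spec is locale independent, so the decimal
            # separator is always "." as the help text requires.
            f.write(" ".join(f"{x:.6f}" for x in row) + "\n")


def build_command(datapath, k, score="bge", burn_in=1000, iterations=1000,
                  nth=10, n_chains=16, seed=None, output_prefix=None):
    """Assemble an argument list from flags documented in the help text.

    The numeric defaults repeat the defaults printed by -h; the score
    default simply mirrors the example run.
    """
    cmd = ["python", "gadget-beeps.py", str(datapath), str(k),
           "-s", score,
           "-b", str(burn_in),
           "-i", str(iterations),
           "-n", str(nth),
           "-nc", str(n_chains)]
    if seed is not None:
        cmd += ["-r", str(seed)]
    if output_prefix is not None:
        cmd += ["-o", str(output_prefix)]
    return cmd


if __name__ == "__main__":
    workdir = Path(tempfile.mkdtemp())
    datapath = workdir / "cont_data.csv"
    # 12 variables, so that K = 10 candidate parents per node is possible.
    write_continuous_data(datapath, n_rows=200, n_cols=12, seed=1)
    print(shlex.join(build_command(datapath, 10, seed=42)))
```

Passing a fixed value to -r makes a run repeatable, and -o can redirect the output files away from the default prefix, which is the input file path.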