Gadget–Beeps

Algorithms for scalable Bayesian learning of causal DAGs.

Developed at the Sums of Products research group at the University of Helsinki. Originally published at NeurIPS 2020 [1].

News

2021-05-22. The implementation of the Beeps algorithm contained some bugs, which are fixed in release 0.1.2 of Sumu. The bugs did not affect the results in the NeurIPS publication [1], which relied on a separate R implementation. This page exists only to provide both algorithms as presented in the paper; for an up-to-date development version, see the Sumu repository.

Installation and use

Both algorithms are implemented in Sumu. After installing Sumu version 0.1.2 with the command pip install sumu==0.1.2 (see its installation instructions), you can run the algorithms from the command line with gadget-beeps.py; the -h flag prints help:

$ python gadget-beeps.py -h
usage: gadget-beeps.py [-h]
                       [-c {opt,top,pc,mb,ges,greedy,greedy-lite,back-forth}]
                       [-s {bdeu,bge}] [-e ESS] [-m MAX_ID] [-d D]
                       [-b BURN_IN] [-i ITERATIONS] [-n NTH] [-nc N_CHAINS]
                       [-r RANDOMSEED] [-o OUTPUT_PREFIX]
                       datapath K

DESCRIPTION

Input
─────

  A path to a space separated file of either discrete or continuous
  data. No header rows for variable names or arities (in the discrete
  case) are assumed. Discrete data is assumed to be integer encoded;
  continuous data uses "." as decimal separator.

  The data path argument should be followed by the number K of candidate
  parents to use for each node, and additional optional arguments as
  explained in this help.

Output
──────

  Files for:
  • Candidate parents found with the selected algorithm.
  • Gadget sampled DAGs.
  • Beeps estimated causal effects (if ran on continuous data).

Example run
───────────

  $ python gadget-beeps.py cont_data.csv 10 -s bge

References
──────────

  [1] Jussi Viinikka, Antti Hyttinen, Johan Pensar, and Mikko
  Koivisto. Towards Scalable Bayesian Learning of Causal DAGs. In
  NeurIPS 2020, in press.

positional arguments:
  datapath              path to data file
  K                     how many candidate parents to include

optional arguments:
  -h, --help            show this help message and exit
  -c {opt,top,pc,mb,ges,greedy,greedy-lite,back-forth}, --candidate-parent-algorithm {opt,top,pc,mb,ges,greedy,greedy-lite,back-forth}
                        candidate algorithm to use (default: greedy-lite)
  -s {bdeu,bge}, --score {bdeu,bge}
                        score function to use
  -e ESS, --ess ESS     equivalent sample size for BDeu
  -m MAX_ID, --max-id MAX_ID
                        maximum indegree for scores (default: no max-indegree)
  -d D                  maximum indegree for psets which are not subsets of
                        candidates (default: 2)
  -b BURN_IN, --burn-in BURN_IN
                        number of burn-in samples (default: 1000)
  -i ITERATIONS, --iterations ITERATIONS
                        number of iterations after burn-in (default: 1000)
  -n NTH, --nth NTH     sample dag every nth iteration (default: 10)
  -nc N_CHAINS, --n-chains N_CHAINS
                        number of Metropolis coupled MCMC chains (default: 16)
  -r RANDOMSEED, --randomseed RANDOMSEED
                        random seed
  -o OUTPUT_PREFIX, --output-prefix OUTPUT_PREFIX
                        path prefix for output files (default: input file
                        path)
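
The input format described in the help text (space-separated, no header row, "." as the decimal separator) can be produced with a few lines of Python. The following is only a minimal sketch: the helper names write_continuous_data and build_command are hypothetical and not part of Sumu, the data is random noise for illustration, and the command uses only flags listed in the help output above.

```python
import random
import shlex
from pathlib import Path


def write_continuous_data(path, n_rows, n_cols, seed=0):
    """Write a headerless, space-separated file of continuous values.

    Hypothetical helper (not part of Sumu); values are random noise.
    """
    rng = random.Random(seed)
    with open(path, "w") as f:
        for _ in range(n_rows):
            # The "f" format always uses "." as the decimal separator.
            row = (f"{rng.gauss(0.0, 1.0):.6f}" for _ in range(n_cols))
            f.write(" ".join(row) + "\n")


def build_command(datapath, k, score="bge", n_chains=None, output_prefix=None):
    """Assemble an argument list from flags shown in the help text.

    Hypothetical helper; it only builds the list and runs nothing.
    """
    cmd = ["python", "gadget-beeps.py", str(datapath), str(k), "-s", score]
    if n_chains is not None:
        cmd += ["-nc", str(n_chains)]
    if output_prefix is not None:
        cmd += ["-o", str(output_prefix)]
    return cmd


if __name__ == "__main__":
    data = Path("cont_data.csv")
    write_continuous_data(data, n_rows=200, n_cols=12)
    # Print the command instead of running it, so the sketch works
    # even where gadget-beeps.py is not available.
    print(shlex.join(build_command(data, 10)))
```

The printed command can be pasted into a shell, or the list passed to subprocess.run, once gadget-beeps.py and Sumu 0.1.2 are available.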

References