A Comparative Tutorial of Bayesian Sequential Design and Reinforcement Learning

Welcome! This repository contains the code for our paper.

The paper develops two examples of sequential stopping problems inspired by adaptive clinical trials, illustrating the elements common to Bayesian Sequential Design (BSD) and Reinforcement Learning (RL). In addition, all the figures in the paper can be reproduced with the R Markdown notebooks in the notebooks folder.

The outputs of each method (BSD or RL) are stored in the results/{method}_ex{example number} folder (e.g., results/bsd_ex1).

Organization of the Code

For convenience, the code is split into separate scripts for the BSD and RL algorithms. The BSD examples are coded in R and the RL examples in Python, because we wanted to use standard tools and libraries from each literature. For instance, the RL examples are coded using Stable Baselines 3, a popular Python package for RL. Please see the paper for explanations of Example 1 and Example 2 (and to learn more about BSD and RL! =D)

BSD

To run constrained backward induction for Example 1 and the parametric boundaries method for Example 2, use:

Rscript --vanilla bsd_ex1.R
Rscript --vanilla bsd_ex2.R

The outputs include diagnostic plots and model output data; tracedata.csv and uhatdata.csv are used by the R Markdown notebooks to reproduce the paper figures.

Running both examples should take about 15 min on a regular laptop.

RL

For RL, the script rl_train.py is the entry point for both examples:

python rl_train.py --env=ex1 --algo=dqn
python rl_train.py --env=ex2 --algo=ppo

The file rl_envs.py encodes Examples 1 and 2 as RL "environments" using the standard OpenAI Gym API. This lets us plug in the DQN and PPO implementations from Stable Baselines 3.
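To give a flavor of what this looks like, here is a minimal, self-contained sketch (not the repository's actual code) of a toy sequential stopping environment together with a Stable Baselines 3 training call. The class name, the Beta prior, the sampling cost, and the terminal reward below are all hypothetical placeholders; the real environments live in rl_envs.py. The sketch assumes Stable Baselines 3 v1.x with the classic Gym API.

import gym
import numpy as np
from gym import spaces
from stable_baselines3 import DQN

class ToyStoppingEnv(gym.Env):
    # Toy sequential stopping problem: observe Bernoulli outcomes one at a
    # time and decide when to stop. State = (successes, trials) so far.
    # Actions: 0 = continue sampling (pay a cost), 1 = stop.

    def __init__(self, horizon=50, cost=0.01):
        self.horizon = horizon  # maximum number of observations
        self.cost = cost        # per-observation sampling cost
        self.action_space = spaces.Discrete(2)
        self.observation_space = spaces.Box(
            low=0.0, high=float(horizon), shape=(2,), dtype=np.float32)

    def reset(self):
        self.theta = np.random.beta(1.0, 1.0)  # draw the truth from a uniform prior
        self.successes, self.trials = 0, 0
        return self._obs()

    def _obs(self):
        return np.array([self.successes, self.trials], dtype=np.float32)

    def step(self, action):
        if action == 1 or self.trials >= self.horizon:
            # Illustrative terminal reward: utility of the best simple
            # decision under the posterior mean (the paper's utilities differ).
            post_mean = (1 + self.successes) / (2 + self.trials)
            return self._obs(), max(post_mean, 1 - post_mean), True, {}
        self.successes += int(np.random.rand() < self.theta)
        self.trials += 1
        return self._obs(), -self.cost, False, {}

model = DQN("MlpPolicy", ToyStoppingEnv(), verbose=0)
model.learn(total_timesteps=10_000)

Because the environment follows the Gym API, swapping DQN for PPO (as rl_train.py's --algo flag suggests) only requires changing the imported algorithm class.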

Running both examples should take about 2 hours on a regular laptop.

Reproducing paper figures

After executing the instructions above, run the R Markdown notebooks plots_ex1.Rmd and plots_ex2.Rmd. The easiest way is to run them in RStudio. Alternatively:

cd notebooks
Rscript --vanilla -e "rmarkdown::render('plots_ex1.Rmd')"
Rscript --vanilla -e "rmarkdown::render('plots_ex2.Rmd')"

Software requirements

The code was developed with R 4.0 and Python 3.9, keeping software requirements to a minimum. Here is a list of the required packages.

R

  • tidyverse
  • cowplot
  • latex2exp
  • MASS
  • rmarkdown (for the paper figures)
  • reticulate (for interoperability with Python, but only used for the plots)
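
All of these are available from CRAN, so they can be installed in one shot with, for example:

Rscript --vanilla -e "install.packages(c('tidyverse', 'cowplot', 'latex2exp', 'MASS', 'rmarkdown', 'reticulate'))"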

Python

  • numpy
  • torch (deep learning engine)
  • stable_baselines3 (reinforcement learning algorithms)
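
Assuming installation via pip (note that the PyPI package name is stable-baselines3, while the import name is stable_baselines3), something like the following should suffice:

pip install numpy torch stable-baselines3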
