--- title: "Simple MCMC under SIR" author: "Lam ST Ho and Marc A Suchard" date: "`r Sys.Date()`" output: pdf_document vignette: > %\VignetteIndexEntry{Simple_SIR_MCMC} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- We describe how to set-up and run a simple Metropolis-Hastings-based Markov chain Monte Carlo (MCMC) sampler under the susceptible-infected-removed (SIR) model. ```{r} library(MultiBD) ``` This example uses the Eyam data that consist the population counts of susceptible, infected and removed individuals across several time points. ```{r} data(Eyam) Eyam ``` The log likelihood function is the sum of the log of the transition probabilities between two consecutive observations. Note that, we will use $(\log \alpha, \log \beta)$ as parameters instead of $(\alpha, \beta)$. The rows and columns of the transition probability matrix returned by *dbd_prob()* correspond to possible values of $S$ (from $a$ to $a0$) and $I$ (from $0$ to $B$) respectively. ```{r} loglik_sir <- function(param, data) { alpha <- exp(param[1]) # Rates must be non-negative beta <- exp(param[2]) # Set-up SIR model drates1 <- function(a, b) { 0 } brates2 <- function(a, b) { 0 } drates2 <- function(a, b) { alpha * b } trans12 <- function(a, b) { beta * a * b } sum(sapply(1:(nrow(data) - 1), # Sum across all time steps k function(k) { log( dbd_prob( # Compute the transition probability matrix t = data$time[k + 1] - data$time[k], # Time increment a0 = data$S[k], b0 = data$I[k], # From: S(t_k), I(t_k) drates1, brates2, drates2, trans12, a = data$S[k + 1], B = data$S[k] + data$I[k] - data$S[k + 1], computeMode = 4, nblocks = 80 # Compute using 4 threads )[1, data$I[k + 1] + 1] # To: S(t_(k+1)), I(t_(k+1)) ) })) } ``` Here, we choose $\text{Normal}(0, 100^2)$ as the prior for both $\log \alpha$ and $\log \beta$. ```{r} logprior <- function(param) { log_alpha <- param[1] log_beta <- param[2] dnorm(log_alpha, mean = 0, sd = 100, log = TRUE) + dnorm(log_beta, mean = 0, sd = 100, log = TRUE) } ``` We will use the random walk Metropolis algorithm implemented in the function *MCMCmetrop1R()* (**MCMCpack** package) to explore the posterior distribution. So, we first need to install the package and its dependencies. ```{r loadStuff, message=FALSE, warning=FALSE, echo=TRUE, eval=FALSE} source("http://bioconductor.org/biocLite.R") biocLite("graph") biocLite("Rgraphviz") install.packages("MCMCpack", repos = 'http://cran.us.r-project.org') library(MCMCpack) ``` ```{r doLoadStuff, echo=FALSE, eval=TRUE, include=FALSE} # Provide manual caching because knitr's caching # is not working in my hands file <- system.file("vignetteCache", "post_sample.RData", package="MultiBD") if (!file.exists(file)) { <> } ``` The starting point of our Markov chain is the estimated value of $(\alpha, \beta)$ from Raggett (1982). ```{r} alpha0 <- 3.39 beta0 <- 0.0212 ``` We discard the first $200$ iterations and keep the next $1000$ iterations of the chain. ```{r longMCMCRun, eval=FALSE} post_sample <- MCMCmetrop1R(fun = function(param) { loglik_sir(param, Eyam) + logprior(param) }, theta.init = log(c(alpha0, beta0)), mcmc = 1000, burnin = 200) ``` ```{r runLongMCMCRun, echo=FALSE, eval=TRUE} # Provide manual caching because knitr's caching # is not working in my hands if (file.exists(file)) { load(file) } else { <> # dir.create("cache", showWarnings = FALSE) save(post_sample, file = "../inst/vignetteCache/post_sample.RData") } ``` The trace plots of both $\log \alpha$ and $\log \beta$ look good. ```{r} plot(as.vector(post_sample[,1]), type = "l", xlab = "Iteration", ylab = expression(log(alpha))) plot(as.vector(post_sample[,2]), type = "l", xlab = "Iteration", ylab = expression(log(beta))) ``` We can visualize the joint posterior distribution of $\log \alpha$ and $\log \beta$ using the **ggplot2** package. ```{r plot, warning=FALSE} library(ggplot2) x = as.vector(post_sample[,1]) y = as.vector(post_sample[,2]) df <- data.frame(x, y) ggplot(df,aes(x = x,y = y)) + stat_density2d(aes(fill = ..level..), geom = "polygon", h = 0.26) + scale_fill_gradient(low = "grey85", high = "grey35", guide = FALSE) + xlab(expression(log(alpha))) + ylab(expression(log(beta))) ``` We can also construct the $95\%$ Bayesian credible intervals for $\alpha$ and $\beta$. ```{r} quantile(exp(post_sample[,1]), probs = c(0.025,0.975)) quantile(exp(post_sample[,2]), probs = c(0.025,0.975)) ```