Calculating derived parameters

2022-03-23

One of the advantages of Bayesian MCMC analysis is the ability to compute derived parameters from the parameters sampled during MCMC. Since MCMC chains are available for the primary parameters, chains can also be calculated for the derived parameters, thus providing posterior samples for those derived parameters which were not sampled themselves during the MCMC run.

Preparation

In this vignette, we wil use the same network model as in the Quick Start tutorial.

exp <- tibble::tribble(
  ~time.day,  ~species, ~biomass, ~prop15N,
          0,   "algae",     1.02,  0.00384,
          1,   "algae",       NA,   0.0534,
        1.5,   "algae",    0.951,       NA,
          2,   "algae",    0.889,   0.0849,
        2.5,   "algae",       NA,   0.0869,
          3,   "algae",    0.837,   0.0816,
          0, "daphnia",     1.74,  0.00464,
          1, "daphnia",       NA,  0.00493,
        1.5, "daphnia",     2.48,       NA,
          2, "daphnia",       NA,  0.00831,
        2.5, "daphnia",     2.25,       NA,
          3, "daphnia",     2.15,   0.0101,
          0,     "NH4",    0.208,     0.79,
          1,     "NH4",    0.227,       NA,
        1.5,     "NH4",       NA,    0.482,
          2,     "NH4",    0.256,    0.351,
        2.5,     "NH4",       NA,    0.295,
          3,     "NH4",     0.27,        NA
  )

# Separate initial conditions and observations
inits <- exp %>% filter(time.day == 0)
obs <- exp %>% filter(time.day > 0)

# Build the network model
mod <- new_networkModel() %>%
  set_topo("NH4 -> algae -> daphnia -> NH4") %>%
  set_init(inits, comp = "species", size = "biomass",
           prop = "prop15N") %>%
  set_obs(obs, comp = "species", size = "biomass",
          prop = "prop15N", time = "time.day")
## Using default distribution family for proportions ("gamma_cv").
##   (eta is the coefficient of variation of gamma distributions.)
## Using default distribution family for sizes ("normal_cv").
##   (zeta is the coefficient of variation of normal distributions.)
# Set reasonable but vague priors for this model
mod <- set_priors(mod, normal_p(0, 5))

We run the MCMC sampler to fit the model:

fit <- run_mcmc(mod, iter = 1000)
plot(fit)
# Note: the figure below only shows a few of the traceplots for vignette concision

plot of chunk unnamed-chunk-4

Selecting primary parameters

The primary parameters sampled during MCMC are:

coda::varnames(fit)
## [1] "eta"                      "lambda_algae"             "lambda_daphnia"          
## [4] "lambda_NH4"               "upsilon_algae_to_daphnia" "upsilon_daphnia_to_NH4"  
## [7] "upsilon_NH4_to_algae"     "zeta"

We can select parameters using the select() function. For example:

plot(fit %>% select(zeta))

plot of chunk unnamed-chunk-6

Note that selecting parameters in a MCMC output uses partial matching:

plot(fit %>% select(upsilon))

plot of chunk unnamed-chunk-7

Selecting chains with select() is mostly useful in interactive mode, when exploring the results of an MCMC run. You can select individual parameters without any partial matching using the unambiguous[, ""] notation:

summary(fit[, "eta"])
## 
## Iterations = 501:1000
## Thinning interval = 1 
## Number of chains = 4 
## Sample size per chain = 500 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##           Mean             SD       Naive SE Time-series SE 
##       0.128347       0.046998       0.001051       0.001949 
## 
## 2. Quantiles for each variable:
## 
##    2.5%     25%     50%     75%   97.5% 
## 0.06567 0.09649 0.11836 0.15177 0.23885

The [, ""] notation is the one to use when you want to combine primary parameters into derived parameters through calculation.

Calculating derived parameters

Basic arithmetic operations as well as \(\log\) and \(\exp\) can be used when doing calculation on parameter chains. For example, to calculate the turnover rate of algae (i.e. the proportion of nutrient exiting the algae compartment per unit time), we can sum the rates of all the outgoing fluxes from algae, in this case the transfer rate from algae to daphnia and the loss rate from algae:

trate_algae <- fit[, "upsilon_algae_to_daphnia"] + fit[, "lambda_algae"]

The newly formed object, trate_algae, is a legitimate mcmc.list object itself:

str(trate_algae)
## List of 4
##  $ : 'mcmc' num [1:500] 0.2 0.126 0.171 0.178 0.402 ...
##   ..- attr(*, "mcpar")= num [1:3] 501 1000 1
##  $ : 'mcmc' num [1:500] 0.2451 0.1146 0.0659 0.056 0.2187 ...
##   ..- attr(*, "mcpar")= num [1:3] 501 1000 1
##  $ : 'mcmc' num [1:500] 0.2604 0.0663 0.1057 0.1074 0.2063 ...
##   ..- attr(*, "mcpar")= num [1:3] 501 1000 1
##  $ : 'mcmc' num [1:500] 0.145 0.266 0.22 0.268 0.161 ...
##   ..- attr(*, "mcpar")= num [1:3] 501 1000 1
##  - attr(*, "class")= chr [1:3] "derived.mcmc.list" "networkModelStanfit" "mcmc.list"

All the usual operations done on mcmc.list objects can be used with it, such as visualizing the trace:

plot(trate_algae)

plot of chunk unnamed-chunk-11

or calculating summaries:

summary(trate_algae)
## 
## Iterations = 501:1000
## Thinning interval = 1 
## Number of chains = 4 
## Sample size per chain = 500 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##           Mean             SD       Naive SE Time-series SE 
##       0.183862       0.071829       0.001606       0.002372 
## 
## 2. Quantiles for each variable:
## 
##    2.5%     25%     50%     75%   97.5% 
## 0.07329 0.13113 0.17659 0.22582 0.35040

A derived parameter can itself be used in subsequent calculations! For example, if we want to calculate the turnover time for the nutrient in algae, we can take the inverse of the turnover rate:

ttime_algae <- 1 / trate_algae
plot(ttime_algae)

plot of chunk unnamed-chunk-13