This vignette shows a case study of New York influenza data, using phylodyn. We start by loading the phylodyn package.
library(phylodyn)

In preparation, we aligned the sequences using the software MUSCLE (Edgar, 2004) and inferred a maximum clade credibility genealogy using the software BEAST (Drummond et al., 2012; see footnote 1). We packaged this genealogy in a phylo object, NY_flu, which we now load.
data(NY_flu)

We use BNPR and BNPR_PS to calculate approximate posterior marginals of the effective population size, without and with a preferential sampling model, respectively.
NY_cond = BNPR(data = NY_flu, lengthout = 100)
NY_pref = BNPR_PS(data = NY_flu, lengthout = 100)

We plot the results. Because the genealogy is stored in units of weeks and we standardize on units of years, we pass a scaling factor of 1/52 (yscale) to plot_BNPR.
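To make the unit conversion concrete, here is a minimal base-R sketch of the arithmetic behind the axis labels and the scaling factor. It assumes that yscale acts as a simple multiplicative factor on the y-axis values; the names weeks_per_year, tick_weeks, tick_years, ne_weeks, and ne_years, and the example values in ne_weeks, are hypothetical and chosen only for illustration.

```r
# Illustrative sketch only: base R, no phylodyn calls.
weeks_per_year <- 52
yscale <- 1 / weeks_per_year  # weeks -> years, assuming multiplicative scaling

# Tick positions in weeks, built the same way as the axlabs definition,
# paired with the calendar-year labels they are meant to display.
tick_weeks <- seq(0, 12 * weeks_per_year, by = weeks_per_year) + 10
tick_years <- seq(2005, 1993, by = -1)

# Each tick position needs exactly one label.
stopifnot(length(tick_weeks) == length(tick_years))

# Hypothetical y-axis values in weeks, rescaled to years.
ne_weeks <- c(10, 100, 500)
ne_years <- ne_weeks * yscale

print(head(data.frame(week = tick_weeks, year = tick_years)))
print(ne_years)
```

The plotting calls that follow use the same ingredients: axlabs pairs week positions with year labels, and ylim = c(10, 500)/52 expresses the y-axis limits in years.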
axlabs = list(x = seq(0, 12*52, by=52) + 10, labs = seq(2005, 1993, by=-1))
par(mfrow=c(1,3), cex=0.9, cex.lab=1.5, cex.main=1.7, oma=c(2.5, 2, 0, 0)+0.1,
    mar=c(2,1.5,2,1), mgp = c(2.5,1,0), xpd=NA,
    fig=c(0, 0.32, 0, 1))
plot_BNPR(NY_cond, main="BNPR", ylim = c(10, 500)/52, yscale = 1/52,
          col = rgb(0.829, 0.680, 0.306), axlabs = axlabs, heatmap_labels_side = "left")
par(fig=c(0.32, 0.64, 0, 1), new=TRUE)
plot_BNPR(NY_pref, main="BNPR-PS", ylim = c(10, 500)/52, yscale = 1/52, ylab = "",
          col = rgb(0.330, 0.484, 0.828), axlabs = axlabs, heatmap_labels = FALSE)
par(mar=c(2,3.5,2,1), fig=c(0.64, 1.0, 0, 1), new=TRUE)
plot_mrw(list(NY_cond, NY_pref), axlabs = axlabs, ylab = "Mean Relative Width",
         cols = c(rgb(0.829, 0.680, 0.306), rgb(0.330, 0.484, 0.828)),
         legends = c("BNPR", "BNPR-PS"), legend_place = "topright", legend_cex = 0.8)

R. C. Edgar. MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 32:1792–1797, 2004.
A. J. Drummond, M. A. Suchard, D. Xie, and A. Rambaut. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Molecular Biology and Evolution, 29:1969–1973, 2012.
A. Rambaut, O. G. Pybus, M. I. Nelson, C. Viboud, J. K. Taubenberger, and E. C. Holmes. The genomic and epidemiological dynamics of human influenza A virus. Nature, 453(7195):615–619, 2008.
M. D. Karcher, J. A. Palacios, T. Bedford, M. A. Suchard, and V. N. Minin. Quantifying and mitigating the effect of preferential sampling on phylodynamic inference. arXiv preprint arXiv:1510.00775, 2015.
Footnote 1: We inferred the genealogy branch lengths in units of years using a strict molecular clock, a constant effective population size prior, and an HKY substitution model in which the first two nucleotides of a codon share the same estimated transition matrix, while the third nucleotide’s transition matrix is estimated separately.