/working/auteur.extended/man/rjmcmc.bm.Rd

http://github.com/eastman/auteur · Unknown · 99 lines · 88 code · 11 blank · 0 comment · 0 complexity · 7af0371cda6fc5796d454b3963257a64 MD5 · raw file

  1. \name{rjmcmc.bm}
  2. \alias{rjmcmc.bm}
  3. \title{Bayesian sampling of trait evolutionary rates}
  4. \description{Implements reversible-jump Markov chain Monte Carlo sampling to identify shifts in the process of continuous trait evolution across phylogenetic trees}
  5. \usage{rjmcmc.bm(phy, dat, SE=0, ngen = 1000, sample.freq = 100, prob.mergesplit = 0.05, prob.root = 0.01, lambdaK = log(2), constrainK = FALSE, jumpsize = NULL, simplestart = FALSE, internal.only = FALSE, summary = TRUE, fileBase = "result")}
  6. \arguments{
  7. \item{phy}{a phylogenetic tree of class 'phylo'}
  8. \item{dat}{a named vector of continuous trait values, associated with each species in \code{phy}; see \code{\link[geiger]{name.check}}}
  9. \item{SE}{a named vector of standard errors associated with each trait value; if a single value is given (e.g., \code{SE=0.05}), the standard error is assumed to apply for all trait values}
  10. \item{ngen}{number of sampling generations}
  11. \item{sample.freq}{frequency with which Markov samples are retained (e.g., \code{sample.freq=10} retains every tenth sample in the chain)}
  12. \item{prob.mergesplit}{a value from 0 to 1, signifying the proportion of proposals that split rates or merge rates, thereby increasing or decreasing complexity of the evolutionary model}
  13. \item{prob.root}{a value from 0 to 1, governing the proportion of proposals in which the root state is updated}
  14. \item{lambdaK}{the shape parameter for the Poisson distribution used as a prior on the number of distinct rates in the tree (see \code{\link[stats]{rpois}})}
  15. \item{constrainK}{if given an integer, Markov sampling becomes dimensionally constrained (e.g., \code{constrainK=2} would restrict posterior
  16. samples to be drawn only from models with two independent rates)}
  17. \item{jumpsize}{if \code{NULL}, the proposal width is intialized through a calibration period (see \code{\link{calibrate.jumpsize}}); otherwise,
  18. proposal width is constrained to the supplied value}
  19. \item{simplestart}{if \code{FALSE}, a random starting point (with \code{k} rate parameters) is chosen; if \code{TRUE}, Markov sampling begins under a global-rate model}
  20. \item{internal.only}{a logical switch, dictating whether any branch in the tree can be chosen for a rate shift (\code{internal.only=FALSE})
  21. or whether all but tips can be sampled for a rate shift}
  22. \item{summary}{a logical switch that determines whether any output is printed to the console after the analysis has finished}
  23. \item{fileBase}{a string used to uniquely identify the base directory to which Markov samples are written}
  24. }
  25. \details{
  26. In the two primary objects that store posterior estimates of \code{rate.shifts} and \code{rates} (see \bold{Value}), branches are labeled in accordance with the numeric identifiers found in \code{phy$edge}, which represents a table of ancestor-descendant
  27. relationship (see \code{\link[ape]{read.tree}} and the first plot from the \bold{Examples} below). In the \bold{Examples} below do \bold{not} expect convergence with such short chains!
  28. }
  29. \value{
  30. If \code{summary=TRUE} the global rate of acceptance and calibrated proposal width is returned as a list. After a run has completed, the rates of acceptance for various proposal mechanisms are printed, along with
  31. the calibrated jumpsize used for the last nine-tenths of the chain.
  32. Posterior results are written to several files within a base directory, the contents of which are as follows:
  33. \item{logfile}{a logfile including the following for each Markov sample; the generations at which samples were retained; the generalized model of trait evolution used for inference (e.g., Brownian motion or 'BM");
  34. the mean treewide rate in the sampled generation; the number of independent rates; the root state; and the likelihood of the model at each sample. Do not be alarmed if it seems to take some time before values
  35. begin to appear in the \code{logfile}: samples are not saved until after the calibration period has terminated }
  36. \item{errors}{a logfile to which errors are recorded, if any are generated in the Markov chain}
  37. \item{posteriorsamples}{a compressed .rda file, storing estimates from the Markov chain for relevant parameters; contents of this file can be loaded
  38. with \code{load("posteriorsamples.rda")}. This list object, whose referent is also \code{posteriorsamples} when in R-memory, includes two dataframes:
  39. \code{rates} comprises the relative-rate estimates for each branch in the phylogeny; \code{rate.shifts} stores the branches that were chosen for
  40. rate shifts in each retained sample of the Markov chain (\code{1} signifies a shift;
  41. \code{0} implies a branch that was not sampled for a rate shift). Column names of both components of \code{posteriorsamples} are from the numeric vector of
  42. branch labels (see \bold{Details}). For further information on .Rd and .rda files, see \code{\link[base]{save}} and \code{\link[base]{load}} as well as the
  43. \bold{Examples} below. }
  44. }
  45. \author{LJ Harmon, AL Hipp, and JM Eastman}
  46. \examples{
  47. #############
  48. ## generate tree
  49. n=24
  50. phy=prunelastsplit(birthdeath.tree(b=1,d=0,taxa.stop=n+1))
  51. # plot phylogeny with branches labeled as they are identified in by rjmcmc.bm
  52. plot(phy, label.offset = strwidth(par("pch"), cex = 0.35)) ## PLOT ONE ##
  53. edgelabels(text=phy$edge[,2], frame="c", bg=add.transparency("white",0.75), cex=0.75)
  54. mtext("branch labels used by the reversible-jump sampler")
  55. #############
  56. ## simulate a rate shift
  57. # find an internal edge
  58. anc=get.desc.of.node(Ntip(phy)+1,phy)
  59. branches=phy$edge[,2]
  60. branches=branches[branches>Ntip(phy) & branches!=anc]
  61. branch=branches[sample(1:length(branches),1)]
  62. desc=get.descendants.of.node(branch,phy)
  63. rphy=phy
  64. rphy$edge.length[match(desc,phy$edge[,2])]=phy$edge.length[match(desc,phy$edge[,2])]*64
  65. e=numeric(nrow(phy$edge))
  66. e[match(c(branch,desc),phy$edge[,2])]=1
  67. cols=rev(diverge_hcl(n=2))
  68. dev.new()
  69. plot(phy,edge.col=ifelse(e==1,cols[1],cols[2]), edge.width=2) ## PLOT TWO ##
  70. mtext("expected pattern of rates")
  71. ## simulate data on the 'rate-shifted' tree
  72. dat=rTraitCont(phy=rphy, model="BM", sigma=sqrt(0.1))
  73. ## run two short reversible-jump Markov chains
  74. r=paste(sample(letters,9,replace=TRUE),collapse="")
  75. lapply(1:2, function(x) rjmcmc.bm(phy=phy, dat=dat, ngen=10000, sample.freq=10, prob.mergesplit=0.1, simplestart=TRUE, jumpsize=1, fileBase=paste(r,x,sep=".")))
  76. # collect directories
  77. dirs=dir("./",pattern=paste("BM",r,sep="."))
  78. pool.rjmcmcsamples(base.dirs=dirs, lab=r)
  79. ## view contents of .rda
  80. load(paste(paste(r,"combined.rjmcmc",sep="."),paste(r,"posteriorsamples.rda",sep="."),sep="/"))
  81. print(head(posteriorsamples$rates))
  82. print(head(posteriorsamples$rate.shifts))
  83. ## plot Markov sampled rates
  84. dev.new()
  85. shifts.plot(phy=phy, base=paste(r,"combined.rjmcmc",sep="."), burnin=0.5, legend=TRUE, pdf=FALSE, edge.width=2) ## PLOT THREE ##
  86. ## PASTE UNCOMMENTED FOLLOWING LINE TO DROP DIRECTORIES CREATED BY RJMCMC
  87. # unlink(dir(pattern=paste(r)),recursive=TRUE)
  88. }