Beast Phylogenetics Manual


Computational evolutionary biology, statistical phylogenetics and coalescent-based population genetics are becoming increasingly central to the analysis and understanding of molecular sequence data. We present the Bayesian Evolutionary Analysis by Sampling Trees (BEAST) software package version 1.7, which implements a family of Markov chain Monte Carlo (MCMC) algorithms for Bayesian phylogenetic inference, divergence time dating, coalescent analysis, phylogeography and related molecular evolutionary analyses.


This package includes an enhanced graphical user interface program called Bayesian Evolutionary Analysis Utility (BEAUti) that enables access to advanced models for molecular sequence and phenotypic trait evolution that were previously available to developers only. The package also provides new tools for visualizing and summarizing multispecies coalescent and phylogeographic analyses. BEAUti and BEAST 1.7 are open source under the GNU Lesser General Public License and available online.

Introduction

Molecular sequences, morphological measurements, geographic distributions and fossil remains all provide a wealth of potential information about the evolutionary history of life on Earth, the dynamics of ancient and modern biological populations, and the emergence and spread of infectious diseases.

One of the challenges of modern Evolutionary Biology is the integration of these different data sources to address evolutionary hypotheses over the full range of spatial and temporal scales. The field is witnessing a transition to an increasingly quantitative science.

This transformation began first through an explosion of molecular sequence data with the parallel development of mathematical and computational tools for their analysis. However, increasingly this transformation can be observed in other aspects of Evolutionary Biology where large global databases of complementary sources of information, such as fossils, geographical distributions and population history are being curated and made publicly available.

Software Advances

Here, we present a major new version of the molecular evolutionary software package Bayesian Evolutionary Analysis by Sampling Trees (BEAST), updated to version 1.7, and representing a significant software advance over that previously described. Alongside the primary analysis engine in BEAST, this package also includes a suite of utilities for specifying the analysis design, processing output files and summarizing and visualizing the results. Taken together, these programs enable Bayesian inference of molecular sequences with an emphasis on time-structured evolutionary models including phylodynamic models, divergence time estimates, multilocus demographic models, gene/species–tree inference, a range of spatial phylogeographic analyses and discrete and continuous trait evolution.

FigTree is designed as a graphical viewer of phylogenetic trees and as a program for producing publication-ready figures. As with most of my programs, it was written for my own needs so may not be as polished and feature-complete as a commercial program.

Implementing Markov chain Monte Carlo (MCMC) algorithms to perform these inferences, the package is intended and used for rigorous statistical inference and hypothesis testing of evolutionary models with joint inference of phylogeny. It is also possible to constrain portions of the phylogenetic model space to known values, including the tree topology, and perform conditional inference if required.

User Interface

One area of significant improvement since the last release publication is in the analysis construction and model specification tool called Bayesian Evolutionary Analysis Utility (BEAUti).

This acts as the graphical user interface (GUI) for BEAST and allows the user to import data, select models, choose prior distributions on individual parameters and specify the settings for the MCMC sampler. Although the BEAST model specification format (a standard XML format structured text file) allows for great flexibility in the construction of complex evolutionary models, the constraints of a GUI unavoidably restrict the scope of the researcher to a prespecified set of models and combinations, hiding many advanced inference models. Working directly within the BEAST XML input format, on the other hand, represents a high barrier to the accessibility of BEAST and incurs significant risk of inadvertent errors being introduced into the model. We have concentrated development efforts on BEAUti to provide greater flexibility in model specification while still maintaining the benefits of a visual, table-based representation of the model and automatic generation of BEAST XML files.

Improvements to BEAUti provide support for multiple data partitions in a joint analysis and the input of fossil calibration and trait information.

Heterogeneous Data

Multiple data partitions may reflect separate loci for simultaneous inference of genealogies and species trees and stochastic ancestral recombination graph reconstruction, or the growing wealth of nonsequence data and their respective substitution models. These latter data and models include microsatellite markers, phenotypic traits under a multistate stochastic Dollo process, discretized geographic diffusion, and multivariate continuous relaxed random walks.

We also ease the use of a growing number of tree prior specifications. These include the extended Bayesian skyline model for multilocus data, the flexible Gaussian Markov random field skyride model, and birth–death models of speciation.

Molecular Clocks

We have refined the relaxed clock models to allow more than one branch to have the same rate value to remove anticorrelation. In practice this will only have any appreciable impact on trees that have a small number of branches.

Examples

Our first example presents a reconstruction of the gene tree relating 13 species of Darwin's finches from a 2,065-bp partial nucleotide alignment of the mitochondrial control region and cytochrome b genes and five continuously measured phenotypic traits of the corresponding species.

In performing this simultaneous inference, we exploit the random local clock (RLC) model and find evidence for one suggestive rate change (Bayes factor in favor of the RLC over a strict clock = 2.3) in the lineage leading to the Cocos Island Finch, Pinaroloxias inornata. Multivariate Brownian trait diffusion shows strong correlation between wing and tarsus length and between bill depth and gonys length. Posterior trait prediction at any point along the history is possible and, currently unique to BEAST, comparative method inference is performed jointly with phylogenetic inference.

Figure: Simultaneous phylogenetic and phenotypic trait reconstruction of Darwin's finches. Plotted are the maximum clade credibility tree and posterior estimate of the trait correlation matrix. We annotate the tree with estimates of selected posterior clade support values and the one significant nucleotide substitution local clock (in red); the branches are scaled in expected substitutions per site. We depict correlation coefficients through their bivariate ellipse sizes, where more highly correlated phenotypes return narrower ellipses.

Our second example demonstrates the application of the multispecies coalescent model (*BEAST) to a 1,165-bp fragment of the mitochondrial genome sequenced from 16 Darwin's finches representing four species (Geospiza fortis, G. magnirostris, Camarhynchus parvulus, and Certhidea olivacea). The accompanying figure shows 1) a representative gene tree and 2) the two species trees with highest posterior probability. The 99% credible set for the species tree contains 3 of the 15 possible tree topologies: 65.8% (((F, M), P), O); 17.2% ((F, M), (P, O)); 16.5% (((F, M), O), P).
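The figure of 15 possible topologies for four species follows from the standard count of rooted, bifurcating, labeled trees, (2n − 3)!!. The short Python sketch below is an illustration only (the function and variable names are ours, not part of BEAST); it reproduces that count and sums the reported posterior probabilities of the three topologies in the credible set.

```python
def count_rooted_binary_topologies(n_taxa: int) -> int:
    """Number of rooted, bifurcating, labeled topologies: (2n - 3)!!."""
    if n_taxa < 1:
        raise ValueError("n_taxa must be at least 1")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3.
    for odd in range(3, 2 * n_taxa - 2, 2):
        count *= odd
    return count


# Posterior probabilities of the three species-tree topologies, as reported in the text.
credible_set = {
    "(((F, M), P), O)": 0.658,
    "((F, M), (P, O))": 0.172,
    "(((F, M), O), P)": 0.165,
}

n_topologies = count_rooted_binary_topologies(4)
credible_mass = sum(credible_set.values())

print(f"possible rooted topologies for 4 species: {n_topologies}")
print(f"posterior mass in the credible set: {credible_mass:.3f}")
```

The three topologies together carry about 99.5% of the posterior mass, which is why they make up the 99% credible set.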

The uncertainty in the species tree estimate arises despite overwhelming support for Certhidea olivacea and Camarhynchus parvulus as the nested outgroup species according to the gene tree (panel a), due to the possibility of incomplete lineage sorting in the deeper branches of the gene tree. The possibility of incomplete lineage sorting can be appreciated in panel c, in which a representative gene tree is embedded inside the most probable species tree topology for these data, showing extensive incomplete lineage sorting in the Geospiza clade and also depicting the reason that species trees necessarily have (sometimes much) younger divergence times than the corresponding gene tree might suggest. This example demonstrates that even for single-gene analyses, the multispecies coalescent can provide 1) important insight into the potential for incomplete lineage sorting, 2) more accurate assessment of uncertainty in the species tree estimate and 3) better estimates of species divergence times.

Figure: (a) Representative gene tree of a mitochondrial DNA fragment from 16 Darwin's finches of four species (Geospiza fortis, G. magnirostris, Camarhynchus parvulus, and Certhidea olivacea). Nodes that have posterior clade probabilities greater than 0.5 are labeled with their posterior clade probability. (b) The two most probable species trees (solid line represents the most probable species tree; dashed line is the second most probable). (c) Gene tree embedded in a point estimate of the species tree, including divergence times and effective population sizes. The x axis is divergence time in units of substitutions per site and the y axis is proportional to effective population size.

Availability and Future Directions

We make the BEAST package available in both executable and source code forms. BEAST requires Java version 1.5 or greater, and executables for Windows, Mac OS and Linux platforms are available from the main page for the package. This page also links to a sizable list of self-contained step-by-step tutorials covering basic to advanced usage of BEAST.

Popular tutorials describe how to use BEAST to infer population dynamics and phylogeographic processes and walk users all the way through to generating a range of graphical summaries of their results.

GoogleCode houses BEAST's version-controlled source code and links to two GoogleGroup discussion groups related to BEAST. The first is the “beast-users” group with over 1,500 members. At the time of writing, 47 developers belong to the “beast-dev” group that facilitates BEAST development across three continents.

Future development directions for BEAUti and BEAST focus on easing the user experience in several ways. These include fitting hierarchical phylogenetic models that commonly arise in studies of intrahost viral evolution, exploiting MarkovJump methods for computationally efficient and robust estimation of complex evolutionary processes under simple models, and specifying phylogeographic models in a convenient geographical user interface.


We thank the National Evolutionary Synthesis Center for sponsoring a working group (Software for Bayesian Evolutionary Analysis) that facilitated the development of BEAST version 1.7. We would also like to thank the many developers and contributors to BEAST, including: Alex Alekseyenko, Trevor Bedford, Erik Bloomquist, Joseph Heled, Sebastian Hoehna, Philippe Lemey, Sibon Li, Gerton Lunter, Sidney Markowitz, Vladimir Minin, Michael Defoin Platel, Oliver Pybus, Beth Shapiro and Chieh-Hsi Wu.

This work was supported in part by funding from the Marsden Trust, National Science Foundation (DMS 0856099), National Institutes of Health (R01 GM086887, R01 HG006139), The Royal Society of London, Biotechnology and Biological Sciences Research Council (BB/H011285/1) and the Wellcome Trust (WT092807MA).

Introduction

The goal of this lab is to introduce you to MrBayes, one of the two major software packages for conducting Bayesian phylogenetic analysis (the other being BEAST).

Bayesian phylogenetics

Bayesian inference in phylogeny (also called Bayesian phylogenetics) is based upon a quantity called the posterior probability distribution of trees, which is the probability of a tree given the observations (alignment of sequences) and the model.
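For reference, this posterior can be written out explicitly. The notation here is ours rather than the lab's: τ_i is a tree, X is the alignment, B(s) is the number of possible trees for s taxa, and the likelihood P(X | τ_i) is assumed to have branch lengths and substitution-model parameters already integrated out.

```latex
P(\tau_i \mid X) \;=\;
  \frac{P(X \mid \tau_i)\, P(\tau_i)}
       {\sum_{j=1}^{B(s)} P(X \mid \tau_j)\, P(\tau_j)}
```

The denominator sums over every possible tree, which is what makes the posterior intractable to compute directly for all but the smallest datasets.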

The conditioning is accomplished using Bayes's theorem. As the posterior probability distribution of trees is impossible to calculate analytically, phylogenetic programs using a Bayesian approach (such as MrBayes or PhyloBayes) use a simulation technique called Markov chain Monte Carlo (MCMC) to approximate the posterior probabilities of trees.

MrBayes

MrBayes is a free program for the Bayesian estimation of phylogeny, developed by John Huelsenbeck, Bret Larget, Paul van der Mark, Fredrik Ronquist, and Donald Simon.

Tracy Heath, Conor Meehan, and Brian Moore wrote an excellent MrBayes tutorial for workshops on applied phylogenetics and molecular evolution, which I abbreviated for this lab. I encourage you to read the original version on your own to get a better understanding of the theory involved.

Getting Started

Software

Download the newest versions of the required software if you work on your own computer.

Datasets

The data files for the exercises are provided with this lab.

Exercise 1: Model Selection & Partitioning using Bayes Factors

The goal of the first exercise is to select a partitioning scheme for a dataset. For most sequence alignments, several to many partition schemes of varying complexity are plausible a priori, and the choice of a partitioning scheme may influence the results of the analysis. Our goal, therefore, is to identify the partition scheme that balances estimation bias and error variance associated with under- and over-parameterized mixed models, respectively.

In Bayesian inference, this goal is achieved using Bayes factors, which are ratios of the marginal likelihoods of the candidate partition schemes. The analysis pipeline that we will use in this tutorial is depicted below.

Open the file coniferdna.nex in your text editor. This file contains the sequences for 2 different genes sampled from 9 species. The elements of the DATA block indicate the type of data, number of taxa, and length of the sequences.

Open the batch file, coniferpartn.nex, in a text editor. This file contains all of the commands required to perform the necessary analyses to explore various partition schemes (unpartitioned, partitioned by gene region, and partitioned by gene region+codon position).

The details of each command are described in adjacent comments, surrounded in brackets; e.g., [this is a comment].

Running the analysis

    MrBayes > lset nst=6 rates=gamma

This command specifies a substitution matrix with six relative substitution rates (nst=6) with gamma-distributed rate variation across sites (rates=gamma).

Because models are specified this way, some DNA models are not available in MrBayes. With the nst element of the lset command, we can specify the JC69 or F81 models (nst=1), the K2P or HKY models (nst=2), or the GTR model (nst=6).

In Bayesian analysis we treat parameters as random variables, which requires that we specify a prior probability density for them. Accordingly, we need to specify priors for all of the parameters of the specified nucleotide substitution model. The command for modifying priors is the prset command. Use the help command to view the list of priors available for modification and their current values.

    MrBayes > help prset

The help page starts with a short explanation of each prior option. Most of the prior distributions should look familiar to you (review our intro to R exercise, if not!).

Let's go through some important priors:

  1. First, we have a flat Dirichlet prior on the 6 exchangeability parameters: revmatpr=dirichlet(1,1,1,1,1,1).
  2. We also have a flat Dirichlet prior on the 4 base frequencies (πA, πC, πG, πT): statefreqpr=dirichlet(1,1,1,1).
  3. We also need a prior for the rate heterogeneity. The default value of shapepr is Uniform(0.0,200.0). We will change it to an exponential distribution with a rate parameter, λ, equal to 0.05.
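To get a feel for what this change implies, the following Python sketch tabulates the exponential density with λ = 0.05 next to the flat Uniform(0, 200) density. It is an illustration only: the function names are ours, and the grid of shape values is an arbitrary choice.

```python
import math

RATE = 0.05                  # lambda of the exponential prior, from the text
UNIFORM_UPPER = 200.0        # upper bound of the default Uniform(0, 200) prior


def exponential_pdf(x: float, rate: float = RATE) -> float:
    """Density of an exponential distribution with the given rate."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def exponential_cdf(x: float, rate: float = RATE) -> float:
    """Probability that an exponential variable is at most x."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0


prior_mean = 1.0 / RATE                  # mean of an exponential is 1 / rate
uniform_density = 1.0 / UNIFORM_UPPER    # flat density of the default prior

print(f"exponential prior mean: {prior_mean:.1f}")
print(f"{'shape':>6} {'exp pdf':>10} {'uniform pdf':>12} {'exp cdf':>9}")
for shape in (0.1, 1.0, 5.0, 20.0, 50.0, 100.0, 200.0):
    print(f"{shape:>6.1f} {exponential_pdf(shape):>10.5f} "
          f"{uniform_density:>12.5f} {exponential_cdf(shape):>9.4f}")
```

Compared with the uniform prior, the exponential prior puts more mass on small values of the shape parameter while still allowing large ones.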

Use R to look at the shape of this distribution.

    MrBayes > mcmc

The convergence diagnostic we've chosen (the maximum standard deviation of split frequencies) monitors the topological similarity of trees sampled by our two independent analyses. A low standard deviation indicates that the data points (split frequencies) tend to be close to the mean, suggesting that the trees sampled by the independent chains are similar and presumably sampled from the same (stationary) distribution. By contrast, a high standard deviation indicates that the data points (split frequencies) tend to deviate greatly from the mean, suggesting that the trees sampled by the independent chains are quite different and presumably not sampled from the same (stationary) distribution.

Running the stepping-stone sampling

In the Bayesian framework, model selection is typically performed through the evaluation of a Bayes factor, the ratio of marginal likelihoods for the two models. Traditionally, the marginal likelihood has been estimated with the harmonic mean estimator.
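As a rough illustration of the harmonic mean estimator, the Python sketch below computes it on the log scale from a list of sampled log-likelihoods. This is a minimal sketch under our own assumptions (the function names are hypothetical, and burn-in is assumed to have been discarded already); it is not MrBayes's implementation.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list of floats."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def harmonic_mean_log_marginal(log_likelihoods):
    """Harmonic mean estimate of the log marginal likelihood.

    The estimator is 1 / mean(1 / L_i) over posterior samples; on the log
    scale this becomes log(N) - logsumexp(-lnL_i).
    """
    if not log_likelihoods:
        raise ValueError("need at least one sampled log-likelihood")
    n = len(log_likelihoods)
    return math.log(n) - log_sum_exp([-ll for ll in log_likelihoods])


# Hypothetical post-burn-in log-likelihood samples, for illustration only.
samples = [-6000.0, -6001.0, -6000.5, -6002.0]
print(f"harmonic mean ln marginal likelihood: {harmonic_mean_log_marginal(samples):.3f}")
```

Working with negated log-likelihoods inside a log-sum-exp avoids the underflow that would occur if the raw likelihoods were exponentiated directly.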

New techniques to estimate (log) marginal likelihoods, such as path sampling and stepping-stone sampling, offer increased accuracy over the traditional method. In this part of the exercise, we will estimate the marginal likelihood of the unpartitioned model using stepping-stone sampling.

Specify the parameters of the stepping-stone sampling analysis using the ssp command.

    MrBayes > ss

Once the stepping-stone sampling run has completed, the estimated stepping-stone marginal likelihood for the uniform partition is reported to the screen. Record the Mean marginal likelihood for the 2 runs.

Partitioning by gene region

Setting the model

The dataset we use in this exercise contains two distinct gene regions (atpB and rbcL), so we may wish to explore the possibility that the substitution process differs between these two gene regions. This requires that we first specify the data partitions corresponding to these two genes, then define an independent substitution model for each data partition.

First, use the charset command to define the subset of sites belonging to each of the gene regions (in our case sites 1–1,394 belong to the atpB gene and sites 1,395–2,659 are from rbcL).

    MrBayes > sump filename=conifer-uniform

Upon completion of the sump command, you will see a table listing the estimated marginal likelihoods of these analyses.

Record the marginal likelihood estimated by the harmonic mean for the uniform partition analysis. Review the table summarizing the MCMC samples of the various parameters.

This table also gives the 95% credible interval of each parameter. This statistic approximates the 95% highest posterior density (HPD) and is a measure of uncertainty while accounting for the data (MrBayes labels this value as 95% HPD). More specifically, the probability that the true value of the parameter lies within the credible interval is 0.95 given the model and the data.

Continue summarizing the MC3 runs for the moderately partitioned run and the highly partitioned run and record the harmonic mean estimate of the marginal likelihood for each.

    MrBayes > sump filename=conifer-partn
    MrBayes > sump filename=conifer-sat-partn

Now that we have estimates of the marginal likelihood under each of our different models, we can evaluate their relative plausibility using Bayes factors.

Use the table below (or one like it) to summarize the marginal log-likelihoods estimated using the harmonic mean and stepping-stone methods.

    Partition                          Harmonic mean    Stepping-stone
    ---------------------------------  ---------------  ---------------
    uniform (M1)
    by gene (M2)
    by gene and codon position (M3)

Phylogenetics software programs log-transform the likelihood to avoid underflow, because multiplying likelihoods results in numbers that are too small to be held in computer memory. Thus, we must use a different form of the Bayes factor equation to calculate the ln-Bayes factor (we will denote this value K):

$$K = \ln BF(M_0, M_1) = \ln P(X \mid M_0) - \ln P(X \mid M_1),$$

where $\ln P(X \mid M_0)$ is the marginal lnL estimate for model M0. The value resulting from this equation can be converted to a raw Bayes factor by simply taking the exponent of K:

$$BF(M_0, M_1) = e^{K}$$

Alternatively, you can interpret the strength of evidence in favor of M0 using K directly and skip the conversion. In this case, we evaluate K in favor of model M0 against model M1 so that:

    if K > 1, then model M0 wins
    if K < −1, then model M1 wins

    MrBayes > sumt filename=conifer-uniform
    MrBayes > sumt filename=conifer-partn
    MrBayes > sumt filename=conifer-sat-partn

The primary summary performed by sumt calculates the clade credibility values (i.e., bipartition posterior probabilities). These values are reported on an ASCII cladogram upon completion of the sumt command. The sumt command also writes the majority-rule consensus tree to a NEXUS tree file with the file-name extension .con.tre.
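The ln-Bayes factor comparison described in this exercise can be sketched in a few lines of Python. The marginal log-likelihood values below are made up for illustration and should be replaced with the numbers you recorded; the function names are ours, and the symmetric decision rule (K > 1 favors M0, K < −1 favors M1) is our reading of the thresholds.

```python
import math


def ln_bayes_factor(ln_marginal_m0: float, ln_marginal_m1: float) -> float:
    """K = ln P(X | M0) - ln P(X | M1)."""
    return ln_marginal_m0 - ln_marginal_m1


def interpret(k: float, threshold: float = 1.0) -> str:
    """Apply the decision rule: K > 1 favors M0, K < -1 favors M1."""
    if k > threshold:
        return "M0 preferred"
    if k < -threshold:
        return "M1 preferred"
    return "no clear preference"


def raw_bayes_factor(k: float) -> float:
    """Convert K to BF = e^K, returning infinity if e^K overflows a float."""
    try:
        return math.exp(k)
    except OverflowError:
        return math.inf


# Hypothetical marginal ln-likelihoods (NOT results from the conifer data).
ln_marginal = {"uniform": -6500.0, "by gene": -6480.0}

# Treat the partitioned model as M0 and the uniform model as M1.
k = ln_bayes_factor(ln_marginal["by gene"], ln_marginal["uniform"])
print(f"K = {k:.1f}, BF = {raw_bayes_factor(k):.3g}, verdict: {interpret(k)}")
```

Because differences between marginal log-likelihoods are often in the tens or hundreds, interpreting K directly is usually more practical than converting to the raw Bayes factor.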


The consensus trees in these files are also annotated with various branch- or node-specific parameters or statistics in an extended Newick format called NHX.

Use FigTree to visualize these summary trees.

We are done with our first exercise!

    MrBayes > quit

Exercise 2: Averaging Over the GTR Family of Models

Model selection using maximum-likelihood methods (e.g., the likelihood-ratio test, AIC, BIC, etc.) has been a standard practice in the field of molecular phylogenetics. However, such an approach ignores uncertainty in the choice of the model and can cause estimates to be biased. The Bayesian framework provides a more natural approach for accommodating model uncertainty by treating the models (like the parameters within each model) as random variables. Bayesian model averaging has been implemented for various phylogenetic problems using reversible-jump MCMC, where the chain integrates over the joint prior probability density of a given model in the usual manner, but also jumps between all possible candidate substitution models, visiting each model in proportion to its marginal probability.

Here, we will demonstrate how to use this approach using MrBayes.

Start the MrBayes application.

    MrBayes > showmodel

Running under the prior

For Bayesian analysis, it’s critical to examine the various priors specified and identify induced priors that may result from interactions between parameters. This procedure is done by generating samples of the various parameters and hyperparameters under the prior, without accounting for the data. This is also often called “running on empty”.

In MrBayes, running under the prior is specified in the mcmc/mcmcp command by the option data=no. When generating samples under the prior with MCMC, the only important concern is that you have a sufficient number of samples. Therefore, it is not necessary to run multiple chains or multiple independent runs.

Use the mcmcp command to specify the details of the Markov chain.

    MrBayes > mcmc filename=conifer-rjmcmc-prior

We will use Tracer for examining marginal prior and posterior densities for Bayesian phylogenetic analysis.


Evaluating the parameter samples under the prior can often help to identify misspecified priors or errors in your analysis set-up. When examining prior densities, we are only concerned with the values reported to the parameter file (.p). There is no manual for Tracer. However, some explanation of the different options is available online.

Open the file called conifer-rjmcmc-prior.p in Tracer. You may have to open a new terminal window and execute the tracer binary.

    tracer conifer-rjmcmc-prior.p

Look through each of the parameters, paying close attention to the shapes of the distributions in the Marginal Probability Distribution pane.

Inspect the marginal densities of the relative exchangeability rates. Under this prior, the rates are sampled from a mixture of distributions, thus these look unlike any obvious parametric density. For this exercise, we are primarily interested in the variables relevant to the mixed model over GTR submodels. These include gtrsubmodel and krevmat. The krevmat statistic indicates the number of unique rate values in the GTR matrix. The trace called gtrsubmodel gives the model in the GTR family sampled by MCMC.

We have verified that the marginal prior densities of the relevant parameters match the expected densities. In addition, sampling under the prior provides a straightforward way to assess whether the data are informative for the numerous parameters and hyperparameters in our model.

This is done by comparing the marginal prior densities to the marginal posterior densities (after running with data).

    MrBayes > sump filename=conifer-rjmcmc

The sump command will generate tables showing summary statistics of the model parameters and the different GTR submodels with posterior probability over 0.05. For these data, the analysis shows that only a few of the 203 GTR submodels have posterior probability over 0.05. Furthermore, no single model stands out as the “best” model with a significantly high probability.

We can get a more detailed view of these features of our data when we evaluate the marginal densities of the gtrsubmodel and krevmat in Tracer.
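The figure of 203 submodels is the number of ways to group the six exchangeability rates into sets that share a value. The Python sketch below (our own illustration, not MrBayes code) enumerates those groupings and tallies them by the number of distinct rates, which is the quantity krevmat reports. The closing prior calculation assumes that every submodel is equally probable a priori.

```python
from collections import Counter


def set_partitions(items):
    """Yield every way to split items into non-empty, unordered groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Either give `first` a group of its own...
        yield [[first]] + partition
        # ...or add it to each existing group in turn.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


RATES = ["AC", "AG", "AT", "CG", "CT", "GT"]  # the six exchangeability rates

submodels = list(set_partitions(RATES))
by_distinct_rates = Counter(len(groups) for groups in submodels)

print(f"number of GTR submodels: {len(submodels)}")  # 203
for k in sorted(by_distinct_rates):
    # Prior on krevmat if all submodels are equally likely (an assumption here).
    prior = by_distinct_rates[k] / len(submodels)
    print(f"krevmat = {k}: {by_distinct_rates[k]:3d} submodels, prior {prior:.3f}")
```

Here krevmat = 1 corresponds to a single shared rate (as in JC69 or F81) and krevmat = 6 to the full GTR model, so comparing these prior proportions with the posterior in Tracer shows how far the data move the chain toward simpler or richer models.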

In a new terminal window, open all of the .p files from this exercise in Tracer.

Exercise 3: Testing a Topological Hypothesis

For the third exercise, we will use Bayes factor comparisons to test a topological hypothesis, in this case an old one, that humans are more closely related to chimps than to other primates. This exercise is described starting on page 52 of the MrBayes manual, and we will just follow the procedure from the manual.

This is done, in part, as an opportunity for you to run other types of analyses described in the manual, while the rest of the class finishes their work.

Here is a quick summary of the exercise:

Start the MrBayes application.

    MrBayes > showmodel

Testing a topological hypothesis

As we have seen above, MrBayes provides two methods for estimating marginal model likelihoods. The first is based on the harmonic mean of the likelihood values of the MCMC samples. It is simple to compute but it is a pretty rough estimate of the model likelihood. To obtain a more accurate model likelihood, MrBayes provides the stepping-stone method.

First, let us use the harmonic mean estimates of the model likelihoods of the two models we want to compare. We enforce the positive constraint, run an mcmc analysis with 100,000 generations, and use sump to get the harmonic mean estimate.

    MrBayes > prset topologypr=constraints(nohumanchimp)
    MrBayes > mcmc ngen=100000
    MrBayes > sump

Write down the harmonic mean estimate.

It may also be interesting to look at the best estimate of the phylogeny under the assumption that humans and chimps are not each other's sister groups.

Do this by typing sumt. As you will see, this tree groups chimps and gorillas together, with humans being just outside, as one might have expected.

Let us now repeat the comparison using the more accurate stepping-stone sampling approach. Instead of using the mcmc command followed by the sump command, we simply use the ss command, which will produce the estimated model likelihood directly. The stepping-stone analysis moves from the posterior to the prior through a number of steps in which the sampled distribution is a mixture of varying proportions of the two.

We will use 50 steps (the default) with 5,000 generations each, for a total of 250,000 generations.

To monitor convergence twice during each step, we set the diagnostics frequency to once every 2,500 generations. Stepping-stone analysis under the two models using these settings will be generated by the following commands.