This is a graduate-level course in phylogenetics, emphasizing maximum likelihood and Bayesian approaches to estimating phylogenies, which are genealogies at or above the species level. A primary goal is to provide an accessible introduction to the theory so that, by the end of the course, students can understand much of the primary literature on modern phylogenetic methods and apply these methods intelligently to their own problems. The laboratory provides hands-on experience with several important phylogenetic software packages (PAUP*, IQ-TREE, RevBayes, BayesTraits, and others) and introduces students to the use of remote high-performance computing resources for phylogenetic analyses.

Semester: Spring 2022
Lecture: Tuesday/Thursday 11:00-12:15 (Paul O. Lewis)
Lab: Thursday 1:25-3:20 (Zach Muscavitch)
Room: Torrey Life Science (TLS) 181, Storrs Campus (the first two weeks are online)
Text: Lewis, P. O. 2022. Getting Rooted in Bayesian Phylogenetics (unfinished, but some chapters are ready)


Date Lecture topic Lab/homework
Tuesday Jan. 18 Introduction
The jargon of phylogenetics (edges, vertices, leaves, degree, split, polytomy, taxon, clade); types of genealogies; rooted vs. unrooted trees; newick descriptions; monophyletic, paraphyletic, and polyphyletic groups [slides]
Homework 1: Trees From Splits
Thursday Jan. 20 Optimality criteria, search strategies, consensus trees
Exhaustive enumeration, branch-and-bound search, algorithmic methods (star decomposition, stepwise addition, NJ), heuristic search strategies (NNI, SPR, TBR), evolutionary algorithms; consensus trees [slides]
Lab 1: Using the Xanadu cluster, Introduction to PAUP*, and NEXUS file format
Tuesday Jan. 25 The parsimony criterion
Strict, semi-strict, and majority-rule consensus trees; maximum agreement subtrees; Camin-Sokal, Wagner, Fitch, Dollo, and transversion parsimony; step matrices and generalized parsimony [slides] [study questions]
Homework 2: Parsimony
Thursday Jan. 27 Distance methods
Distance methods: least squares criterion, minimum evolution criterion, neighbor-joining [slides] [study questions]
Lab 2: Searching
Tuesday Feb. 1 Substitution models
Instantaneous rates, expected number of substitutions, equilibrium frequencies, JC69 model. Textbook: Ch. 2 (pp. 19-30); Ch. 3 (pp. 35-38). [slides] [study questions]
Homework 3: Least squares distances (working through the Python Primer first will make this homework much easier)
Thursday Feb. 3 Maximum likelihood criterion
JC distance formula; common substitution models: K2P, F81, F84, HKY85, and GTR; likelihood: the probability of data given a model, likelihood of a “tree” with just one vertex and no edges, why likelihoods are always on the log scale, likelihood ratio tests. [Transition Probability Applet] Textbook: Ch. 5 (pp. 57-75) [slides] [study questions]
Lab 3: Likelihood [slide]
Tuesday Feb. 8 Maximum likelihood (cont.)
Likelihood of a tree with 2 vertices connected by one edge, transition probabilities, maximum likelihood estimates (MLEs) of model parameters, likelihood of a tree. [slides] Textbook: Ch. 4: pp. 47-53
Homework 4: Site likelihoods
Thursday Feb. 10 Bootstrapping, rate heterogeneity
Non-parametric bootstrapping [slides]
Rate heterogeneity
Invariable sites model, discrete gamma model, site-specific rates (partitioned) models, mixture models. Textbook: Ch. 6: pp. 81-92. [slides]
Lab 4: IQ-TREE tutorial
Tuesday Feb. 15 Simulation
How to simulate nucleotide sequence data, and why it’s done [slides] Textbook: Ch. 6: pp. 93-96.
Homework 5: Rate heterogeneity (Python program to modify)
Thursday Feb. 17 Long branch attraction, topology tests
Statistical consistency, long branch attraction, testing the molecular clock, nonparametric bootstrap topology tests (KH/SH/AU), and parametric bootstrapping tests (SOWH). [LBA slides] [Topology test slides]
Lab 5: Simulating sequences
Tuesday Feb. 22 Codon, secondary structure, and amino acid models
Nonsynonymous vs. synonymous rates, codon models, RNA stem/loop structure, compensatory substitutions, stem models, empirical amino acid rate matrices (PAM, JTT, WAG, LE) [slides] [Diagonalization applet]
Homework 6: Simulation
Thursday Feb. 24 Bayes’ Rule
Joint, conditional, and marginal probabilities, and how they interact to create Bayes’ Rule; probability vs. probability density. [slides] Textbook: Ch. 7 (Bayes’ Rule; pp. 101-116)
Lab 6: Using R to explore probability distributions and plot trees
Tuesday Mar. 1 Bayesian statistics, MCMC
Metropolis-Hastings algorithm; mixing, burn-in, trace plots, heated chains, topology proposals, updating parameters during MCMC. [slides] [MCMC robot applet]
Homework 7: MCMC
Thursday Mar. 3 Prior distributions used in phylogenetics
Discrete Uniform (topology), Gamma or Lognormal (kappa, omega), Beta (pinvar), Dirichlet (base frequencies, GTR exchangeabilities); Tree length prior. [Dirichlet applet] [Density rain applet] [slides] Textbook: Ch. 8 (MCMC; pp. 121-146)
Lab 7: To concatenate or not to concatenate, that is the question
Tuesday Mar. 8 Prior distributions (cont.) and CIs
Running on empty, prior fences, induced priors, hierarchical models, empirical Bayes; Frequentist confidence intervals vs. Bayesian credible intervals [slides] [CI applet]
Homework 8: Larget-Simon Local Move
Thursday Mar. 10 Dirichlet Process Prior
Bayesian non-parametric clustering: examples include BUCKy (genes clustered by topology); PhyloBayes (amino-acid sites clustered by frequency spectra) [Stick-breaking applet] [DPP applet] [slides]
Lab 8: MrBayes
Tuesday Mar. 15 SPRING BREAK  
Thursday Mar. 17 SPRING BREAK  
Tuesday Mar. 22 Bayes factors and Bayesian model selection
Bayes factors, steppingstone estimation of marginal likelihood, BIC vs. AIC [slides]
Homework 9: Dirichlet Process Priors
Thursday Mar. 24 Discrete morphological models
Introduction to discrete morphological models; Mk model; conditioning on variability. [slides]
Lab 9: Introduction to RevBayes
Tuesday Mar. 29 Polytomies; Pagel’s test
Polytomies and the star tree paradox; reversible-jump MCMC; Pagel’s (1994) test for correlated evolution. [polytomy slides] [Pagel slides]
Homework 10: Mk model and conditioning on variability
Thursday Mar. 31 Stochastic character mapping
An alternative to Pagel’s (1994) test for assessing whether correlation among characters goes beyond what is expected from inheritance alone. [slides] [additional slides]
Lab 10: RevBayes (discrete morphology analyses)
Tuesday Apr. 5 Evolutionary Correlation: Continuous Traits
Independent Contrasts [slides] [Brownian Motion applet] and Phylogenetic Generalized Least Squares (PGLS). [slides]
Homework 11: Maddison and Fitzjohn 2015
Thursday Apr. 7 PGLS (cont.)
Estimating ancestral states in PGLS. Ornstein-Uhlenbeck model vs. Brownian motion. [slides] [OU applet]
Lab 11: BayesTraits
Tuesday Apr. 12 Phylogenetic signal in continuous traits
Measuring the amount of phylogenetic information in continuous traits (Pagel’s lambda, Blomberg’s K). [slides] [Pagel transformation applet]
Introduction to the coalescent
Introduction to coalescent theory [slides]
Homework 12: Brownian motion model
Thursday Apr. 14 Multispecies coalescent and species tree estimation
Using the multispecies coalescent to estimate species trees when gene trees may conflict because of deep coalescence (incomplete lineage sorting); the anomaly zone. [slides]
Lab 12: Continuous trait analyses in R
Tuesday Apr. 19 Fast species tree methods
The SVDQuartets and ASTRAL species tree methods. [slides]
Divergence time estimation
Strict vs. relaxed clocks, correlated vs. uncorrelated relaxed clocks, calibrating the clock using fossils. [slides]
Homework 13: Heterotachy
Thursday Apr. 21 Diversification rate evolution
State-dependent diversification models (BiSSE and its descendants) [relaxed clocks part 2 slides] [diversification slides]
Lab 13: Divergence time estimation
Tuesday Apr. 26 Diversification (cont.), Heterotachy, and Covarion models
BAMM: estimating the number of shifts in diversification regime and where these occur on the tree; what is heterotachy and how can it be accommodated; the covarion hypothesis and model [slides]
no homework assignment
Thursday Apr. 28 Species delimitation and information
Bayesian species delimitation (BPP), Bayesian information content estimation [slides]
Lab 14: BAMM

Index to major topics

Index for 2022

Literature cited

Literature cited in 2022


Grading info

Books (and book chapters) on phylogenetics

This is a list of books that you should know about, but none are required texts for this course. Listed in reverse chronological order.

Harmon, L. 2019. Phylogenetic comparative methods. (Version 1.4, released 15 March 2019). Published online by the author.

Yang, Z. 2014. Molecular evolution: a statistical approach. Oxford University Press.

Garamszegi, L. Z. 2014. Modern phylogenetic comparative methods and their application in evolutionary biology: concepts and practice. Springer-Verlag, Berlin. (Well-written chapters by current leaders in phylogenetic comparative methods.)

Baum, D. A., and S. D. Smith. 2013. Tree thinking: an introduction to phylogenetic biology. Roberts and Company Publishers, Greenwood Village, Colorado. (This book is probably the most useful companion volume for this course, introducing the methods in a very accessible way but also providing lots of practice interpreting phylogenies correctly.)

Hall, B. G. 2011. Phylogenetic trees made easy: a how-to manual (4th edition). Sinauer Associates, Sunderland. (A guide to running some of the most important phylogenetic software packages.)

Lemey, P., Salemi, M., and Vandamme, A.-M. 2009. The phylogenetic handbook: a practical approach to phylogenetic analysis and hypothesis testing (2nd edition). Cambridge University Press, Cambridge, UK. (Chapters on theory are paired with practical chapters on software related to the theory.)

Felsenstein, J. 2004. Inferring phylogenies. Sinauer Associates, Sunderland. (Comprehensive overview of both history and methods of phylogenetics.)

Page, R., and Holmes, E. 1998. Molecular evolution: a phylogenetic approach. Blackwell Science. (Very nice and accessible pre-Bayesian-era introduction to the field.)

Hillis, D., Moritz, C., and Mable, B. 1996. Molecular systematics (2nd ed.). Sinauer Associates, Sunderland. Chapter 12: Applications of molecular systematics. (Still a very valuable compendium of pre-Bayesian-era phylogenetic methods.)

Swofford, D. L., G. J. Olsen, P. J. Waddell, and D. M. Hillis. 1996. Chapter 11: Phylogenetic inference. Pages 407-514 in Molecular Systematics (D. M. Hillis, C. Moritz, and B. K. Mable, eds.). Sinauer Associates, Sunderland, Massachusetts. (SOWH topology test)