
In this homework assignment you will use what you know about Dirichlet process priors to compute the probability of each possible clustering of 3 genes.

Assume that there are 3 genes (A, B, and C), and that a Dirichlet process prior governs the clustering of genes according to preferred tree topology.

One possible configuration “seats” all 3 genes at the same “table” (using the restaurant metaphor). In this configuration, all 3 genes must share the same topology.

The configuration at the opposite extreme seats each of the 3 genes at its own table, and each table is assigned its own tree topology. Note that the same “dish” (tree topology) can be served at different tables, so it is possible that all 3 genes share one topology despite sitting at different tables.

Assuming \(\alpha = 3\), answer the following two questions (and show your work so I can figure out what happened if your answer differs from mine):

Question 1

Calculate the prior probabilities of all possible configurations of the 3 genes under the Dirichlet process model. Note that slide 4 in the April 9 lecture notes provides everything you need to answer this question.

Configuration    Probability
ABC              
BC|A             
AC|B             
AB|C             
A|B|C            
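The lecture slides are not reproduced here, so the following is only a sketch that assumes slide 4 gives the standard Chinese restaurant process result: a specific seating of \(n\) genes into \(K\) tables of sizes \(n_1, \ldots, n_K\) has prior probability \(\alpha^K \prod_k (n_k - 1)! \,/\, \prod_{i=0}^{n-1} (\alpha + i)\). The function name `crp_partition_prob` is hypothetical, and the example deliberately uses \(\alpha = 1\) rather than the value assigned above, so it can serve as a check on your method without giving away the answer.

```python
from fractions import Fraction
from math import factorial


def crp_partition_prob(table_sizes, alpha):
    """Prior probability of one specific seating under a Chinese
    restaurant process with concentration parameter alpha.

    Assumes the standard formula (not taken from the lecture slides):
        alpha^K * prod_k (n_k - 1)!  /  prod_{i=0}^{n-1} (alpha + i)
    """
    n = sum(table_sizes)
    alpha = Fraction(alpha)
    # Numerator: one factor of alpha per table, (n_k - 1)! per table.
    numer = alpha ** len(table_sizes)
    for size in table_sizes:
        numer *= factorial(size - 1)
    # Denominator: alpha * (alpha + 1) * ... * (alpha + n - 1).
    denom = Fraction(1)
    for i in range(n):
        denom *= alpha + i
    return numer / denom


# Table sizes for each of the five configurations of genes A, B, C.
configs = {
    "ABC": [3],
    "BC|A": [2, 1],
    "AC|B": [2, 1],
    "AB|C": [2, 1],
    "A|B|C": [1, 1, 1],
}

# Illustrative value only (alpha = 1), not the value used in the assignment.
probs = {name: crp_partition_prob(sizes, 1) for name, sizes in configs.items()}
for name, p in probs.items():
    print(name, p)
```

Whatever value of \(\alpha\) you use, the five probabilities must sum to 1, which is a useful sanity check on hand calculations.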

Question 2

What is the expected number of groups? (See slide 8 from the April 9 lecture notes.)
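As a hedged sketch, and assuming slide 8 states the usual Dirichlet process result \(E[K] = \sum_{i=0}^{n-1} \alpha / (\alpha + i)\), the expected number of groups can be computed as below. The function name `expected_num_groups` is hypothetical, and the example again uses \(\alpha = 1\) instead of the assigned value.

```python
from fractions import Fraction


def expected_num_groups(n, alpha):
    """Expected number of occupied tables after seating n customers.

    Assumes the standard result (not taken from the lecture slides):
    customer i + 1 opens a new table with probability alpha / (alpha + i),
    so the expectation is the sum of those probabilities.
    """
    alpha = Fraction(alpha)
    return sum(alpha / (alpha + i) for i in range(n))


# Illustrative value only (alpha = 1), not the value used in the assignment.
print(expected_num_groups(3, 1))  # 11/6
```

The same number can be obtained by weighting the number of groups in each configuration by its prior probability, so your answers to the two questions should agree with each other.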