Abstract

Estimating the Tree of Life will likely involve a two-step procedure: first, trees are estimated for many genes, and then the gene trees are combined into a tree on all the taxa. However, the true gene trees may not agree with the species tree, due to biological processes such as deep coalescence, gene duplication and loss, and horizontal gene transfer. While methods have been developed to estimate species trees in the presence of incomplete lineage sorting, the accuracy of these methods relative to the usual "concatenation" approach is debated.

In this talk, I will present results showing that coalescent-based estimation methods are sensitive to gene tree estimation error, and can therefore be less accurate than concatenation in many cases. I will also present new methods for estimating species trees in the presence of gene tree conflict due to incomplete lineage sorting; these methods are more accurate than current methods, and the key to them is addressing gene tree estimation error more effectively. Finally, I will present results from using these techniques to estimate species trees for several biological datasets, including the Avian Tree of Life.