Tuesday, 7 September 2021

Missing the forest for the trees?

 

Phylogenetic Conflict Is Common and the “Hierarchy” Is Far from “Perfect”

Casey Luskin

As I discussed earlier, we were recently asked to comment on a video at FORA.tv. In the video, Richard Dawkins argues that the best way to refute “creationists” is to show them that genetic data forms “a perfect hierarchy — a perfect family tree.” Let’s review again just how strongly he emphasizes this point in the video:

Compare the genes of any pair of animals you like — pair of animals, pair of plants — and then plot out the resemblances and they fall on a perfect hierarchy — a perfect family tree.

It’s simply false for Dawkins to claim that when you compare genes of different animals, they “fall on a perfect hierarchy — a perfect family tree.” The scientific literature is replete with conflicts among evolutionary trees, where phylogenetic analyses of different genes in the same group of plants, animals, or other organisms generate conflicting family trees. It also abounds with examples where analyzing gene similarities in a group of organisms generates a tree that conflicts with the tree generated by analyzing similar anatomical characters in the same group of plants or animals in the fossil record. One paper in the journal Genome Research put it plainly: “different proteins generate different phylogenetic tree[s].”
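To make concrete what it means for two gene trees to "conflict," here is a minimal sketch in Python. It is only an illustration under stated assumptions: the taxa A through D and the three gene trees are hypothetical toy data, the function names are my own, and the measure used (counting the groupings found in one rooted tree but not the other, essentially a Robinson-Foulds-style distance) is one common way to quantify disagreement, not necessarily the method used in any paper cited here.

```python
def clades(tree):
    """Return (leaves of this subtree, set of all groupings at or below it).

    A tree is a nested tuple of taxon names, e.g. (("A", "B"), "C").
    Each grouping (clade) is stored as a frozenset of taxon names.
    """
    if isinstance(tree, str):
        # A single taxon: no internal grouping to record.
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_found = clades(child)
        leaves |= child_leaves
        found |= child_found
    found.add(leaves)  # the grouping defined by this internal node
    return leaves, found


def conflict(tree_a, tree_b):
    """Count groupings present in exactly one of the two rooted trees."""
    leaves_a, clades_a = clades(tree_a)
    leaves_b, clades_b = clades(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must contain the same taxa")
    return len(clades_a ^ clades_b)


# Hypothetical gene trees for four made-up taxa (not real data).
gene_1 = ((("A", "B"), "C"), "D")  # gene 1 groups A with B
gene_2 = ((("A", "C"), "B"), "D")  # gene 2 groups A with C instead
gene_3 = ((("A", "B"), "C"), "D")  # gene 3 agrees with gene 1

print(conflict(gene_1, gene_2))  # 2: the groupings {A, B} and {A, C} disagree
print(conflict(gene_1, gene_3))  # 0: identical branching pattern
```

A score of zero means the two genes tell the same branching story; any score above zero means at least one grouping supported by one gene is contradicted by the other. Real analyses involve hundreds or thousands of genes and statistical uncertainty in each tree, which this toy example ignores.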

Meyer Reviews the Literature

Stephen Meyer documents these sorts of papers and much more in Chapter 6 of Darwin’s Doubt, where he writes:

Just as the molecular data do not point unequivocally to a single date for the last common ancestor of all the Cambrian animals (the point of deep divergence), they do not point unequivocally to a single coherent tree depicting the evolution of animals in the Precambrian. Numerous papers have noted the prevalence of contradictory trees based on evidence from molecular genetics. A 2009 paper in Trends in Ecology and Evolution notes that “evolutionary trees from different genes often have conflicting branching patterns.” Likewise, a 2012 paper in Biological Reviews acknowledges that “phylogenetic conflict is common, and frequently the norm rather than the exception.” Echoing these views, a January 2009 cover story and review article in New Scientist observed that today the tree-of-life project “lies in tatters, torn to pieces by an onslaught of negative evidence.” As the article explains, “Many biologists now argue that the tree concept is obsolete and needs to be discarded,” because the evidence suggests that “the evolution of animals and plants isn’t exactly tree-like.”

The New Scientist article cited a study by Michael Syvanen, a biologist at the University of California at Davis, who studied the relationships among several phyla that first arose in the Cambrian. Syvanen’s study compared two thousand genes in six animals spanning phyla as diverse as chordates, echinoderms, arthropods, and nematodes. His analysis yielded no consistent tree-like pattern. As the New Scientist reported, “In theory, he should have been able to use the gene sequences to construct an evolutionary tree showing the relationships between the six animals. He failed. The problem was that different genes told contradictory evolutionary stories.” Syvanen himself summarized the results in the bluntest of terms: “We’ve just annihilated the tree of life. It’s not a tree anymore, it’s a different topology [pattern of history] entirely. What would Darwin have made of that?” 

Other studies trying to clarify the evolutionary history and phylogenetic relationships of the animal phyla have encountered similar difficulties. Vanderbilt University molecular systematist Antonis Rokas is a leader among biologists using molecular data to study animal phylogenetic relationships. Nevertheless, he concedes that a century and a half after The Origin of Species, “a complete and accurate tree of life remains an elusive goal.” In 2005, during the course of an authoritative study he eventually co-published in Science, Rokas was confronted with this stark reality. The study had sought to determine the evolutionary history of the animal phyla by analyzing fifty genes across seventeen taxa. He hoped that a single dominant phylogenetic tree would emerge. Rokas and his team reported that “a 50-gene data matrix does not resolve relationships among most metazoan phyla” because it generated numerous conflicting phylogenies and historical signals. Their conclusion was candid: “Despite the amount of data and breadth of taxa analyzed, relationships among most metazoan phyla remained unresolved.”

In a paper published the following year, Rokas and University of Wisconsin at Madison biologist Sean B. Carroll went so far as to assert that “certain critical parts of the TOL [tree of life] may be difficult to resolve, regardless of the quantity of conventional data available.” This problem applies specifically to the relationships of the animal phyla, where “[m]any recent studies have reported support for many alternative conflicting phylogenies.” Investigators studying the animal tree found that “a large fraction of single genes produce phylogenies of poor quality” such that in one case, a study “omitted 35% of single genes from their data matrix, because those genes produced phylogenies at odds with conventional wisdom.” Rokas and Carroll tried to explain the many contradictory trees by proposing that the animal phyla might have evolved too quickly for the genes to record some signal of phylogenetic relationships into the respective genomes. In their view, if the evolutionary process responsible for anatomical novelty works quickly enough, there would not be sufficient time for differences to accumulate in key molecular markers, in particular those used to infer evolutionary relationships in different animal phyla. Then, given enough time, whatever signal did exist might become lost. Thus, when groups of organisms branch rapidly and then evolve separately for long periods of time, this “can overwhelm the true historical signal” — leading to the inability to determine evolutionary relationships. 

DARWIN’S DOUBT, PP. 120-121

“Phylogenomic Conflict”

Such conflicts in the grand tree of life have continued to mount. Recently I participated in a journal club discussion of a 2021 paper in Proceedings of the National Academy of Sciences (PNAS) titled “Phylogenomic conflict coincides with rapid morphological innovation.” The paper explores the abrupt appearance of new types of organisms — or as they put it, “episodes of rapid phenotypic innovation that underlie the emergence of major lineages.” The paper observes that rapid appearance of new types of organisms often coincides with “conflicts” among trees based upon different types of genes:

One insight gleaned from phylogenomics is that gene-tree conflict, frequently caused by population-level processes, is often rampant during the origin of major lineages. … Regions of high conflict often coincide with the emergence of major clades, such as mammals, angiosperms, and metazoa. … We demonstrate that instances of high gene-tree conflict (discordance in phylogenetic signal across genes) in mammals, birds, and several major plant clades correspond to rate increases in morphological innovation.

Conflicts that Should Not Exist

It is precisely these types of “conflicts” among gene-based trees that Dawkins says aren’t supposed to exist in the “perfect” (Dawkins’s word) tree of life. The PNAS paper goes on to make the intriguing observation that researchers often ignore conflicts between these trees as noise in the data, when in reality these conflicts may be telling us something very important about life’s history:

Most large-scale phylogenetic and phylogenomic studies meant to resolve species relationships have treated gene-tree discordance as an analytical nuisance to be filtered or accommodated. However, since phylogenomic conflict often represents the imprint of past population genetic processes on the genome, studying its correlation with other macroevolutionary patterns may shed light on the microevolutionary processes underlying major transitions across the Tree of Life.

Researchers have long recognized that rates of morphological evolution vary across the Tree of Life, with pronounced bursts in morphological change interspersed with periods of relative stasis. … Phylogenomic conflict often appears to coincide with important episodes of morphological differentiation among major lineages. For example, the major differences in life history and body plan that distinguish mammalian orders emerged rapidly among ancestral taxa following the Cretaceous-Paleogene (K-Pg) mass extinction. … The early avian radiation has proved similarly challenging and is notable for the rapid establishment of phenotypically and ecologically disparate lineages. Several large-scale studies using massive genomic datasets have revealed extensive conflict among phylogenetic branches coinciding with the early radiation of crown Aves. Since the origin of land plants, there have been numerous major phases of morphological and ecological innovation, ranging from the initial appearance of vascular plant body plans in the late Silurian to distinct phases of angiosperm radiation from the Cretaceous to the present. As with the vertebrate lineages, the origins of many major plant clades show elevated levels of phylogenomic conflict… 

Eliminate Unreliable Data?

Similarly, a 2016 paper in Molecular Phylogenetics and Evolution warns that many evolutionary systematists ignore phylogenetic conflicts, seeing them as a “nuisance” rather than as a signal that is telling us something important about biological origins:

Biologists should therefore be more aware that “phylogenetic incongruence [is] a signal, rather than a problem” (Nakhleh, 2013) and treat it accordingly. In the case of the tree shrew, and many other lineages in vertebrate phylogenetics, different algorithms may yield different trees because of the mosaic nature of the data (Kumar et al., 2013) and the inability of a bifurcating tree to explain the patterns.

In a presentation, my friend and colleague Paul Nelson recently quoted a chapter from a textbook, Molecular Systematics, which contains a section headed “Eliminate Unreliable Data.” The authors justify the practice by reassuring themselves that “It is unrealistic to think that subjectivity in a molecular systematic study can be entirely avoided.” That seems like a most unreassuring way to reassure oneself. The chapter goes on to say that “the benefits of excluding clearly unreliable regions” (i.e., genes which yield trees that don’t fit the standard hierarchy), “however subjectively determined—outweigh the dangers.” To summarize, what you’re seeing here are admissions that conflicts among evolutionary trees are common (admissions that themselves are actually quite common), coupled with rarer admissions that data is sometimes eliminated or ignored simply because it doesn’t fit the standard tree.

My Vantage as a Skeptic

Returning to the 2021 PNAS paper, it finds a correlation between periods of rapid innovation and the degree of conflict in gene-based trees. From my vantage point, as a skeptic of universal common ancestry, conflicts between genes that seem to have appeared during periods of morphological innovation indicate that common ancestry is not what generates new types of organisms. The paper, of course, does not question common ancestry. Instead it invokes various ad hoc explanations for the conflicts, attributing them to population processes such as “changes in population size, rapid speciation, and incomplete lineage sorting.”

Regardless of whether universal common ancestry is right or wrong, the point here is that conflict in phylogenetic trees is very common, and the genetic data is far from producing a “perfect hierarchy,” as Dawkins put it. In fact, phylogenetic conflict seems to be greatest precisely in genes associated with the appearance of new types of organisms in the history of life. Dawkins got this point wrong, and he got it wrong precisely because this sort of conflicting phylogenetic data is not what a standard neo-Darwinian model would lead one to expect!
