Do Shared ERVs Support Common Ancestry?
Jonathan M. May 26, 2011 10:52 AM
In my previous article, I discussed the background of one of the most commonly made arguments for primate common ancestry. In this article, I want to examine the first of the three layers of evidence offered by a popular-level article written about this subject.
The author of the article under discussion tells us,
When we examine the collective genome of Homo sapiens, we find that a portion of it consists of ERVs (IHGS Consortium, 2001). We also find that humans share most of them with Chimpanzees, as well as the other members of Hominidae (great apes), the members of Hylobatidae (gibbons), and even the members of Cercopitheciodae (old world monkeys) (Kurdyukov et al., 2001; Lebedev et al., 2000; Medstrand and Mager, 1998; Anderssen et al., 1997; Steinhuber et al., 1995). Since humans don't and/or can't regularly procreate and have fertile offspring with members of these species, and thus don't make sizable contributions to their gene pools, and vice versa, their inheritance cannot have resulted from unions of modern species. As previously mentioned, parallel integration is ruled out by the highly random target selection of integrase. And even if it was far more target-specific than observed, it would require so many simultaneous insertion and endogenizations that the evolutionary model would still be tremendously more parsimonious. This leaves only one way an ERV could have been inherited: via sexual reproduction of organisms of a species that later diverged into the one the organisms that share the ERV belong to, i.e. an ancestral species--simply put, humans and the other primates must share common ancestry.
Just how target-specific are these ERV integrations? In the portion of the article headed "common creationist responses," we are told that,
...while proviral insertion is not purely random, it is also not locus specific; due to the way it directly attacks the 5' and 3' phosphodiester bonds, with no need to ligate (Skinner et al., 2001). So relative to pure randomness, insertion is non-random, but relative to locus specificity, insertion is highly random.
Really?
Let's take a few moments to do what any good student of biology would do -- and briefly survey some of the literature.
In one relevant study, Barbulescu et al. (2001) report that,
We identified a human endogenous retrovirus K (HERV-K) provirus that is present at the orthologous position in the gorilla and chimpanzee genomes, but not in the human genome. Humans contain an intact preintegration site at this locus. [emphasis added]
It seems that the most plausible explanation for this is independent insertion in the gorilla and chimpanzee lineages. Notice that the intact preintegration site at the pertinent locus in humans precludes the possibility of the HERV-K provirus having been inserted into the genome of the common ancestor of humans, chimpanzees and gorillas, and subsequently lost from the human genome by processes of genetic recombination. Though there are other candidate hypotheses for this observation (such as incomplete lineage sorting), in the context of other indications of locus-specific site preference, these data at least suggest that the inserts may in fact be independent events.
But there's more.
Another study, by Sverdlov (1998), reports,
But although this concept of retrovirus selectivity is currently prevailing, practically all genomic regions were reported to be used as primary integration targets, however, with different preferences. There were identified 'hot spots' containing integration sites used up to 280 times more frequently than predicted mathematically. [emphasis added]
In addition, Yohn et al. (2005) report that,
Horizontal transmissions between species have been proposed, but little evidence exists for such events in the human/great ape lineage of evolution. Based on analysis of finished BAC chimpanzee genome sequence, we characterize a retroviral element (Pan troglodytes endogenous retrovirus 1 [PTERV1]) that has become integrated in the germline of African great ape and Old World monkey species but is absent from humans and Asian ape genomes.
I could continue in a similar vein for some time. Other classes of retroelement also show fairly specific target-site preferences. For example, Levy et al. (2009) report that Alu retroelements preferentially insert into certain classes of already-present transposable elements, and do so with a specific orientation and at specific locations within the mobile element sequence. Moreover, a study published in Science by Li et al. (2009) found that, in the water flea genome, introns routinely insert into the same loci, leading the internationally acclaimed evolutionary biologist Michael Lynch to note,
Remarkably, we have found many cases of parallel intron gains at essentially the same sites in independent genotypes. This strongly argues against the common assumption that when two species share introns at the same site, it is always due to inheritance from a common ancestor.
Finally, Daniels and Deininger (1985) suggest that,
...a common mechanism exists for the insertion of many repetitive DNA families into new genomic sites. A modified mechanism for site-specific integration of primate repetitive DNA sequences is provided which requires insertion into dA-rich sequences in the genome. This model is consistent with the observed relationship between galago Type II subfamilies suggesting that they have arisen not by mere mutation but by independent integration events.
Such target-site preferences are also documented here, here, and here.
Why might these ERV site preferences exist? Presumably because these sites are most conducive to their successful reproduction (e.g., the need for expression of the ERV's regulatory elements, or the activity of the host's DNA repair system). Mitchell et al. (2004) suggest "that virus-specific binding of integration complexes to chromatin features likely guides site selection."
Out of tens of thousands of ERV elements in the human genome, roughly how many are known to occupy the same sites in humans and chimpanzees? According to this Talk-Origins article, at least seven. Let's call it fewer than a dozen. Given the sheer number of these retroviruses in our genome (literally tens of thousands), and accounting for the evidence of integration preferences and site biases that I have documented above, what are the odds of finding a handful of ERV elements that have independently inserted themselves into the same locus?
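The question of odds can at least be framed numerically. What follows is only a rough sketch, not a model taken from any of the papers cited: it assumes two lineages each acquire insertions independently from the same site-preference distribution, in which some fraction of candidate sites are "hot spots" that are used more often than the rest. The function name and every parameter are hypothetical. The only figure drawn from the literature quoted above is Sverdlov's upper bound of 280-fold; the hot-spot fraction, the number of candidate sites, and the insertion counts are all placeholders the reader must supply. The sketch counts coincident insertion pairs only and says nothing about whether an insertion reaches the germline or becomes fixed.

```python
def expected_shared_sites(n_a, n_b, genome_sites, hot_fraction=0.0, hot_weight=1.0):
    """Expected number of insertion pairs (one from lineage A, one from
    lineage B) that land on the same site, assuming independent draws
    from one shared two-tier preference distribution.

    All arguments are illustrative assumptions:
      n_a, n_b      -- number of independent insertions in each lineage
      genome_sites  -- number of candidate integration sites
      hot_fraction  -- fraction of those sites treated as hot spots
      hot_weight    -- how many times more often a hot spot is used
                       than an ordinary site (e.g. up to 280)
    """
    hot_sites = genome_sites * hot_fraction
    cold_sites = genome_sites - hot_sites

    # Normalise the weights so the per-site probabilities sum to one.
    total_weight = hot_sites * hot_weight + cold_sites
    p_hot = hot_weight / total_weight
    p_cold = 1.0 / total_weight

    # Probability that two independent insertions pick the same site
    # is the sum over all sites of (per-site probability) squared.
    collision = hot_sites * p_hot ** 2 + cold_sites * p_cold ** 2

    # Linearity of expectation over all n_a * n_b pairs.
    return n_a * n_b * collision
```

With no hot spots the result reduces to `n_a * n_b / genome_sites`; any departure from uniform preference raises it. How far it rises depends entirely on the hot-spot fraction and site count chosen, neither of which is supplied by the sources quoted here.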
A Nested Hierarchy?
What about this "nested hierarchy" of which we are told?
We are told that "There is only one, solitary known deviation of the distributional nested hierarchy; a relatively recently endogenized/fixed ERV called HERV-K-GC1."
This claim, however, is false.
In addition to the case mentioned, Yohn et al. (2005) report:
We performed two analyses to determine whether these 12 shared map intervals might indeed be orthologous. First, we examined the distribution of shared sites between species (Table S3). We found that the distribution is inconsistent with the generally accepted phylogeny of catarrhine primates. This is particularly relevant for the human/great ape lineage. For example, only one interval is shared by gorilla and chimpanzee; however, two intervals are shared by gorilla and baboon; while three intervals are apparently shared by macaque and chimpanzee. Our Southern analysis shows that human and orangutan completely lack PTERV1 sequence (see Figure 2A). If these sites were truly orthologous and, thus, ancestral in the human/ape ancestor, it would require that at least six of these sites were deleted in the human lineage. Moreover, the same exact six sites would also have had to have been deleted in the orangutan lineage if the generally accepted phylogeny is correct. Such a series of independent deletion events at the same precise locations in the genome is unlikely (Figure S3).
[...]
Several lines of evidence indicate that chimpanzee and gorilla PTERV1 copies arose from an exogenous source. First, there is virtually no overlap (less than 4%) between the location of insertions among chimpanzee, gorilla, macaque, and baboon, making it unlikely that endogenous copies existed in a common ancestor and then became subsequently deleted in the human lineage and orangutan lineage. Second, the PTERV1 phylogenetic tree is inconsistent with the generally accepted species tree for primates, suggesting a horizontal transmission as opposed to a vertical transmission from a common ape ancestor. An alternative explanation may be that the primate phylogeny is grossly incorrect, as has been proposed by a minority of anthropologists.
As irritating to the evolutionary model as it might be, there are, in fact, a significant number of deviations from the orthodox phylogeny.
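To make concrete what a "deviation" means here, the following is a minimal sketch of the check implied by the Yohn et al. passage: under a single ancestral insertion with no later loss, the species carrying an element at a given site should form a complete clade of the species tree. The tree below is my own hard-coded rendering of the generally accepted catarrhine topology (gibbons omitted for brevity), and the function names are hypothetical. The check deliberately ignores deletion, incomplete lineage sorting, and horizontal transfer, so a failed check shows only that one of those explanations is needed, not which one.

```python
# Assumed topology (generally accepted phylogeny), as nested tuples.
SPECIES_TREE = (
    ((("human", "chimpanzee"), "gorilla"), "orangutan"),
    ("macaque", "baboon"),
)


def leaves(node):
    """Return the set of species names under a tree node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def clades(node):
    """Return the leaf set of every subtree, including single species."""
    result = {leaves(node)}
    if not isinstance(node, str):
        for child in node:
            result |= clades(child)
    return result


def fits_single_insertion(carriers, tree=SPECIES_TREE):
    """True if the carrier species form exactly one clade of the tree,
    i.e. one ancestral insertion with no loss would explain the pattern."""
    return frozenset(carriers) in clades(tree)


# Sharing patterns mentioned in the Yohn et al. quotation above.
for pattern in (("gorilla", "chimpanzee"),
                ("gorilla", "baboon"),
                ("macaque", "chimpanzee")):
    print(pattern, fits_single_insertion(pattern))
```

None of the three quoted patterns forms a clade in this tree, which is the sense in which the authors call the distribution inconsistent with the accepted phylogeny. By contrast, a pattern such as human plus chimpanzee, or macaque plus baboon, would pass.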
In the final part of this blog series, I will discuss the argument based on "shared mistakes" in these ERV elements, as well as the argument based on degrees of mutational divergence between the retroviral 5' and 3' long terminal repeats (LTRs).