Defending Douglas Axe on the Rarity of Protein Folds
In 2000 and 2004, writing in the Journal of Molecular Biology, current Discovery Institute Senior Fellow Douglas Axe published seminal papers on the rarity of protein folds. Axe studied the beta-lactamase enzyme in E. coli and found that the likelihood of a chance sequence of 153 amino acids generating the stable, functional fold needed for the larger domain in that enzyme was as low as 1 in 10^77. Axe conducted this research while a post-doc at the Centre for Protein Engineering (CPE) in Cambridge. In his book Undeniable: How Biology Confirms Our Intuition That Life Is Designed, Axe explains that there are serious consequences for his research.
His critics have not failed to notice that — including theologian Rope Kojonen, whom we’ll return to shortly. In a series here, we have been responding to Kojonen’s conception of design in nature. The following examination and defense of Axe serves as a direct, empirical test of Kojonen’s design hypothesis.
A Confrontation with Alan Fersht
In a suspenseful passage from his book, Dr. Axe describes what happened to him when his post-doc advisor, Alan Fersht, confronted him about his affinities for intelligent design:
I was the first person in the lab one morning in February of 2002. Alan usually made his rounds through the labs later in the day when work was in full swing, but on this morning he dropped in early to have a word with me. He seemed tense. He approached me as if there were a pressing matter he needed to discuss, yet he seemed unable to initiate the discussion.
After mentioning that he had just listened to a BBC radio program discussing intelligent design, Alan put a few questions to me, somewhat awkwardly.
“You know this William Dembski fellow, don’t you?”
“Yes.”
“And you know about his intelligent design theory.”
“Yes.”
“Tell me, then, who is the designer?”
A Sign of Trouble
Axe goes on to note that “Alan’s questioning didn’t seem to lead anywhere on that February morning.” But his advisor interrogating him about his personal religious beliefs was definitely a sign of trouble, and it also exposed the confusion and innate bias that many people have about intelligent design. Axe continues to discuss what was really going on behind the scenes:
Years later, an article in New Scientist magazine about Biologic Institute (titled “The God Lab”) revealed that one of my fellow scientists at the CPE had been pressing Alan to dismiss me because of my connection to ID. The article says Alan refused to do so, quoting him as saying, “I have always been fairly easy-going about people working in the lab. I said I was not going to throw him out. What he was doing was asking legitimate questions about how a protein folded.” According to the article, I left the CPE after “Axe and Fersht were in dispute with each other over the implications of work going on in Fersht’s lab.”
The truth is that Alan did, in the end, give in to the internal whistle-blower who wanted me removed, though I certainly accept his account of having resisted this for some time. When he did finally act, I interpreted the awkwardness of his action as an indication of his reluctance. There was no heart-to-heart conversation or even a word spoken face-to-face. When everyone gathered in the customary way to bid me farewell, Alan was conspicuously absent. All I received was an e-mail from Alan’s assistant on the eleventh of March 2002, succinctly stating that the CPE was “very short of [lab] bench space” and declaring Alan’s solution: “Please vacate as soon as possible and by the end of March latest.”
Scientific Objections to Axe’s Research
So, Axe’s research on protein sequence rarity seems to have gotten him expelled from the Centre for Protein Engineering in Cambridge. But this was only to be the first incident where people didn’t like his results. Quite a few critics have raised scientific objections to Axe’s research over the years. In our recent paper “On the Relationship between Design and Evolution,” reviewing Rope Kojonen’s book The Compatibility of Evolution and Design, we and our co-authors (Stephen Dilley and Emily Reeves) assess what those critics have said and why they got things wrong. As readers of this series will know, we critique Kojonen’s thoughtful attempt to harmonize mainstream evolutionary theory with his particular version of design. Here’s the relevant section from our paper:
Several studies demonstrate that, for many proteins, functional sequences occupy an exceedingly small proportion of physically possible amino acid sequences. For example, Axe (2000, 2004)’s work on the larger beta-lactamase protein domain indicates that only 1 in 10^77 sequences are functional, astonishingly rare indeed. Such rarity presents prima facie evidence that many proteins are very difficult to evolve by a blind evolutionary process of random mutation and natural selection.
Of course, a common rejoinder to this data is to claim that ‘protein rarity’ is only true for select proteins; many others are not so rare. That is, many proteins might have sequences with functions that are more common in sequence space and are thereby easier to evolve. As Kojonen (2021, p. 119) puts it, “others argue that functional proteins are much more common”. He specifically cites Tian and Best (2017) as a rebuttal to Axe (2004) on this point. Similarly, Venema (2018) objects to Axe (2004)’s research because he believes “functional proteins are not rare within sequence space”. Importantly, Kojonen is correct that some proteins are easier to evolve than others, and this point is pressed by some scientists, but nonetheless, a very large proportion of proteins seems beyond the reach of mutation and selection.
Indeed, Tian and Best (2017) present much data that actually support Axe’s general thesis for protein rarity. They reported that the functional probabilities for ten protein domains range from 1 in 10^24 to 1 in 10^126. Yet even if we grant generous assumptions towards evolution, additional research indicates that only three of the ten domains studied by Tian and Best could have possibly emerged through an undirected evolutionary search of sequence space. Specifically, Chatterjee et al. (2014) calculated that there are at most 10^38 trials available over the entire history of life on Earth to evolve a new protein. Therefore, if a protein domain has a probability of less than 10^-38, then it is unlikely to emerge via a process of random mutation and natural selection. Seven of the ten domains studied by Tian and Best (2017) had probabilities below 10^-38. Thus, even though Kojonen (2021, p. 119) cites Tian and Best (2017) to argue that the “specificity required for achieving a functional amino acid sequence” may be less for some proteins, their research provides strong empirical evidence that many proteins have functional sequences that are so rare as to be beyond the reach of standard evolutionary mechanisms.
Kojonen (2021, p. 119) also cites Taylor et al. (2001) to counter (or mitigate) Axe’s results on protein rarity. Taylor et al. (2001) reported that the probability of evolving a chorismate mutase enzyme is 1 in 10^23, which Kojonen (2021) takes to suggest that functional protein sequences can be “more common than in the case of the protein studied by Axe”. Yet the fact that chorismate mutase represents less rare sequences is unsurprising given that its function requires a simpler fold than typical enzymes such as beta-lactamase studied by Axe (2004). Could chorismate mutase evolve? If it could, this still does not demonstrate the feasibility of Kojonen’s thesis: the possibility that some simpler proteins could evolve does not mean that all (or even most) more complex proteins could evolve. But for [Kojonen’s model] to succeed, evolutionary mechanisms must be up to the task in all cases, not just some.
The possibility of evolving relatively simpler proteins, however, raises another objection. Hunt (2007) asks: If a simple protein could evolve in the first place, might it also evolve further into a more complex protein? More specifically, if one assumes that a comparatively simple protein such as chorismate mutase could evolve, why could it not also evolve into a more elaborate protein, including one with a functional sequence that is as rare as those studied by Axe?
Writing here recently, Brian Miller used an easy-to-grasp analogy to illustrate why a simple protein could not evolve into a more complex protein of even modest rarity. See, “Proteins Are Rare and Isolated — And Thus, Cannot Evolve.”
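The comparison in the excerpt above, between the rarity of functional sequences and the number of trials available, comes down to simple arithmetic. The following Python sketch is only an illustration of that arithmetic, not anything taken from the paper. It uses the figures quoted above and adds one simplifying assumption of its own: that every trial is an independent random draw from sequence space. The function and variable names are hypothetical.

```python
import math

# Upper bound on available trials cited above from Chatterjee et al. (2014),
# stored as a base-10 exponent (10^38).
TRIALS_LOG10 = 38

# Probabilities of a functional sequence quoted in the text, as base-10 exponents.
cited_log10_probabilities = {
    "beta-lactamase domain (Axe 2004)": -77,
    "chorismate mutase (Taylor et al. 2001)": -23,
    "least rare domain (Tian and Best 2017)": -24,
    "rarest domain (Tian and Best 2017)": -126,
}


def expected_hits_log10(log10_p, log10_trials=TRIALS_LOG10):
    """Base-10 exponent of the expected number of hits, N * p."""
    return log10_trials + log10_p


def prob_at_least_one(log10_p, log10_trials=TRIALS_LOG10):
    """P(at least one hit) = 1 - (1 - p)^N, assuming independent trials.

    log1p and expm1 keep the result accurate when p is extremely small.
    """
    p = 10.0 ** log10_p
    n = 10.0 ** log10_trials
    return -math.expm1(n * math.log1p(-p))


for name, log10_p in cited_log10_probabilities.items():
    print(
        f"{name}: expected hits ~ 10^{expected_hits_log10(log10_p)}, "
        f"P(at least one) = {prob_at_least_one(log10_p):.3g}"
    )
```

Under these assumptions, a domain whose probability is above 10^-38 has an expected number of hits above one, while a domain at 10^-77 has an expected number of hits of about 10^-39.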
Responding to Dennis Venema
One of Axe’s critics, Dennis Venema, discusses intrinsically disordered proteins. We respond:
For example, Venema (2018) cites intrinsically disordered proteins (IDPs), noting they “do not need to be stably folded in order to function” and therefore represent a type of protein with sequences that are less tightly constrained and are presumably therefore easier to evolve. Yet IDPs fulfill fundamentally different types of roles (e.g., binding to multiple protein surfaces) compared to the proteins with well-defined structures that Axe (2004) studied (e.g., crucial enzymes involved in catalyzing specific reactions). Axe (2018) also responds by noting that Venema (2018) understates the complexity of IDPs. Axe (2018) points out that IDPs are not entirely unfolded, and “a better term” would be to call them “conditionally folded proteins”. Axe (2018) further notes that a major review paper on IDPs cited by Venema (2018) shows that IDPs are capable of folding — they can undergo “coupled folding and binding”; there is a “mechanism by which disordered interaction motifs associate with and fold upon binding to their targets” (Wright and Dyson 2015). That paper further notes that IDPs often do not perform their functions properly after experiencing mutations, suggesting they have sequences that are specifically tailored to their functions: “mutations in [IDPs] or changes in their cellular abundance are associated with disease” (Wright and Dyson 2015). In light of the complexity of IDPs, Axe (2018) concludes:
“If Venema (2018) pictures these conditional folders as being easy evolutionary onramps for mutation and selection to make unconditionally folded proteins, he’s badly mistaken. Both kinds of proteins are at work in cells in a highly orchestrated way, both requiring just the right amino-acid sequences to perform their component functions, each of which serves the high-level function of the whole organism. (Axe 2018)”
Venema (2018) also argues that functional proteins are easy to evolve. He cites Neme et al. (2017), a team that genetically engineered E. coli to produce a ∼500 nucleotide RNA (150 of which are random) that encodes a 62 amino-acid protein (50 of which are random). The investigators reported that 25% of the randomized sequences enhance the cell’s growth rate. Unfortunately, they misinterpreted their results, a fact pointed out by Weisman and Eddy (2017), who raised “reservations about the correctness of the conclusion of Neme et al. that 25% of their random sequences have beneficial effects”. Here is why they held those reservations: the investigators in Neme et al. (2017) did not compare the growth of cells containing inserted genetic code with normal bacteria but rather with cells that carry a “zero vector”, a stretch of DNA that generates a fixed 350 nucleotide RNA (the randomized 150 nucleotides are excluded from this RNA). Weisman and Eddy (2017) explain how the zero vector “is neither empty nor innocuous”, since it produces “a 38 amino-acid open reading frame at high levels” of expression. Yet since this “zero vector” and its transcripts provide no benefit to the bacterium, its high expression wastes cellular resources, which, as Weisman and Eddy (2017) note, “is detrimental to the E. coli host”. The reason the randomized peptide sometimes provided a relative benefit to the E. coli bacteria is that, in some cases (25%), it was probably interfering with production of the “zero vector” transcript and/or protein, thus sparing the E. coli host from wasting resources. As Weisman and Eddy (2017) put it, it is “easy to imagine a highly expressed random RNA or protein sequence gumming up the works somehow, by aggregation or otherwise interfering with some cellular component”. Axe (2018) responds to Neme et al. (2017) this way:
“Any junk that slows the process of making more junk by gumming up the works a bit would provide a selective benefit. Such sequences are “good” only in this highly artificial context, much as shoving a stick into an electric fan is “good” if you need to stop the blades in a hurry.”
In other words, at the molecular level, this random protein was not performing some complex new function but rather was probably interfering with its own RNA transcription and/or translation — a “devolutionary” hypothesis consistent with Michael Behe’s thesis that evolutionarily advantageous features often destroy or diminish function at the molecular level (Behe 2019). In any case, what Neme et al. (2017) showed is that a quarter of the randomized sequences were capable of inhibiting E. coli from expressing this “zero vector”, but they provided no demonstrated benefit to unmodified normal bacteria.
Finally, Venema (2018) cites Cai et al. (2008) to argue for the de novo origin of a yeast protein, BSC4, purportedly showing that “new genes that code for novel, functional proteins can pop into existence from sequences that did not previously encode a protein”. However, the paper provides no calculations about the rarity of the protein’s sequence nor its ability to evolve by mutation and selection. Rather, the evidence for this claim is entirely inferred, indirect, and based primarily upon the limited taxonomic range of the gene, which led the authors to infer it was newly evolved. Axe (2018) offers an alternative interpretation:
“The observable facts are what they are: brewers’ yeast has a gene that isn’t found intact in similar yeast species and appears to play a back-up role of some kind. The question is how to interpret these facts. And this is where Venema and I take different approaches. … Other interpretations of the facts surrounding BSC4 present themselves, one being that similar yeast species used to carry a similar gene which has now been lost. The fact that the version of this gene in brewers’ yeast is interrupted by a stop codon that reduces full-length expression to about 9 percent of what it would otherwise be seems to fit better with a gene on its way out than a gene on its way in.”
A Counterexample to Axe’s Research
In our paper, we elaborate on the enzyme chorismate mutase, which was cited by Kojonen as a counterexample to Axe’s research. We first explain that the functional complexity of chorismate mutase really is not comparable to the beta-lactamase enzyme studied by Axe:
The function of chorismate mutase is to catalyze the conversion of chorismate to prephenate through amino acid side chains in its active site, thereby restricting chorismate’s conformational degrees of freedom. Essentially, it is merely providing a chamber or cavity that holds a particular molecule captive, thereby limiting that molecule’s ability to change. In contrast, beta-lactamase requires the precise positioning and orientation of amino acid side chains from separate domains that contribute to hydrolyzing the peptide bond of the characteristic four-membered beta-lactam ring. This function requires a more complex fold compared to chorismate mutase. Axe (2004) specifically compares beta-lactamase to chorismate mutase and notes that the beta-lactamase fold “is made more complex by its larger size, and by the number of structural components (loops, helices, and strands) and the degree to which formation of these components is intrinsically coupled to the formation of tertiary structure (as is generally the case for strands and loops, but not for helices)”.
We then elaborate on why Kojonen’s attempts to invoke special “fine-tuning” to allow the evolution of proteins like chorismate mutase could actually cause problems for the evolvability of other proteins. “No Free Lunch” theorems suggest that it’s very difficult to imagine a fine-tuning scenario that would globally assist in the evolution of all types of proteins. That is because biasing to allow the evolvability of one type of protein would likely make it more difficult to evolve other types of proteins:
Kojonen tries to overcome this problem by arguing that the physical properties of proteins are “finely-tuned” to bias the clustering of functional sequences such that a very narrow path could extend to complex proteins with rare functional sequences. The biasing would result in the prevalence of functional sequences along a path to a new protein being much higher than in other regions of sequence space. But such biasing could not possibly assist the evolution of most proteins. Biasing in the distribution of functional sequences in sequence space due to physical laws is arguably subject to the same constraints as the biasing in play in the algorithms employed by evolutionary search programs. Consequently, protein evolution falls under “No Free Lunch” theorems that state that no algorithm will in general find targets (e.g., novel proteins) any faster than a random search. An algorithm might assist in finding one target (e.g., specific protein), but it would just as likely hinder finding another (Miller 2017; Footnote 12). Thus, although Kojonen acknowledges that proteins are sometimes too rare to have directly emerged from a random search, he fails to appreciate the extent to which rarity necessitates isolation and why this must often pose a barrier to further protein evolution. Different proteins have completely different compositions of amino acids, physical properties, conformational dynamics, and functions. Any biasing that might assist in the evolution of one protein would almost certainly oppose the evolution of another. In other words, the probability of a continuous path leading to some proteins would be even less likely than if the distribution of functional sequences were random.
We consider this to be one of the most comprehensive collections of responses to Axe’s critics published to date and we hope our paper is useful in that regard.
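The “No Free Lunch” point in the excerpt above, that a bias helping a search find one target tends to hinder it in finding another, can be illustrated with a toy example. The Python sketch below is a deliberately minimal version of the idea and is not drawn from the paper or from Miller (2017). It assumes a hypothetical search space of six sites, a single target that is equally likely to be at any site, and search strategies that simply visit the sites in a fixed order without repeats. It does not model real protein sequence space.

```python
from itertools import permutations

SPACE_SIZE = 6  # hypothetical, tiny search space used only for illustration


def queries_to_find(order, target):
    """Number of sites a fixed, non-repeating search order visits to reach target."""
    return order.index(target) + 1


def average_queries(order):
    """Average cost over every possible target site, each taken as equally likely."""
    return sum(queries_to_find(order, t) for t in range(len(order))) / len(order)


unbiased_order = list(range(SPACE_SIZE))  # visits sites 0, 1, 2, ...
biased_order = [5, 4, 3, 2, 1, 0]         # "tuned" to check site 5 first

# The bias helps for a target at site 5 and hurts for a target at site 0.
print("target 5:", queries_to_find(unbiased_order, 5), "vs", queries_to_find(biased_order, 5))
print("target 0:", queries_to_find(unbiased_order, 0), "vs", queries_to_find(biased_order, 0))

# Averaged over all target sites, every possible search order costs the same.
all_averages = {average_queries(list(p)) for p in permutations(range(SPACE_SIZE))}
print("distinct averages across all search orders:", all_averages)
```

In this toy setting the biased order reaches a target at site 5 in one query instead of six, and pays for it with six queries instead of one when the target is at site 0. Averaged over all target sites, no ordering does better than any other.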
The Bigger Picture
Stepping back, it may be helpful to say a brief word about how our defense of Axe fits into the overall argument in our Religions article. A key feature of Kojonen’s model is his claim that, in order for evolution to successfully produce biological complexity, it must rely on “fine-tuned” preconditions (and smooth fitness landscapes). These preconditions (and landscapes) are part of the “design” aspect of Kojonen’s model: in his view, God designed the laws of nature, which gave rise to fine-tuned preconditions and landscapes that in turn allow evolution to succeed.
If, as Kojonen claims, there really are fine-tuned preconditions and smooth fitness landscapes, then they should be empirically detectable. One can analyze, for example, whether functional protein folds can evolve into different functional protein folds by means of natural processes such as the mutation-selection mechanism. Douglas Axe’s work — along with the work of other scientists — shows that this is implausible. Proteins cannot evolve in this way. Kojonen’s empirical claim is false. Thus, his specific claims about design are false. The universe does not have the fine-tuned preconditions and smooth landscapes that his model says arose from the activity of a Designer.
Thus, our main point is not to criticize evolution per se. Yet because of the way Kojonen frames the issue, it turns out that the same evidence that poses problems for his understanding of design also raises problems for mainstream evolutionary theory. In a sense, two birds fall with one stone.