
Thursday 7 March 2024

A match made in heaven?

 Can the Laws of Nature Design Life? Emily Reeves Considers the Compatibility of Evolution and ID


Can intelligent design and evolution work together? It’s an intriguing idea that is welcomed by some, but does the scientific evidence support it? On a new episode of ID the Future, host Casey Luskin speaks with Dr. Emily Reeves to discuss her contribution to a recent paper critiquing theologian Rope Kojonen’s proposal that mainstream evolutionary biology and intelligent design have worked in harmony to produce the diversity of life we see on earth. 

Dr. Reeves starts by summarizing the Compatibility of Evolution and Design (CED) argument before also summarizing her team’s response to it. “CED is a great work of scholarship,” says Reeves, “but I think its relevance really hinges on whether empirical evidence supports Kojonen’s version of how the design is implemented within evolutionary theory, and then, of course, whether design arguments…are really compatible with evolutionary theory.” 

Reeves and Luskin go on to critique Dr. Kojonen’s conception of design. His model posits that the laws of nature have been front-loaded with design by an intelligent designer. But laws are not creative forces on their own – they only describe forces already in action. There’s no empirical evidence that the laws of nature could do the type of heavy lifting required to steer evolutionary processes toward success. As an example, Dr. Reeves describes how the law of gravity interacts with a growing plant. Gravity is used as a cue in the plant’s biology, but it doesn’t power the plant’s ability to grow. Download the podcast or listen to it here.

Wednesday 6 March 2024

ID is superstition masquerading as science?

 Are Proponents of ID Religiously Motivated, and Does It Matter?


Recently, someone asked me to comment on an article, published in 2017, by John Danaher, a lecturer in the Law School at the University of Galway, Ireland. He is widely published on legal and moral philosophy, as well as philosophy of religion. In his article, Danaher alleges that proponents of intelligent design (ID) are religiously motivated. He also asserts that the argument for ID from irreducible complexity has conceptual problems, and that systems that we deem to be irreducibly complex can be adequately explained by co-optation of components performing other roles in the cell. In two articles, I will address his concerns about our supposed religious motives, and then tackle his specific objections to irreducible complexity.

Do We Have Religious Motives?

Danaher opens his essay by reminiscing about his days as a student when he first encountered ID.

When I was a student, well over a decade ago now, intelligent design was all the rage. It was the latest religiously-inspired threat to Darwinism (though it tried to hide its religious origins). It argued that Darwinism could never account for certain forms of adaptation that we see in the natural world. 

What made intelligent design different from its forebears was its seeming scientific sophistication. Proponents of intelligent design were often well-qualified scientists and mathematicians, and they dressed up their arguments with the latest findings from microbiology and abstruse applications of probability theory. My sense is that the fad for intelligent design has faded in the intervening years, though I have no doubt that it still has its proponents.

These paragraphs betray the fact that the author is quite out of touch with the literature on ID. 

Stronger than Ever

First, ID has come a long way since the early 2000s. Far from having faded, it is now stronger than ever, having more academic proponents (and many more peer-reviewed publications) than at any time in its history. Its arguments are far more developed and sophisticated than in the early 2000s and this trend is likely to continue. 

Second, it is unclear in what sense Danaher refers to the “religious origins” of ID. It is certainly true that having a religious perspective, predisposing one towards theism, creates a plausibility structure that opens one’s mind to the possibility of there being measurable evidence of design in the universe, including in living organisms. Thus, being independently persuaded of the truth of a theistic religion (in my case, Christianity) is positively relevant to one’s assessment of the prior probability (or, intrinsic plausibility) of ID. However, even if one is not persuaded of theistic religion, the evidence of design in the natural world is, in my opinion, sufficient to overwhelm even a very low prior. Indeed, the cosmological evidence that our universe has a finite history; the fine-tuning of the laws and constants of our universe; the prior environmental fitness of nature for complex life; the optimization of the universe for scientific discovery and technology; and the biological evidence of design all point univocally and convergently in the direction of a cosmic creator. Thus, ID has attracted support from scholars who are not themselves adherents of any religion, including Michael Denton, David Berlinski, and Steve Fuller. Paleontologist and frequent Evolution News contributor Günter Bechly, though a Christian believer now, was not sympathetic to Christianity when he first came to be persuaded of ID.

Misguided on Many Levels

Later in the essay, Danaher further remarks

The claim is not that God must have created the bacterial flagellum but, rather, that an intelligent designer did. For tactical reasons, proponents of intelligent design liked to hide their religious motivations, trying to claim that their theory was scientific, not religious in nature. This was largely done in order to get around certain legal prohibitions on the teaching of religion under US constitutional law. I’m not too interested in that here though. I view the intelligent design movement as a religious one, and hence the arguments they proffer as on a par with pretty much all design arguments.

These comments are misguided on many levels.

First, the claim that we ID proponents are not clear about our personal religious persuasions is patently false. Speaking for myself, I have been very clear that I am a Christian theist, though my grounds for being persuaded of that conclusion are wholly independent of the science of ID. And I am by no means unusual. Virtually every leading ID proponent — from Michael Behe to William Dembski to Stephen Meyer to Phillip Johnson to David Klinghoffer to Casey Luskin to Brian Miller to Ann Gauger and many others — has been totally open about his or her personal religious beliefs. In the world of intelligent design, no one is hiding anything about religious beliefs, including those who lack religious beliefs.

Second, ID is a scientific argument, and when evaluating a scientific argument, the motives of its proponents are irrelevant. As Casey Luskin writes,

[I]n science, the motives or personal religious beliefs of scientists don’t matter; only the evidence matters. For example, the great scientists Johannes Kepler and Isaac Newton were inspired to their scientific work by their religious convictions that God would create an orderly, rational universe with comprehensible physical laws that governed the motion of the planets. They turned out to be right — not because of their religious beliefs — but because the scientific evidence validated their hypotheses. (At least, Newton was thought to be right until Einstein came along.) Their personal religious beliefs, motives, or affiliations did nothing to change the fact that their scientific theories had inestimable scientific merit that helped form the foundation for modern science.

To attack an idea because of the alleged religious motives of its proponents is to commit the genetic fallacy, and that is exactly what Danaher has done here.

Third, ID is not a religious argument. Though ID provides strong evidence for a broadly theistic perspective, the argument itself is grounded in the scientific method. ID does not aid in evaluating the merits of one particular religious tradition over another. ID does not even technically commit one to theism, though I would contend that God is the best candidate for the identity of the designer (as Stephen Meyer argues in his recent book, Return of the God Hypothesis). Thus, ID rightly attracts people of all religious persuasions and none (including Orthodox Jews, Muslims, and agnostics). This is important because it shows that ID is not about supporting one particular religion. We, therefore, strive to be honest about the limitations of ID while being careful not to overstate what the scientific evidence alone can tell us.

What About Evolution?

Finally, if Danaher wants to scrutinize the religious motives of ID proponents, we have to consider what such a line of attack would do to evolution. Casey Luskin has documented (see here or here) the extensive anti-religious beliefs, motives, and affiliations of many leading evolution-advocates. While I (and Luskin) would maintain that evolution is science, one must ask what would happen to evolution if the religious (or anti-religious) beliefs of its proponents suddenly became relevant to assessing its merits.

“Teach the Controversy”

Danaher’s statement that the claim that ID is scientific and not religious “was largely done in order to get around certain legal prohibitions on the teaching of religion under US constitutional law” is historically incorrect. Discovery Institute (the leading organization funding research into, and promoting the public understanding of, ID) does not support attempts to legally protect the teaching of ID in public schools. In fact, since Discovery Institute’s earliest involvement in major public education debates in the U.S. (in Ohio in 2002), it has not supported mandating the teaching of ID in public schools. This is not because we feel that ID is unconstitutional. ID, much like the Big Bang in cosmology, may be friendly to a broadly theistic perspective. However, this does not make the idea itself a religious one, just as the Big Bang theory is not a religious idea. Thus, there is nothing intrinsic to ID that would render it unconstitutional under the First Amendment. However, attempts to legislatively protect the teaching of ID tend to politicize the theory, and we believe that the merits of ID ought to be debated in the scientific journals, not in the courtroom. Rather, Discovery Institute advocates a “teach the controversy” model, where the strengths and weaknesses of scientific theories (including evolution) are presented and discussed. All of this is stated clearly and openly on our Science Education policy page:

As a matter of public policy, Discovery Institute opposes any effort to require the teaching of intelligent design by school districts or state boards of education. Attempts to require teaching about intelligent design only politicize the theory and will hinder fair and open discussion of the merits of the theory among scholars and within the scientific community. Furthermore, most teachers at the present time do not know enough about intelligent design to teach about it accurately and objectively. 

Instead of recommending teaching about intelligent design in public K-12 schools, Discovery Institute seeks to increase the coverage of evolution in curriculum. It believes that evolution should be fully and completely presented to students, and they should learn more about evolutionary theory, including its unresolved issues. In other words, evolution should be taught as a scientific theory that is open to critical scrutiny, not as a sacred dogma that can’t be questioned.

Thus, Danaher is ill-informed about Discovery Institute’s long-standing education policy. In a second article, I shall address his specific concerns regarding the argument from irreducible complexity.

Monday 4 March 2024

A theory of everything re:design detection? V

 Orgelian Specified Complexity


As I noted at the start of this series on “specified complexity,” which I’m concluding today, Leslie Orgel introduced that term in his 1973 book The Origins of Life. Although specified complexity as developed by Winston Ewert, Robert Marks, and me attempts to get at the same informational reality that Orgel was trying to grasp, our formulations differ in important ways. 

For a fuller understanding of specified complexity, as an appendix to the series, it will therefore help to review what Orgel originally had in mind and to see where our formulation of the concept improves on his. Strictly speaking, this subject is mainly of historical interest. Because The Origins of Life is out of print and hard to get, I will quote from it extensively, offering exegetical commentary. I will focus on the three pages of his book where Orgel introduces and then discusses specified complexity (pages 189–191). 

"Terrestrial Biology”

Orgel introduces the term “specified complexity” in a section titled “Terrestrial Biology.” Elsewhere in his book, Orgel also considers non-terrestrial biology, which is why the title of his book refers to the origins (plural) of life — radically different forms of life might arise in different parts of the universe. To set the stage for introducing specified complexity, Orgel discusses the various commonly cited defining features of life, such as reproduction or metabolism. Thinking these don’t get at the essence of life, he introduces the term that is the focus of this series:

It is possible to make a more fundamental distinction between living and nonliving things by examining their molecular structure and molecular behavior. In brief, living organisms are distinguished by their specified complexity. Crystals are usually taken as the prototypes of simple, well-specified structures because they consist of a very large number of identical molecules packed together in a uniform way. Lumps of granite or random mixtures of polymers are examples of structures which are complex but not specified. The crystals fail to qualify as living because they lack complexity; the mixtures of polymers fail to qualify because they lack specificity. (p. 189)

So far, so good. Everything Orgel writes here makes good intuitive sense. It matches up with the three types of order discussed at the start of this series: repetitive order, random order, complex specified order. Wanting to put specified complexity on a firmer theoretical basis, Orgel next connects it to information theory:

These vague ideas can be made more precise by introducing the idea of information. Roughly speaking, the information content of a structure is the minimum number of instructions needed to specify the structure. One can see intuitively that many instructions are needed to specify a complex structure. On the other hand, a simple repeating structure can be specified in rather few instructions. Complex but random structures, by definition, need hardly be specified at all. (p. 190)

Orgel’s elaboration here of specified complexity calls for further clarification. His use of the term “information content” is ill-defined. He unpacks it in terms of “minimum number of instructions needed to specify a structure.” This suggests a Kolmogorov information measure. Yet complex specified structures, according to him, require lots of instructions, and so suggest high Kolmogorov information. By contrast, specified complexity as developed in this series requires low Kolmogorov information. 

At the same time, for Orgel to write that “complex but random structures … need hardly be specified at all” suggests low Kolmogorov complexity for random structures, which is exactly the opposite of how Kolmogorov information characterizes randomness. For Kolmogorov, the random structures are those that are incompressible, and thus, in Orgel’s usage, require many instructions to specify (not “need hardly be specified at all”). 
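This contrast can be checked empirically. As a rough sketch (my own illustration, not from the article), compressed length under a general-purpose compressor such as zlib serves as a crude upper bound on Kolmogorov complexity: a repetitive string compresses to almost nothing, while a (pseudo)random string of the same length barely compresses at all — exactly the opposite of "need hardly be specified at all."

```python
import random
import zlib

random.seed(0)

# A highly repetitive string: its short description is
# "repeat '0' 100,000 times", so its Kolmogorov complexity is tiny.
repetitive = b"0" * 100_000

# A pseudorandom string of the same length: incompressible,
# so its Kolmogorov complexity is roughly its full length.
rand = bytes(random.getrandbits(8) for _ in range(100_000))

# Compressed length is a crude upper bound on Kolmogorov complexity.
print(len(zlib.compress(repetitive)))  # a few hundred bytes at most
print(len(zlib.compress(rand)))        # close to the full 100,000 bytes
```

The repetitive string compresses by orders of magnitude; the random one does not, illustrating that random structures require long, not short, instruction sets.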

Perhaps Orgel had something else in mind — I am trying to read him charitably — but from the vantage of information theory, his options are limited. Shannon and Kolmogorov are, for Orgel, the only games in town. And yet, Shannon information, focused as it is on probability rather than instruction sets, doesn’t clarify Orgel’s last remarks. Fortunately, Orgel elaborates on them with three examples:

These differences are made clear by the following example. Suppose a chemist agreed to synthesize anything that could be described accurately to him. How many instructions would he need to make a crystal, a mixture of random DNA-like polymers or the DNA of the bacterium E. coli? (p. 190)

This passage seems promising for understanding what Orgel is getting at with specified complexity. Nonetheless, it also suggests that Orgel understands information entirely in terms of instruction sets for building chemical systems, which weds him to a Kolmogorov rather than Shannon view of information. In particular, nothing here suggests that he will bring both views of information together under a coherent umbrella. 

The Language of Short Descriptions

Here is how Orgel elaborates the first example, which is replete with the language of short descriptions (as in the account of specified complexity given in this series):

To describe the crystal we had in mind, we would need to specify which substance we wanted and the way in which the molecules were to be packed together in the crystal. The first requirement could be conveyed in a short sentence. The second would be almost as brief, because we could describe how we wanted the first few molecules packed together, and then say “and keep on doing the same.” Structural information has to be given only once because the crystal is regular. (p. 190)

This example has very much the feel of our earlier example in which Kolmogorov information was illustrated by a sequence of 100 identical coin tosses (0 for tails) described very simply by “repeat ‘0’ 100 times.” For specified complexity as developed in this series, an example like this one by Orgel yields a low degree of specified complexity. It combines both low Shannon information (the crystal forms reliably and repeatedly with high probability and thus low complexity) and low Kolmogorov information (the crystal requires only a short description or instruction set). It exhibits specified non-complexity, or what could be called specified simplicity.

A Fatal Difficulty

Orgel’s next example, focused on randomness, is more revealing, and indicates a fatal difficulty with his approach to specified complexity:

It would be almost as easy to tell the chemist how to make a mixture of random DNA-like polymers. We would first specify the proportion of each of the four nucleotides in the mixture. Then, we would say, “Mix the nucleotides in the required proportions, choose nucleotide molecules at random from the mixture, and join them together in the order you find them.” In this way the chemist would be sure to make polymers with the specified composition, but the sequences would be random. (p. 190)

Orgel’s account of forming random polymers here betrays information-theoretic confusion. Previously, he was using the terms “specify” and “specified” in the sense of giving a full instruction set to bring about a given structure — in this case, a given nucleotide polymer. But that’s not what he is doing here. Instead, he is giving a recipe for forming random nucleotide polymers in general. Granted, the recipe is short (i.e., bring together the right separate ingredients and mix), suggesting a short description length since it would be “easy” to tell a chemist how to produce it. 

But the synthetic chemist here is producing not just one random polymer but a whole bunch of them. And even if the chemist produced a single such polymer, it would not be precisely identified. Rather, it would belong to a class of random polymers. To identify and actually build a given random polymer would require a large instructional set, and would thus indicate high, not low Kolmogorov information, contrary to what Orgel is saying here about random polymers.

Finally, let’s turn to the example that for Orgel motivates his introduction of the term “specified complexity” in the first place:

It is quite impossible to produce a corresponding simple set of instructions that would enable the chemist to synthesize the DNA of E. coli. In this case, the sequence matters: only by specifying the sequence letter-by-letter (about 4,000,000 instructions) could we tell the chemist what we wanted him to make. The synthetic chemist would need a book of instructions rather than a few short sentences. (p. 190)

Orgel’s Takeaway

Given this last example, it becomes clear that for Orgel, specified complexity is all about requiring a long instructional set to generate a structure. Orgel’s takeaway, then, is this:

It is important to notice that each polymer molecule in a random mixture has a sequence just as definite as that of E. coli DNA. However, in a random mixture the sequences are not specified, whereas in E. coli the DNA sequence is crucial. Two random mixtures contain quite different polymer sequences, but the DNA sequences in two E. coli cells are identical because they are specified. The polymer sequences are complex but random: although E. coli DNA is also complex, it is specified in a unique way. (pp. 190–191)

This is confused. The reason it’s confused is that Orgel’s account of specified complexity commits a category mistake. He admits that a random sequence requires just as long an instruction set to generate as E. coli DNA because both are, as he puts it, “definite.” Yet with random sequences, he looks at an entire class or range of random sequences whereas with E. coli DNA, he is looking at one particular sequence. 

Orgel is correct, as far as he goes, that from an instruction set point of view, it’s easy to generate elements from such a class of random sequences. And yet, from an instruction set point of view, it is no easier to generate a particular random sequence than a particular non-random sequence, such as E. coli DNA. That’s the category mistake. Orgel is applying instruction sets in two very different ways, one to a class of sequences, the other to particular sequences. But he fails to note the difference. 

A Different Tack

The approach to specified complexity that Winston Ewert and I take, as characterized in this series, takes a different tack. Repetitive order yields high probability and specification, and therefore combines low Shannon and low Kolmogorov information, yielding, as we’ve seen, what can be called specified simplicity. This is consistent with Orgel. But note that our approach yields a specified complexity value (albeit a low one in this case). Specified complexity, as a difference between Shannon and Kolmogorov complexity, takes continuous values and thus comes in degrees. For repetitive order, specified complexity, as characterized in this series, will thus take on low values.

That said, Orgel’s application of specified complexity to distinguish a random nucleotide polymer from E. coli DNA diverges sharply from how specified complexity as outlined in this series applies to these same polymers. A random sequence, within the scheme outlined in the series, will have large Shannon information but also, because it has no short description, will have large Kolmogorov information, so the two will cancel each other, and the specified complexity of such a sequence will be low or indeterminate.
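This cancellation can be sketched numerically. The toy calculation below (my own construction, assuming a uniform chance model over byte strings and using zlib-compressed length as a stand-in for Kolmogorov complexity — neither assumption is from the series) shows that a random string's high Shannon information is offset by its equally high Kolmogorov information:

```python
import random
import zlib

def shannon_info_bits(data: bytes) -> float:
    # Under a uniform chance model, a particular sequence of n bits
    # has probability 2**-n, i.e. n bits of Shannon information.
    return 8.0 * len(data)

def kolmogorov_proxy_bits(data: bytes) -> float:
    # Compressed length in bits: a crude upper bound on K(x).
    return 8.0 * len(zlib.compress(data))

def specified_complexity_bits(data: bytes) -> float:
    # Specified complexity as the difference between Shannon
    # information and (approximate) Kolmogorov information.
    return shannon_info_bits(data) - kolmogorov_proxy_bits(data)

random.seed(1)
rand = bytes(random.getrandbits(8) for _ in range(10_000))
# High Shannon information, but also high Kolmogorov information:
# the two roughly cancel, so specified complexity stays near zero
# (slightly negative, owing to compression overhead).
print(specified_complexity_bits(rand))
```

A repetitive string under the same uniform model would score differently, but, as the series notes, the relevant chance model for a crystal is the reliable process that forms it, under which its Shannon term is also small.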

On the other hand, for E. coli DNA, within the scheme outlined in this series, there will be work to do in showing that it actually exhibits specified complexity. The problem is that the particular sequence in question will have low probability and thus high Shannon information. At the same time, that particular sequence will be unlikely to have a short exact description. Rather, what will be needed to characterize the E. coli DNA as exhibiting specified complexity within the scheme of this series is a short description to which the sequence answers but which also describes an event of small probability, thus combining high Shannon information with low Kolmogorov information. 

Specified complexity as characterized in this series and applied to this example will thus mean that the description will include not just the particular sequence in question but a range of sequences that answer to the description. Note that there is no category mistake here as there was with Orgel. The point of specified complexity as developed in this series is always with matching events and descriptions of those events, where any particular event is described provided it answers to the description. For instance, a die roll exhibiting a 6 answers to the description “an even die roll.”
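The die-roll example can be made concrete. The short description "an even die roll" picks out a set of outcomes; any particular roll in that set answers to it, and the probability (and hence Shannon information) attaches to the described event as a whole (a minimal sketch, not from the series):

```python
import math
from fractions import Fraction

outcomes = range(1, 7)                            # a fair six-sided die
described = {o for o in outcomes if o % 2 == 0}   # "an even die roll"

# A particular event (rolling a 6) answers to the description:
assert 6 in described

# The probability belongs to the described event as a whole...
p = Fraction(len(described), 6)   # 1/2
# ...giving the Shannon information of the description, in bits.
info_bits = -math.log2(p)         # 1.0
print(p, info_bits)
```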

So, is there a simple description of the E. coli DNA that shows this sequence to exhibit specified complexity in the sense outlined in this series? That’s in fact not an easy question to answer. The truth of Darwinian evolution versus intelligent design hinges on the answer. Orgel realized this when he wrote the following immediately after introducing the concept of specified complexity, though his reference to miracles is a red herring (at issue is whether life is the result of intelligence, and there’s no reason to think that intelligence as operating in nature need act miraculously):

Since, as scientists, we must not postulate miracles we must suppose that the appearance of “life” is necessarily preceded by a period of evolution. At first, replicating structures are formed that have low but non-zero information content. Natural selection leads to the development of a series of structures of increasing complexity and information content, until one is formed which we are prepared to call “living.” (p. 192)

Orgel is here proposing that life evolves to increasing levels of complexity, where at each stage nothing radically improbable is happening. Natural selection is thus seen as a probability amplifier that renders probable what otherwise would be improbable. Is there a simple description to which the E. coli DNA answers and which is highly improbable, not just when the isolated nucleotides making up the E. coli DNA are viewed as a purely random mixture but rather by factoring in their evolvability via Darwinian evolution?

A Tough Question

That’s a tough question to answer precisely because evaluating the probability of forming E. coli DNA with or without natural selection is far from clear. Given Orgel’s account of specified complexity, he would have to say that the E. coli DNA exhibits specified complexity. But within the account of specified complexity given in this series, ascribing specified complexity always requires doing some work, finding a description to which an observed event answers, showing the description to be short, and showing the event precisely identified by the description has small probability, implying high Shannon information and low Kolmogorov information. 

For intelligent design in biology, the challenge in demonstrating specified complexity is always to find a biological system that can be briefly described (yielding low Kolmogorov complexity) and whose evolvability, even by Darwinian means, has small probability (yielding high Shannon information). Orgel’s understanding of specified complexity is quite different. In my view, it is not only conceptually incoherent but also stacks the deck unduly in favor of Darwinian evolution. 

To sum up, I have presented Orgel’s account of specified complexity at length so that readers can decide for themselves which account of specified complexity they prefer, Orgel’s or the one presented in this series.

Editor’s note: This article appeared originally at BillDembski.com




Saturday 2 March 2024

A theory of everything re:design detection? IV

 Life and the Underlying Principle Behind the Second Law of Thermodynamics


Author’s note: If you trust your own common sense (recommended), you can just watch the short (6 minute) video “Evolution Is a Natural Process Running Backward” and save yourself some time. Or watch the short video “A Mathematician’s View of Evolution.” Otherwise, read on.


Extremely Improbable Events

The idea that what has happened on Earth seems to be contrary to the more general statements of the second law of thermodynamics is generally rebutted [1] by noting that the Earth is an open system, and the second law only applies to isolated systems.

Nevertheless, the second law is all about probability and there is something about the origin and evolution of life, and the development of human intelligence and civilization, that appears to many to defy the spirit, if not the letter, of the second law even if the Earth is an open system. There seems to be something extraordinarily improbable about life.

In a 2000 Mathematical Intelligencer article [2] I claimed that:

The second law of thermodynamics — at least the underlying principle behind this law — simply says that natural forces do not cause extremely improbable things to happen, and it is absurd to argue that because the Earth receives energy from the Sun, this principle was not violated here when the original rearrangement of atoms into encyclopedias and computers occurred.

One reader noted in a published reply [3] to my article that any particular long string of coin tosses is extremely improbable, so my statement that “natural forces do not cause extremely improbable things to happen” is not correct. This critic was right, and I have since been careful to state (for example, in a 2013 BIO-Complexity article [4]) that the underlying principle behind the second law is that

Natural (unintelligent) forces do not do macroscopically describable things that are extremely improbable from the microscopic point of view. 

Extremely improbable events must be macroscopically (simply) describable to be forbidden; if we include extremely improbable events that can only be described by an atom-by-atom (or coin-by-coin) accounting, there are so many of these that some are sure to happen. But if we define an event as “macroscopically describable” when it can be described in m or fewer bits, there are at most 2^m macroscopically describable events. Then if we do 2^k experiments and define an event as “extremely improbable” if it has probability less than 1/2^n, we can set the probability threshold for an event to be considered “extremely improbable” so low (n >> k+m) that we can be confident that no extremely improbable, macroscopically describable events will ever occur. And with 10^23 molecules in a mole, almost anything that is extremely improbable from the microscopic point of view will be impossibly improbable. If we flip a billion fair coins, any particular outcome we get can be said to be extremely improbable, but we are only astonished if something extremely improbable and simply (macroscopically) describable happens, such as “only prime number tosses are heads” or “the last million coins are tails.”
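The arithmetic behind this threshold is a simple union/expectation bound: with at most 2^m describable events, each of probability below 2^-n, observed over 2^k experiments, the expected number of occurrences is below 2^(k+m-n), which is negligible when n >> k+m. A sketch with illustrative numbers of my choosing:

```python
from fractions import Fraction

def max_expected_hits(k: int, m: int, n: int) -> Fraction:
    # 2**k experiments, at most 2**m macroscopically describable
    # events, each with probability below 2**-n: by linearity of
    # expectation, fewer than 2**(k+m) / 2**n occurrences are expected.
    return Fraction(2 ** (k + m), 2 ** n)

# Illustrative values with n far above k + m:
bound = max_expected_hits(k=40, m=50, n=200)
print(float(bound))  # about 8e-34: effectively never
```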

Temperature and diffusing carbon distribute themselves more and more randomly (more uniformly) in an isolated piece of steel because that is what the laws of probability at the microscopic level predict: it would be extremely improbable for either to distribute itself less randomly, assuming nothing is going on but diffusion and heat conduction. The laws of probability dictate that a digital computer, left to the forces of nature, will eventually degrade into scrap metal and it is extremely improbable that the reverse process would occur, because of all the arrangements atoms could take, only a very few would be able to do logical and arithmetic operations. 
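The diffusion claim in this paragraph can be sketched numerically. The toy 1D simulation below (my construction, not the author's) starts with all of the heat or carbon concentrated in one cell of an isolated bar and shows the Shannon entropy of the distribution rising toward its uniform maximum, never falling:

```python
import math

def entropy_bits(dist):
    # Shannon entropy (in bits) of a nonnegative distribution.
    total = sum(dist)
    return -sum((p / total) * math.log2(p / total) for p in dist if p > 0)

def diffuse(c, rate=0.25):
    # One explicit step of 1D diffusion with reflecting (insulated)
    # boundaries; 'rate' must be <= 0.5 for stability.
    n = len(c)
    out = list(c)
    for i in range(n):
        left = c[i - 1] if i > 0 else c[i]
        right = c[i + 1] if i < n - 1 else c[i]
        out[i] = c[i] + rate * (left - 2 * c[i] + right)
    return out

c = [0.0] * 50
c[25] = 1.0                # all the heat/carbon starts in one cell
before = entropy_bits(c)   # 0 bits: perfectly ordered
for _ in range(500):
    c = diffuse(c)
after = entropy_bits(c)    # approaches the maximum, log2(50) ~ 5.6 bits
print(before, after)
```

Running the update in reverse (un-mixing) would require the distribution to become less uniform, which is exactly the extremely improbable behavior the principle forbids.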

This principle is very similar to William Dembski’s observation [5] that you can identify intelligent agents because they are the only ones that can do things that are “specified” (simply or macroscopically describable) and “complex” (extremely improbable). Any box full of wires and metal scraps could be said to be complex, but we only suspect intelligence has organized them if the box performs a complex and specifiable function, such as “playing DVDs.”

Extension to Open Systems
So does the origin and evolution of life, and the development of civilization, on a previously barren planet violate the more general statements of the second law of thermodynamics? It is hard to imagine anything that more obviously and spectacularly violates the underlying principle behind the second law than the idea that four fundamental, unintelligent, forces of physics alone could rearrange the fundamental particles of physics into computers, science texts, nuclear power plants, and smart phones. The most common reply to this observation is that all current statements of the second law apply only to isolated systems, for example, “In an isolated system, the direction of spontaneous change is from an arrangement of lesser probability to an arrangement of greater probability” and “In an isolated system, the direction of spontaneous change is from order to disorder.”6

Although the second law is really all about probability, many people try to avoid that issue by saying that evolution does not technically violate the above statements of the second law because the Earth receives energy from the sun, so it is not an isolated system. But in the above-referenced BIO-Complexity article4 and again in a 2017 Physics Essays article7 I pointed out that the basic principle underlying the second law does apply to open systems; you just have to take into account what is crossing the boundary of an open system in deciding what is extremely improbable and what is not. In both I generalized the second statement cited above6 to:

If an increase in order is extremely improbable when a system is isolated, it is still extremely improbable when the system is open, unless something is entering which makes it not extremely improbable.

Then in Physics Essays7 I illustrated this tautology by showing that the entropy associated with any diffusing component X (if X is diffusing heat, this is just thermal entropy) can decrease in an open system, but no faster than it is exported through the boundary. Since this “X-entropy” measures disorder in the distribution of X, we can say that the “X-order” (defined as the negative of X-entropy) can increase in an open system, but no faster than X-order is imported through the boundary.

In this analysis the rate of change of thermal entropy (S) was defined as usual by:

dS/dt = ∫ (∂Q/∂t)/T dV

where Q is heat energy and T is absolute temperature, and the rate of change of X-entropy (S_X) was defined similarly by:

dS_X/dt = ∫ (∂C/∂t)/C dV

where C is the density (concentration) of X. In these calculations (which, remember, are just illustrating a tautology) I again assumed that nothing was going on but diffusion and heat conduction (diffusion of heat). I had first published this analysis in my reply “Can ANYTHING Happen in an Open System?”8 to critics of my Mathematical Intelligencer article2, again in an appendix of a 2005 John Wiley text, The Numerical Solution of Ordinary and Partial Differential Equations,9 and again in Biological Information: New Perspectives.10
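The claim that, absent an inflow, disorder in the distribution of X can only increase can be illustrated with a toy computation. The sketch below is my illustration, not Sewell's: a discrete one-dimensional diffusion on an isolated rod, with an entropy-like measure on the normalized concentration profile that rises as X spreads out.

```python
import math

def diffuse(c, d=0.2, steps=200):
    """Explicit finite-difference diffusion on an isolated rod with
    zero-flux (reflecting) boundaries: mass is conserved, nothing
    enters or leaves."""
    c = list(c)
    for _ in range(steps):
        new = c[:]
        for i in range(len(c)):
            left = c[i - 1] if i > 0 else c[i]            # reflecting
            right = c[i + 1] if i < len(c) - 1 else c[i]  # reflecting
            new[i] = c[i] + d * (left - 2 * c[i] + right)
        c = new
    return c

def entropy(c):
    """Shannon-style entropy of the normalized concentration profile:
    0 when all X is in one cell, log(n) when perfectly uniform."""
    total = sum(c)
    return -sum((x / total) * math.log(x / total) for x in c if x > 0)

# Start with all the "X" concentrated in one cell of a 20-cell rod.
c0 = [1.0] + [0.0] * 19
c1 = diffuse(c0)
print(entropy(c0), entropy(c1))  # entropy rises toward log(20) ≈ 3.0
```

Left isolated, the profile relaxes toward uniformity and the entropy measure climbs; it would take an inflow through the boundary to drive it the other way.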

Everyone agrees that, in an isolated system, natural forces will never reorganize scrap metal into digital computers, because this is extremely improbable. If the system is open, it is still extremely improbable that computers will appear, unless something is entering the system from outside which makes the appearance of computers not extremely improbable. For example, computers.


Application to Our Open System

Now let’s consider just one of many events that have occurred on Earth (and only here, it appears) that seem to be extremely improbable: “From a lifeless planet, there arose spaceships capable of flying to its moon and back safely.” This is certainly macroscopically describable, but is it extremely improbable from the microscopic point of view? You can argue that it only seems extremely improbable, but it really isn’t. You can argue that a few billion years ago a simple self-replicator formed by natural chemical processes, and that over millions of years natural selection was able to organize the duplication errors made by these self-replicators into intelligent, conscious humans, who were able to build rockets that could reach the moon and return safely. 

I would counter that we with all our advanced technology are still not close to designing any self-replicating machine11; that is still pure science fiction. When you add technology to such a machine, to bring it closer to the goal of reproduction, you only move the goal posts, as now you have a more complicated machine to reproduce. So how could we believe that such a machine could have arisen by pure chance? And suppose we did somehow manage to design, say, a fleet of cars with fully automated car-building factories inside, able to produce new cars, and not just normal new cars, but new cars with fully automated car-building factories inside them. Who could seriously believe that if we left these cars alone for a long time, the accumulation of duplication errors made as they reproduced themselves would result in anything other than devolution, much less that these errors could eventually be organized by selective forces into more advanced automobile models? So I would claim that we don’t really understand how living things are able to pass their complex structures on to their descendants without significant degradation, generation after generation, much less how they evolve even more complex structures.

Many have argued12 that the fine-tuning for life of the laws and constants of physics can be explained by postulating a large or infinite number of universes, with different laws and constants. So some might be tempted to argue that if our universe is large enough, or if there are enough other universes, the development of interplanetary spaceships might occur on some Earth-like planets even if extremely improbable. But if you have to appeal to this sort of argument to explain the development of civilization, the second law becomes meaningless, as a similar argument could be used to explain any violation of the second law, including a significant decrease in thermal entropy in an isolated system.

Conclusions

There are various ways to argue that what has happened on Earth does not violate the more general statements of the second law as found in physics texts. The “compensation” argument, which says that “entropy” can decrease in an open system as long as the decrease is compensated by equal or greater increases outside, so that the total entropy of any isolated system containing this system still increases, is perhaps the most widely used1. Since in this context “entropy” is just used as a synonym for “disorder,” the compensation argument, as I paraphrased it in Physics Essays7, essentially says that extremely improbable things can happen in an open system as long as things are happening outside which, if reversed, would be even more improbable! This compensation argument is not valid even when applied just to thermal entropy, as the decrease in an open system is limited not by increases outside, but by the amount exported through the boundary, as I’ve shown7, 8, 9, 10. Since tornados derive their energy from the sun, the compensation argument could equally well be used to argue that a tornado running backward, turning rubble into houses and cars, would not violate the second law either.

But there is really only one logically valid way to argue that what has happened on Earth does not violate the fundamental principle underlying the second law — the one principle from which every application and every statement of this law draws its authority. And that is to say that it only seems impossibly improbable, but it really is not, that under the right conditions, the influx of stellar energy into a planet could cause atoms there to rearrange themselves into nuclear power plants and digital computers and encyclopedias and science texts, and spaceships that could travel to other planets and back safely.

And although the second law is all about probability, very few Darwinists are willing to make such an argument; they prefer to avoid the issue of probability altogether.

Friday 1 March 2024

A theory of everything re: design detection? II

 Specified Complexity and a Tale of Ten Malibus


Yesterday in my series on specified complexity, I promised to show how all this works with an example of cars driving along a road. The example, illustrating what a given value of specified complexity means, is adapted from section 3.6 of the second edition of The Design Inference, from which I quote extensively. Suppose you witness ten brand new Chevy Malibus drive past you on a public road in immediate, uninterrupted succession. The question that crosses your mind is this: Did this succession of ten brand new Chevy Malibus happen by chance?

Your first reaction might be to think that this event is a publicity stunt by a local Chevy dealership. In that case, the succession would be due to design rather than to chance. But you don’t want to jump to that conclusion too quickly. Perhaps it is just a lucky coincidence. But if so, how would you know? Perhaps the coincidence is so improbable that no one should expect to observe it as happening by chance. In that case, it’s not just unlikely that you would observe this coincidence by chance; it’s unlikely that anyone would. How, then, do you determine whether this succession of identical cars could reasonably have resulted by chance?

Obviously, you will need to know how many opportunities exist to observe this event. It’s estimated that in 2019 there were 1.4 billion motor vehicles on the road worldwide. That would include trucks, but to keep things simple let’s assume all of them are cars. Although these cars will appear on many different types of roads, some with traffic so sparse that ten cars in immediate succession would almost never happen, to say nothing of ten cars of the same recent make and model, let’s give chance every opportunity to succeed by assuming that all these cars are arranged in one giant succession of 1.4 billion cars, bumper to bumper.

But it’s not enough to look at one static arrangement of all these 1.4 billion cars. Cars are in motion and continually rearranging themselves. Let’s therefore assume that the cars completely reshuffle themselves every minute, and that we might have the opportunity to see the succession of ten Malibus at any time across a hundred years. In that case, there would be no more than 74 quadrillion opportunities for ten brand new Chevy Malibus to line up in immediate, uninterrupted succession.
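The “74 quadrillion” figure is straightforward to reproduce: 1.4 billion cars, reshuffled once a minute, for the minutes in a hundred years. A quick sketch (ignoring leap days, which don't affect the order of magnitude):

```python
cars = 1_400_000_000            # estimated motor vehicles worldwide (2019)
minutes = 100 * 365 * 24 * 60   # minutes in a hundred years, ignoring leap days
opportunities = cars * minutes  # one reshuffle per minute, each an "opportunity"
print(f"{opportunities:.2e}")   # 7.36e+16 -- about 74 quadrillion
```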

So, how improbable is this event given these 1.4 billion cars and their repeated reshuffling? To answer this question requires knowing how many makes and models of cars are on the road and their relative proportions (let’s leave aside how different makes are distributed geographically, which is also relevant, but introduces needless complications for the purpose of this illustration). If, per impossibile, all cars in the world were brand new Chevy Malibus, there would be no coincidence to explain. In that case, all 1.4 billion cars would be identical, and getting ten of them in a row would be an event of probability 1 regardless of reshuffling.

But Clearly, Nothing Like That Is the Case

Go to Cars.com, and using its car-locater widget you’ll find 30 popular makes and over 60 “other” makes of vehicles. Under the make of Chevrolet, there are over 80 models (not counting variations of models — there are five such variations under the model Malibu). Such numbers help to assess whether the event in question happened by chance. Clearly, the event is specified in that it answers to the short description “ten new Chevy Malibus in a row.” For the sake of argument, let’s assume that achieving that event by chance is going to be highly improbable given all the other cars on the road and given any reasonable assumptions about their chance distribution.

But there’s more work to do in this example to eliminate chance. No doubt, it would be remarkable to see ten new Chevy Malibus drive past you in immediate, uninterrupted succession. But what if you saw ten new red Chevy Malibus in a row drive past you? That would be even more striking now that they all also have the same color. Or what about simply ten new Chevies in a row? That would be less striking. But note how the description lengths covary with the probabilities: “ten new red Chevy Malibus in a row” has a longer description length than “ten new Chevy Malibus in a row,” but it corresponds to an event of smaller probability than the latter. Conversely, “ten new Chevies in a row” has shorter description length than “ten new Chevy Malibus in a row,” but it corresponds to an event of larger probability than the latter.

What we find in examples like this is a tradeoff between description length and probability of the event described (a tradeoff that specified complexity models). In a chance elimination argument, we want to see short description length combined with small probability (implying a larger value of specified complexity). But typically these play off against each other. “Ten new red Chevy Malibus in a row” corresponds to an event of smaller probability than “ten new Chevy Malibus in a row,” but its description length is slightly longer. Which event seems less readily ascribable to chance (or, we might say, worthier of a design inference)? A quick intuitive assessment suggests that the probability decrease outweighs the increase in description length, and so we’d be more inclined to eliminate chance if we saw ten new red Chevy Malibus in a row as opposed to ten of any color.

The lesson here is that probability and description length are in tension, so that as one goes up the other tends to go down, and that to eliminate chance both must be suitably low. We see this tension by contrasting “ten new Chevy Malibus in a row” with “ten new Chevies in a row,” and even more clearly with simply “ten Chevies in a row.” The latter has a shorter description length but also a much higher probability. Intuitively, it is less worthy of a design inference because the increase in probability so outweighs the decrease in description length. Indeed, ten Chevies of any make and model in a row by chance doesn’t seem farfetched given the sheer number of Chevies on the road, certainly in the United States.

But There’s More

Why focus simply on Chevy Malibus? What if the make and model varied, so that the cars in succession were Honda Accords or Porsche Carreras or whatever? And what if the number of cars in succession varied, so it wasn’t just 10 but also 9 or 20 or whatever? Such questions underscore the different ways of specifying a succession of identical cars. Any such succession would have been salient if you witnessed it. Any such succession would constitute a specification if the description length were short enough. And any such succession could figure into a chance elimination argument if both the description length and the probability were low enough. A full-fledged chance-elimination argument in such circumstances would then factor in all relevant low-probability, low-description-length events, balancing them so that where one is more, the other is less.  

All of this can, as we by now realize, be recast in information-theoretic terms. Thus, a probability decrease corresponds to a Shannon information increase, and a description length increase corresponds to a Kolmogorov information increase. Specified complexity, as their difference, now has the following property (we assume, as turns out to be reasonable, that some fine points from theoretical computer science, such as the Kraft inequality, are approximately applicable): if the specified complexity of an event is greater than or equal to n bits, then the grand event consisting of all events with at least that level of specified complexity has probability less than or equal to 2^(–n). This is a powerful result and it provides a conceptually clean way to use specified complexity to eliminate chance and infer design. 

Essentially, what specified complexity does is consider an archer with a number of arrows in his quiver and a number of targets of varying size on a wall, and asks what is the probability that any one of these arrows will by chance land on one of these targets. The arrows in the quiver correspond to complexity, the targets to specifications. Raising the number 2 to the negative of specified complexity as an exponent then becomes the grand probability that any of these arrows will hit any of these targets by chance. 
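The bound can be verified mechanically on a toy model. In the sketch below (entirely my construction for illustration), outcomes are N-bit strings, and a deliberately simple prefix-free description scheme gives the all-zeros string a 1-bit description and every other string an (N+1)-bit one; the union of all events whose specified complexity meets any threshold n then has probability at most 2^(−n):

```python
from itertools import product

N = 12  # outcomes are 12-bit strings, each with probability 2^-N

def desc_len(x):
    """Toy prefix-free description scheme: the all-zeros string gets the
    1-bit description '0'; every other string x gets '1' + x (N+1 bits)."""
    return 1 if all(b == 0 for b in x) else N + 1

def sc(x):
    """Specified complexity of outcome x: I(x) - K(x) = N - |description|."""
    return N - desc_len(x)

outcomes = list(product([0, 1], repeat=N))
for threshold in range(-2, N):
    hits = [x for x in outcomes if sc(x) >= threshold]
    total_prob = len(hits) * 2.0 ** (-N)
    assert total_prob <= 2.0 ** (-threshold)  # the chance-elimination bound
print("bound verified for all thresholds")
```

In the archer picture, the loop plays the role of checking every target at once: however many specified events there are, their combined probability never exceeds 2 raised to the negative of the specified-complexity threshold.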

Conclusion

Formally, the specified complexity of an event is the difference between its Shannon information and its Kolmogorov information. Informally, the specified complexity of an event is a combination of two properties, namely, that the event has small probability and that it has a description of short length. In the formal approach to specified complexity, we speak of algorithmic specified complexity. In the informal approach, we speak of intuitive specified complexity. But typically it will be clear from context which sense of the term “specified complexity” is intended.

In this series, we’ve defined and motivated algorithmic specified complexity. But we have not provided actual calculations of it. For calculations of algorithmic specified complexity as applied to real-world examples, I refer readers to sections 6.8 and 7.6 in the second edition of The Design Inference. Section 6.8 looks at general examples whereas section 7.6 looks at biological examples. In each of these sections, my co-author Winston Ewert and I examine examples where specified complexity is low, not leading to a design inference, and also where it is high, leading to a design inference.

For instance, in section 6.8 we take the so-called “Mars face,” a naturally occurring structure on Mars that looks like a face, and contrast it with the faces on Mount Rushmore. We argue that the specified complexity of the Mars face is too small to justify a design inference but that the specified complexity of the faces on Mount Rushmore is indeed large enough to justify a design inference.

Similarly, in section 7.6, we take the binding of proteins to ATP, as in the work of Anthony Keefe and Jack Szostak, and contrast it with the formation of protein folds in beta-lactamase, as in the work of Douglas Axe. We argue that the specified complexity of random ATP binding is close to 0. In fact, we calculate a negative value of the specified complexity, namely, –4. On the other hand, for the evolvability of a beta-lactamase fold, we calculate a specified complexity of 215, which corresponds to a probability of 2^(–215), or roughly a probability of 1 in 10^65. 
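The bit-to-probability conversions quoted here are just changes of logarithm base: n bits corresponds to a probability of 2^(−n), whose decimal exponent is −n·log10(2). A quick check of the 215-bit figure:

```python
import math

def prob_exponent(bits):
    """Decimal exponent of 2^(-bits): n bits of specified complexity
    corresponds to a probability of about 10 to this power."""
    return -bits * math.log10(2)

print(round(prob_exponent(215), 1))  # -64.7, i.e. roughly 1 in 10^65
```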

With all these numbers, we estimate a Shannon information and a Kolmogorov information and then calculate a difference. The validity of these estimates and the degree to which they can be refined can be disputed. But the underlying formalism of specified complexity is rock solid. The details of that formalism and its applications go beyond a series titled “Specified Complexity Made Simple.” Those details can all be found in the second edition of The Design Inference.

A theory of everything re: design detection?

 Specified Complexity as a Unified Information Measure


With the publication of the first edition of my book The Design Inference and its sequel No Free Lunch, elucidating the connection between design inferences and information theory became increasingly urgent. That there was a connection was clear. The first edition of The Design Inference sketched, in the epilogue, how the relation between specifications and small probability (complex) events mirrored the transmission of messages along a communication channel from sender to receiver. Moreover, in No Free Lunch, both Shannon and Kolmogorov information were explicitly cited in connection with specified complexity — which is the subject of this series.

But even though specified complexity as characterized back then employed informational ideas, it did not constitute a clearly defined information measure. Specified complexity seemed like a kludge of ideas from logic, statistics, and information. Jay Richards, guest-editing a special issue of Philosophia Christi, asked me to clarify the connection between specified complexity and information theory. In response, I wrote an article titled “Specification: The Pattern That Signifies Intelligence,” which appeared in that journal in 2005.

A Single Measure

In that article, I defined specified complexity as a single measure that combined under one roof all the key elements of the design inference, notably, small probability, specification, probabilistic resources, and universal probability bounds. Essentially, in the measure I articulated there, I attempted to encapsulate the entire design inferential methodology within a single mathematical expression. 

In retrospect, all the key pieces for what is now the fully developed informational account of specified complexity were there in that article. But my treatment of specified complexity there left substantial room for improvement. I used a counting measure to enumerate all the descriptions of a given length or shorter. I then placed this measure under a negative logarithm. This gave the equivalent of Kolmogorov information, suitably generalized to minimal description length. But because my approach was so focused on encapsulating the design-inferential methodology, the roles of Shannon and Kolmogorov information in its definition of specified complexity were muddied. 

My 2005 specified complexity paper fell stillborn from the press, and justly so given its lack of clarity. Eight years later, Winston Ewert, working with Robert Marks and me at the Evolutionary Informatics Lab, independently formulated specified complexity as a unified measure. It was essentially the same measure as in my 2005 article, but Ewert clearly articulated the place of both Shannon and Kolmogorov information in the definition of specified complexity. Ewert, along with Marks and me as co-authors, published this work under the title “Algorithmic Specified Complexity,” and then published subsequent applications of this work (see the Evolutionary Informatics Lab publications page). 

With Ewert’s lead, specified complexity, as an information measure, became the difference between Shannon information and Kolmogorov information. In symbols, the specified complexity SC for an event E was thus defined as SC(E) = I(E) – K(E). The term I(E) in this equation is just, as we saw in my last article, Shannon information, namely, I(E) = –log(P(E)), where P(E) is the probability of E with respect to some underlying relevant chance hypothesis. The term K(E) in this equation, in line with the last article, is a slight generalization of Kolmogorov information, in which for an event E, K(E) assigns the length, in bits, of the shortest description that precisely identifies E. Underlying this generalization of Kolmogorov information is a binary, prefix-free, Turing complete language that maps descriptions from the language to the events they identify. 

Not Merely a Kludge

There’s a lot packed into this last paragraph, so explicating it all is not going to be helpful in an article titled “Specified Complexity Made Simple.” For the details, see Chapter 6 of the second edition of The Design Inference. Still, it’s worth highlighting a few key points to show that SC, so defined, makes good sense as a unified information measure and is not merely a kludge of Shannon and Kolmogorov information. 

What brings Shannon and Kolmogorov information together as a coherent whole in this definition of specified complexity is event-description duality. Events (and the objects and structures they produce) occur in the world. Descriptions of events occur in language. Thus, corresponding to an event E are descriptions D that identify E. For instance, the event of getting a royal flush in the suit of hearts corresponds to the description “royal flush in the suit of hearts.” Such descriptions are, of course, never unique. The same event can always be described in multiple ways. Thus, this event could also be described as “a five-card poker hand with an ace of hearts, a king of hearts, a queen of hearts, a jack of hearts, and a ten of hearts.” Yet this description is quite a bit longer than the other. 

Given event-description duality, it follows that: (1) an event E with a probability P(E) has Shannon information I(E), measured in bits; moreover, (2) given a binary language (one expressed in bits — and all languages can be expressed in bits), for any description D that identifies E, the number of bits making up D, which in the last section we defined as |D|, will be no less than the Kolmogorov information of E (which measures in bits the shortest description that identifies E). Thus, because K(E) ≤ |D|, it follows that SC(E) = I(E) – K(E) ≥ I(E) – |D|. 

The most important takeaway here is that specified complexity makes Shannon information and Kolmogorov information commensurable. In particular, specified complexity takes the bits associated with an event’s probability and subtracts from them the bits associated with the event’s minimum description length. Moreover, in estimating K(E), we then use I(E) – |D| to form a lower bound for specified complexity. It follows that specified complexity comes in degrees and could take on negative values. In practice, however, we’ll say an event exhibits specified complexity if its specified complexity is positive and large (with what it means to be large depending on the relevant probabilistic resources). 
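The inequality SC(E) ≥ I(E) − |D| is what makes the measure usable in practice: exhibiting any one description D of E certifies a lower bound on specified complexity without having to compute K(E) exactly. A minimal sketch (the 8-bits-per-character encoding of descriptions is my illustrative assumption, not a rule from the book):

```python
import math

def shannon_info(prob):
    """I(E) = -log2 P(E), in bits."""
    return -math.log2(prob)

def sc_lower_bound(prob, description):
    """SC(E) = I(E) - K(E) >= I(E) - |D| for any description D of E.
    |D| is estimated here at 8 bits per character (an assumption)."""
    return shannon_info(prob) - 8 * len(description)

# 100 fair coin tosses, all heads: P(E) = 2^-100, described in 9 characters.
print(sc_lower_bound(2.0 ** -100, "all heads"))  # 28.0 bits
```

A shorter description raises the certified bound; a longer one lowers it, possibly below zero, in which case the description certifies nothing.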

The Kraft Inequality

There’s a final fact that makes specified complexity a natural information measure and not just an arbitrary combination of Shannon and Kolmogorov information, and that’s the Kraft inequality. Applying the Kraft inequality to specified complexity depends on the language that maps descriptions to events being prefix-free. Prefix-free languages help to ensure disambiguation, so that one description is not the start of another description. This is not an onerous condition, and even though it does not hold for natural languages, transforming natural languages into prefix-free languages leads to negligible increases in description length (again, see Chapter 6 of the second edition of The Design Inference). 

What the Kraft inequality does for the specified complexity of an event E is guarantee that all events having the same or greater specified complexity, when considered jointly as one grand union, nonetheless have probability less than or equal to 2 raised to the negative power of the specified complexity. In other words, the probability of the union of all events F with specified complexity no less than that of E (i.e., SC(F) ≥ SC(E)), will have probability less than or equal to 2^(–SC(E)). This result, so stated, may not seem to belong in a series of articles attempting to make specified complexity simple. But it is a big mathematical result, and it connects specified complexity to a probability bound that’s crucial for drawing design inferences. To illustrate how this all works, let’s turn next to an example of cars driving along a road.
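The Kraft inequality itself is easy to check computationally: for any prefix-free set of binary descriptions, the sum of 2^(−|D|) over the set is at most 1. A small sketch with an arbitrary illustrative code:

```python
def is_prefix_free(codes):
    """True if no codeword is a proper prefix of another."""
    return not any(a != b and b.startswith(a) for a in codes for b in codes)

def kraft_sum(codes):
    """Sum of 2^(-|D|) over all codewords; at most 1 for prefix-free codes."""
    return sum(2.0 ** (-len(c)) for c in codes)

codes = ["0", "10", "110", "111"]  # an illustrative prefix-free code
assert is_prefix_free(codes)
print(kraft_sum(codes))            # 0.5 + 0.25 + 0.125 + 0.125 = 1.0
```

It is this "total weight at most 1" property that lets the weights 2^(−|D|) behave like probabilities, which is what turns specified complexity into the chance-elimination bound described above.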

We know it when we see it?

 Intuitive Specified Complexity: A User-Friendly Account


Even though this series is titled “Specified Complexity Made Simple,” there’s a limit to how much the concept of specified complexity may be simplified before it can no longer be adequately defined or explained. Accordingly, specified complexity, even when made simple, will still require the introduction of some basic mathematics, such as exponents and logarithms, as well as an informal discussion of information theory, especially Shannon and Kolmogorov information. I’ll get to that in the subsequent posts. 

At this early stage in the discussion, however, it seems wise to lay out specified complexity in a convenient non-technical way. That way, readers lacking mathematical and technical facility will still be able to grasp the gist of specified complexity. Here, I’ll present an intuitively accessible account of specified complexity. Just as all English speakers are familiar with the concept of prose even if they’ve never thought about how it differs from poetry, so too we are all familiar with specified complexity even if we haven’t carefully defined it or provided a precise formal mathematical account of it. 

In this post I’ll present a user-friendly account of specified complexity by means of intuitively compelling examples. Even though non-technical readers may be inclined to skip the rest of this series, I would nonetheless encourage all readers to dip into the subsequent posts, if only to persuade themselves that specified complexity has a sound rigorous basis to back up its underlying intuition. 

To Get the Ball Rolling…

Let’s consider an example by YouTube personality Dave Farina, known popularly as “Professor Dave.” In arguing against the use of small probability arguments to challenge Darwinian evolutionary theory, Farina offers the following example:

Let’s say 10 people are having a get-together, and they are curious as to what everyone’s birthday is. They go down the line. One person says June 13th, another says November 21st, and so forth. Each of them have a 1 in 365 chance of having that particular birthday. So, what is the probability that those 10 people in that room would have those 10 birthdays? Well, it’s 1 in 365 to the 10th power, or 1 in 4.2 times 10 to the 25, which is 42 trillion trillion. The odds are unthinkable, and yet there they are sitting in that room. So how can this be? Well, everyone has to have a birthday.

Farina’s use of the term “unthinkable” brings to mind Vizzini in The Princess Bride. Vizzini keeps uttering the word “inconceivable” in reaction to a man in black (Westley) steadily gaining ground on him and his henchmen. Finally, his fellow henchman Inigo Montoya remarks, “You keep using that word — I do not think it means what you think it means.”

Similarly, in contrast to Farina, an improbability of 1 in 42 trillion trillion is in fact quite thinkable. Right now you can do even better than this level of improbability. Get out a fair coin and toss it 100 times. That’ll take you a few minutes. You’ll witness an event unique in the history of coin tossing and one having a probability of 1 in 10 to the 30, or 1 in a million trillion trillion. 

The reason Farina’s improbability is quite thinkable is that the event to which it is tied is unspecified. As he puts it, “One person says June 13th, another says November 21st, and so forth.” The “and so forth” here is a giveaway that the event is unspecified. 

But now consider a variant of Farina’s example: Imagine that each of his ten people confirmed that his or her birthday was January 1. The probability would in this case again be 1 in 42 trillion trillion. But what’s different now is that the event is specified. How is it specified? It is specified in virtue of having a very short description, namely, “Everyone here was born New Year’s Day.” 
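The numbers in both versions of the example are easy to verify: any particular list of ten birthdays, specified or not, has probability 365^(−10). What the January 1 variant changes is the description length, not the probability. A quick check (ignoring leap years, as in the original example):

```python
odds = 365 ** 10           # number of equally likely lists of 10 birthdays
print(f"1 in {odds:.2e}")  # 1 in 4.20e+25 -- "42 trillion trillion"

p = 365.0 ** -10           # probability of any one particular list
print(f"{p:.1e}")          # ~2.4e-26 -- the same for "all born January 1"
```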

Nothing Surprising Here

The complexity in specified complexity refers to probability: the greater the complexity, the smaller the probability. There is a precise information-theoretic basis for this connection between probability and complexity that we’ll examine in the next post. Accordingly, because the joint probability of any ten birthdays is quite low, their complexity will be quite high. 

For things to get interesting with birthdays, complexity needs to be combined with specification. A specification is a salient pattern that we should not expect a highly complex event to match simply by chance. Clearly, a large group of people that all share the same birthday did not come together by chance. But what exactly is it that makes a pattern salient so that, in the presence of complexity, it becomes an instance of specified complexity and thereby defeats chance? 

That’s the whole point of specified complexity. Sheer complexity, as Farina’s example shows, cannot defeat chance. So too, the absence of complexity cannot defeat chance. For instance, if we learn that a single individual has a birthday on January 1, we wouldn’t regard anything as amiss or afoul. That event is simple, not complex, in the sense of probability. Leaving aside leap years and seasonal effects on birth rates, 1 out of 365 people will on average have a birthday on January 1. With a worldwide population of 8 billion people, many people will have that birthday. 

Not by Chance

But a group of exactly 10 people all in the same room all having a birthday of January 1 is a different matter. We would not ascribe such a coincidence to chance. But why? Because the event is not just complex but also specified. And what makes a complex event also specified — or conforming to a specification — is that it has a short description. In fact, we define specifications as patterns with short descriptions.

Such a definition may seem counterintuitive, but it actually makes good sense of how we eliminate chance in practice. The fact is, any event (and by extension any object or structure produced by an event) is describable if we allow ourselves a long enough description. Any event, however improbable, can therefore be described. But most improbable events can’t be described simply. Improbable events with simple descriptions draw our attention and prod us to look for explanations other than chance.

Take Mount Rushmore. It could be described in detail as follows: for each cubic micrometer in a large cube that encloses the entire monument, register whether it contains rock or is empty of rock (treating partially filled cubic micrometers, let us stipulate, as empty). Mount Rushmore can be enclosed in a cube of under 50,000 cubic meters. Moreover, each cubic meter contains a million trillion cubic micrometers. Accordingly, 50 billion trillion filled-or-empty cells could describe Mount Rushmore in detail. Thinking of each filled-or-empty cell as a bit then yields 50 billion trillion bits of information. That’s more information than contained in the entire World Wide Web (there are currently 2 billion websites globally). 
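The arithmetic behind that bit count can be verified in a few lines, using the figures above (a 50,000-cubic-meter enclosing volume and one filled-or-empty bit per cubic micrometer):

```python
# One filled-or-empty bit per cubic micrometer of the enclosing cube.
cube_m3 = 50_000               # enclosing volume in cubic meters (figure from the text)
um3_per_m3 = (10 ** 6) ** 3    # 1 m = 10^6 micrometers, so 10^18 cubic micrometers per m^3
bits = cube_m3 * um3_per_m3
print(f"{bits:.0e} bits")      # 5e+22 bits, i.e. 50 billion trillion
```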

But of course, nobody attempts to describe Mount Rushmore that way. Instead, we describe it succinctly as “a giant rock formation that depicts the U.S. Presidents George Washington, Thomas Jefferson, Abraham Lincoln, and Theodore Roosevelt.” That’s a short description. At the same time, any rock formation the size of Mount Rushmore will be highly improbable or complex. Mount Rushmore is therefore both complex and specified. That’s why, even if we knew nothing about the history of Mount Rushmore’s construction, we would refuse to attribute it to the forces of chance (such as wind and erosion) and instead attribute it to design.

Take the Game of Poker

Consider a few more examples in this vein. There are 2,598,960 distinct possible poker hands, and so the probability of any poker hand is 1/2,598,960. Consider now two short descriptions, namely, “royal flush” and “single pair.” These descriptions have roughly the same description length. Yet there are only 4 ways of getting a royal flush and 1,098,240 ways of getting a single pair. This means the probability of getting a royal flush is 4/2,598,960 = .00000154 but the probability of getting a single pair is 1,098,240/2,598,960 = .423. A royal flush is therefore much more improbable than a single pair.
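These counts follow from elementary combinatorics, and a short sketch with Python's `math.comb` reproduces them:

```python
from math import comb

total = comb(52, 5)        # all five-card hands: 2,598,960
royal_flush = 4            # exactly one royal flush per suit

# Single pair: choose the pair's rank and its two suits, then three
# further distinct ranks and one suit for each of those three cards.
single_pair = 13 * comb(4, 2) * comb(12, 3) * 4 ** 3

print(total, single_pair)             # 2598960 1098240
print(royal_flush / total)            # ~1.54e-06
print(round(single_pair / total, 3))  # 0.423
```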

Suppose now that you are playing a game of poker and you come across these two hands, namely, a royal flush and a single pair. Which are you more apt to attribute to chance? Which are you more apt to attribute to cheating, and therefore to design? Clearly, a single pair would, by itself, not cause you to question chance. It is specified in virtue of its short description. But because it is highly probable, and therefore not complex, it would not count as an instance of specified complexity. 

Witnessing a royal flush, however, would elicit suspicion, if not an outright accusation of cheating (and therefore of design). Of course, given the sheer amount of poker played throughout the world, royal flushes will now and then appear by chance. But what raises suspicion that a given instance of a royal flush may not be the result of chance is its short description (a property it shares with “single pair”) combined with its complexity/improbability (a property it does not share with “single pair”). 

Let’s consider one further example, which seems to have become a favorite among readers of the recently released second edition of The Design Inference. In the chapter on specification, my co-author Winston Ewert and I consider a famous scene in the film The Empire Strikes Back, which we then contrast with a similar scene from another film that parodies it. Quoting from the chapter:

Darth Vader tells Luke Skywalker, “No, I am your father,” revealing himself to be Luke’s father. This is a short description of their relationship, and the relationship is surprising, at least in part because the relationship can be so briefly described. In contrast, consider the following line uttered by Dark Helmet to Lone Starr in Spaceballs, the Mel Brooks parody of Star Wars: “I am your father’s brother’s nephew’s cousin’s former room­mate.” The point of the joke is that the relationship is so compli­cated and contrived, and requires such a long description, that it evokes no suspicion and calls for no special explanation. With everybody on the planet connected by no more than “six degrees of separation,” some long description like this is bound to identify anyone.

In a universe of countless people, Darth Vader meeting Luke Skywalker is highly improbable or complex. Moreover, their relation of father to son, by being briefly described, is also specified. Their meeting therefore exhibits specified complexity and cannot be ascribed to chance. Dark Helmet meeting Lone Starr may likewise be highly improbable or complex. But given the convoluted description of their past relationship, their meeting represents an instance of unspecified complexity. If their meeting is due to design, it is for reasons other than their past relationship.

How Short Is Short Enough?

Before we move to a more formal treatment of specified complexity, we would do well to ask how short is short enough for a description to count as a specification. How short must a description be so that, combined with complexity, it produces specified complexity? In the formal treatment of specified complexity, complexity and description length are both converted to bits, and specified complexity is then defined as the difference of bits (the bits denoting complexity minus the bits denoting the specification). 

When specified complexity is applied informally, however, we may calculate a probability (or associated complexity) but we usually don’t calculate a description length. Rather, as with the Star Wars/Spaceballs example, we make an intuitive judgment that one description is short and natural, the other long and contrived. Such intuitive judgments have, as we will see, a formal underpinning, but in practice we let ourselves be guided by intuitive specified complexity, treating it as a convincing way to distinguish merely improbable events from those that require further scrutiny.  

There is information and then there is Information?

 Shannon and Kolmogorov Information


The first edition of my book The Design Inference as well as its sequel, No Free Lunch, set the stage for defining a precise information-theoretic measure of specified complexity — which is the subject of this series. There was, however, still more work to be done to clarify the concept. In both these books, specified complexity was treated as a combination of improbability or complexity on the one hand and specification on the other. 

As presented back then, it was an oil-and-vinegar combination, with complexity and specification treated as two different types of things exhibiting no clear commonality. Neither book therefore formulated specified complexity as a unified information measure. Still, the key ideas for such a measure were in those earlier books. Here, I review those key information-theoretic ideas. In the next section, I’ll join them into a unified whole.

Let’s Start with Complexity

As noted earlier, there’s a deep connection between probability and complexity. This connection is made clear in Shannon’s theory of information. In this theory, probabilities are converted to bits. To see how this works, consider tossing a coin 100 times, which yields an event of probability 1 in 2^100 (the caret symbol here denotes exponentiation). But that number also corresponds to 100 bits of information since it takes 100 bits to characterize any sequence of 100 coin tosses (think of 1 standing for heads and 0 for tails). 

In general, any probability p corresponds to –log(p) bits of information, where the logarithm here and elsewhere in this article is to the base 2 (as needed to convert probabilities to bits). Think of a logarithm as an exponent: it’s the exponent to which you need to raise the base (here always 2) in order to get the number to which the logarithmic function is applied. Thus, for instance, a probability of p = 1/10 corresponds to an information measure of –log(1/10) ≈ 3.322 bits (or equivalently, 2^(–3.322) ≈ 1/10). Such fractional bits allow for a precise correspondence between probability and information measures.

The complexity in specified complexity is therefore Shannon information. Claude Shannon (1916–2001) introduced this idea of information in the 1940s to understand signal transmissions (mainly of bits, but also for other character sequences) across communication channels. The longer the sequence of bits transmitted, the greater the information and therefore its complexity. 

Because of noise along any communication channel, the greater the complexity of a signal, the greater the chance of its distortion and thus the greater the need for suitable coding and error correction in transmitting the signal. So the complexity of the bit string being transmitted became an important idea within Shannon’s theory. 

Shannon’s information measure is readily extended to any event E with a probability P(E). We then define the Shannon information of E as –log(P(E)) = I(E). Note that the minus sign is there to ensure that as the probability of E goes down, the information associated with E goes up. This is as it should be. Information is invariably associated with the narrowing of possibilities. The more those possibilities are narrowed, the more the probabilities associated with those possibilities decrease, and correspondingly the more the information associated with those narrowing possibilities increases. 

For instance, consider a sequence of ten tosses of a fair coin and consider two events, E and F. Let E denote the event where the first five of these ten tosses all land heads but where we don’t know the remaining tosses. Let F denote the event where all ten tosses land heads. Clearly, F narrows down the range of possibilities for these ten tosses more than E does. Because E is only based on the first five tosses, its probability is P(E) = 2^(–5) = 1/(2^5) = 1/32. On the other hand, because F is based on all ten tosses, its probability is P(F) = 2^(–10) = 1/(2^10) = 1/1,024. In this case, the Shannon information associated with E and F is respectively I(E) = 5 bits and I(F) = 10 bits. 
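The conversion from probabilities to bits is a one-liner; this sketch with `math.log2` reproduces the numbers just computed:

```python
from math import log2

def shannon_info(p: float) -> float:
    """Shannon information of an event with probability p, in bits."""
    return -log2(p)

print(shannon_info(1 / 10))    # ~3.322 bits (the earlier example)
print(shannon_info(1 / 32))    # 5.0 bits: first five of ten tosses all heads
print(shannon_info(1 / 1024))  # 10.0 bits: all ten tosses heads
```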

We Also Need Kolmogorov Complexity

Shannon information, however, is not enough to understand specified complexity. For that, we also need Kolmogorov information, or what is also called Kolmogorov complexity. Andrei Kolmogorov (1903–1987) was the greatest probabilist of the 20th century. In the 1960s he tried to make sense of what it means for a sequence of numbers to be random. To keep things simple, and without loss of generality, we’ll focus on sequences of bits (since any numbers or characters can be represented by combinations of bits). Note that we made the same simplifying assumption for Shannon information.

The problem Kolmogorov faced was that any sequence of bits treated as the result of tossing a fair coin was equally probable. For instance, any sequence of 100 coin tosses would have probability 1/(2^100), or 100 bits of Shannon information. And yet there seemed to Kolmogorov a vast difference between the following two sequences of 100 coin tosses (letting 0 denote tails and 1 denote heads):

0000000000000000000000000
0000000000000000000000000
0000000000000000000000000
0000000000000000000000000

and

1001101111101100100010011
0001010001010010101110001
0101100000101011000100110
1100110100011000000110001

The first just repeats the same coin toss 100 times. It appears anything but random. The second, on the other hand, exhibits no salient pattern and so appears random (I got it just now from an online random bit generator). But what do we mean by random here? Is it that the one sequence is the sort we should expect to see from coin tossing but the other isn’t? But in that case, probabilities tell us nothing about how to distinguish the two sequences because they both have the same small probability of occurring. 

Ideas in the Air

Kolmogorov’s brilliant stroke was to understand the randomness of these sequences not probabilistically but computationally. Interestingly, the ideas animating Kolmogorov were in the air at that time in the mid 1960s. Thus, both Ray Solomonoff and Gregory Chaitin (then only a teenager) also came up with the same idea. Perhaps unfairly, Kolmogorov gets the lion’s share of the credit for characterizing randomness computationally. Most information-theory books (see, for instance, Cover and Thomas’s Elements of Information Theory), in discussing this approach to randomness, will therefore focus on Kolmogorov and put it under what is called Algorithmic Information Theory (AIT). 

Briefly, Kolmogorov’s approach to randomness is to say that a sequence of bits is random to the degree that it has no short computer program that generates it. Thus, with the first sequence above, it is non-random since it has a very short program that generates it, such as a program that simply says “repeat ‘0’ 100 times.” On the other hand, there is no short program (so far as we can tell) that generates the second sequence. 

It is a combinatorial fact (i.e., a fact about the mathematics of counting or enumerating possibilities) that the vast majority of bit sequences cannot be characterized by any program shorter than the sequence itself. Obviously, any sequence can be characterized by a program that simply incorporates the entire sequence and then simply regurgitates it. But such a program fails to compress the sequence. The non-random sequences, by having programs shorter than the sequences themselves, are thus those that are compressible. The first of the sequences above is compressible. The second, for all we know, isn’t.
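True Kolmogorov complexity is non-computable, but an off-the-shelf compressor gives a rough feel for the distinction. The sketch below uses Python's `zlib`, which is not Kolmogorov complexity but, like it, exploits patterns: the all-zeros string compresses drastically, while the pseudorandom one compresses far less.

```python
import random
import zlib

repetitive = "0" * 100
random.seed(0)  # any seed will do; we just want a patternless-looking string
scrambled = "".join(random.choice("01") for _ in range(100))

for label, s in [("repetitive", repetitive), ("scrambled", scrambled)]:
    packed = zlib.compress(s.encode(), 9)
    print(f"{label}: {len(s)} chars -> {len(packed)} bytes compressed")
```

Note that zlib's output carries fixed header overhead, so the compressed byte counts are only a crude proxy for program length; the point is the relative gap between the two strings.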

Kolmogorov’s information (also known as Kolmogorov complexity) is a computational theory because it focuses on identifying the shortest program that generates a given bit string. Yet there is an irony here: it is rarely possible to say with certainty that a given bit string is truly random in the sense of having no generating program shorter than the string itself. From combinatorics, with its mathematical counting principles, we know that the vast majority of bit sequences must be random in Kolmogorov’s sense. That’s because the number of short programs is very limited and can generate only very few longer sequences. Most longer sequences will require longer programs. 

Our Common Experience

But if for an arbitrary bit sequence D we define K(D) as the length of the shortest program that generates D, it turns out that there is no computer program that calculates K(D). Simply put, the function K is non-computable. This fact from theoretical computer science matches up with our common experience that something may seem random for a time, and yet we can never be sure that it is random because we might discover a pattern clearly showing that the thing in fact isn’t random (think of an illusion that looks like a “random” inkblot only to reveal a human face on closer inspection). 

Yet even though K is non-computable, in practice it is a useful measure, especially for understanding non-randomness. Because of its non-computability, K doesn’t help us to identify particular non-compressible sequences, these being the random sequences. Even with K as a well-defined mathematical function, we can’t in most cases determine precise values for it. Nevertheless, K does help us with the compressible sequences, in which case we may be able to estimate it even if we can’t exactly calculate it. 

What typically happens in such cases is that we find a salient pattern in a sequence, which then enables us to show that it is compressible. To that end, we need a measure of the length of bit sequences as such. Thus, for any bit sequence D, we define |D| as its length (total number of bits). Because any sequence can be defined in terms of itself, |D| forms an upper bound on Kolmogorov complexity. Suppose now that through insight or ingenuity, we find a program that substantially compresses D. The length of that program, call it n, will then be considerably less than |D| — in other words, n < |D|. 

Although this program length n will be much shorter than D, it’s typically not possible to show that this program of length n is the very shortest program that generates D. But that’s okay. Given such a program of length n, we know that K(D) cannot be greater than n because K(D) measures the very shortest such program. Thus, by finding some short program of length n, we’ll know that K(D) ≤ n < |D|. In practice, it’s enough to come up with a short program of length n that’s substantially less than |D|. The number n will then form an upper bound for K(D). In practice, we use n as an estimate for K(D). Such an estimate, as we’ll see, ends up in applications being a conservative estimate of Kolmogorov complexity.
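That conservative-upper-bound logic is easy to mimic in code: the length of any compressed encoding of D (plus a fixed-size decompressor, ignored here as a constant) bounds K(D) from above. A sketch, again using zlib as a stand-in compressor:

```python
import zlib

def k_upper_bound(data: bytes) -> int:
    """A conservative upper bound on K(data): the raw length |D| or the
    zlib-compressed length, whichever is smaller (the fixed-size
    decompressor is ignored as a constant)."""
    return min(len(data), len(zlib.compress(data, 9)))

d = b"01" * 500                        # a highly patterned 1,000-byte string
print(k_upper_bound(d), "<=", len(d))  # the bound falls far below |D| = 1000
```

We never learn K(d) itself this way; we only certify that it cannot exceed the bound, which is exactly the conservative estimate described above.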