Unifying Specified Complexity: Rediscovering Ancient Technology
Evolution News | @DiscoveryCSC
Editor’s note: We have been reviewing and explaining a new article in the journal BIO-Complexity, “A Unified Model of Complex Specified Information,” by George D. Montañez. For earlier posts, see:
- “BIO-Complexity Article Offers an Objective Method for Weighing Darwinian Explanations”
- “Measuring Surprise — A Frontier of Design Theory”
Specified complexity, the property of being both unlikely and functionally specified, was introduced into the origins debate two decades ago by William Dembski by way of his book, The Design Inference. In it, he developed a theory of design detection based on observing objects that were both unlikely and matched an independently given pattern, called a specification. Dembski continued to refine his vision of specified complexity, introducing variations of his model in subsequent publications (Dembski 2001, 2002, 2005). Dembski’s independent work in specified complexity culminated with a semiotic specified complexity model (Dembski 2005), where functional specificity was measured by how succinctly a symbol-using agent could describe an object in the context of the linguistic patterns available to the agent. Objects that were complex yet could be simply described resulted in high specified complexity values.
Although Dembski’s work on specified complexity became the most widely known, bioinformatics specialist Aleksandar Milosavljević appears to have developed the first fully mathematical specified complexity model with his algorithmic significance method (Milosavljević 1993, 1995). Milosavljević presented his work in the early 1990s, which, by tech standards, is ancient times. His specified complexity model used algorithmic information theory to test independence between DNA sequences, based on two quantities: the improbability of encountering a sequence under some probability distribution, and the length of the sequence’s compressed encoding relative to a second sequence. A similar method of measuring specified complexity was later independently rediscovered (as great ideas often are) by Ewert, Marks, and Dembski with their algorithmic specified complexity model (Ewert, Dembski, and Marks II 2012, 2015).
Given Milosavljević’s early work with algorithmic significance, mathematical specified complexity models have successfully been used in fields outside of intelligent design for a quarter of a century. A new paper, published in the open-access journal BIO-Complexity, aims to push forward the development of specified complexity methods by developing a detailed mathematical theory of complex specified information.
Unified Models of Specified Complexity
In “A Unified Model of Complex Specified Information,” George D. Montañez introduces a new framework that brings together various specified complexity models by uncovering a shared mathematical identity between them. This shared identity, called the common form, consists of three main components, combined into what is called a kardis function. The components are:
a probability term, p(x),
a specification term, ν(x), and
a scaling constant, r.
For an object x, the first of these gives a sense of how likely the object is to be generated by some probabilistic process modeled by p. When this value is low, the object is not one that is often generated by the process. The specification term, ν(x), captures to what degree x conforms to an independently given specification, modeled as a nonnegative function over the (typically restricted) space of possible objects. When this value is large, the object is considered highly specified. Lastly, the scaling constant r (also called the “replicational resources”) can be interpreted as a normalization factor for the specification values (rescaling the values to some predefined range) or as the number of “attempts” the probabilistic process is given to generate the object in question. (The paper discusses in detail both interpretations of the scaling constant.) Given these components, the kardis function κ(x) is defined as
κ(x) = r [p(x) / ν(x)].
Taking the negative log, base-2, of κ(x) defines the common form for specified complexity models.
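As a minimal sketch of the definitions above, the kardis function and the common form can be written in a few lines of Python. The function names and the numeric values are illustrative assumptions, not taken from the paper; rejecting ν(x) = 0 is likewise a choice made for this sketch.

```python
import math


def kardis(p_x: float, nu_x: float, r: float) -> float:
    """kappa(x) = r * p(x) / nu(x), as defined in the text."""
    if nu_x <= 0:
        # Sketch-level choice: the ratio is undefined when nu(x) is zero.
        raise ValueError("nu(x) must be positive for the ratio to be defined")
    return r * p_x / nu_x


def common_form_sc(p_x: float, nu_x: float, r: float) -> float:
    """Common form: SC(x) = -log2 kappa(x), measured in bits."""
    return -math.log2(kardis(p_x, nu_x, r))


# Hypothetical values: an object with probability 2^-40, specification
# value 1, and 1024 "attempts" granted to the probabilistic process.
sc = common_form_sc(p_x=2**-40, nu_x=1.0, r=1024)
print(f"{sc:.1f} bits")
```

Lowering p(x) or raising ν(x) increases the value, while a larger r decreases it, which matches the reading of r as the number of attempts the process is allowed.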
Common Form Models
The paper presents Dembski’s semiotic specified complexity and Ewert et al.’s algorithmic specified complexity as common form models, mapping the parts of each model to kardis components. This mapping is done for additional specified complexity models, as well.
Dembski’s semiotic model contains three core components (a probability term P(T|H), specification term φ_S(T), and scaling constant 10^120), which can be mapped to kardis components as p(x) = P(T|H), ν(x) = φ_S(T)^(-1), and r = 10^120. Dembski defines his specified complexity as
χ = -log2[10^120 · φ_S(T) · P(T|H)] = -log2 κ(x),
which we see is a common form model with x = T.
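The equivalence can be checked numerically. The sketch below works in log space, since 10^120 multiplied by a very small probability is awkward to handle directly; the values chosen for P(T|H) and φ_S(T) are hypothetical, and the function names are assumptions made for illustration.

```python
import math

LOG2_R = 120 * math.log2(10)  # log2 of the scaling constant 10^120


def semiotic_chi(log2_p: float, phi: float) -> float:
    """chi = -log2(10^120 * phi_S(T) * P(T|H)), computed in log space."""
    return -(LOG2_R + math.log2(phi) + log2_p)


def common_form_sc_log(log2_p: float, log2_nu: float, log2_r: float) -> float:
    """-log2(r * p(x) / nu(x)), computed in log space."""
    return -(log2_r + log2_p - log2_nu)


log2_p = -500.0  # hypothetical: P(T|H) = 2^-500
phi = 1e5        # hypothetical: phi_S(T) = 100,000 descriptions

chi = semiotic_chi(log2_p, phi)
# Kardis mapping from the text: nu(x) = 1 / phi_S(T), r = 10^120.
sc = common_form_sc_log(log2_p, log2_nu=-math.log2(phi), log2_r=LOG2_R)

print(f"chi = {chi:.2f} bits, common form = {sc:.2f} bits")
```

Both routes give the same number because dividing by ν(x) = 1/φ_S(T) is the same as multiplying by φ_S(T).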
Similarly, Ewert et al.’s algorithmic specified complexity contains a probability term p(x), a specification term ν(x) = 2^(-K(x|c)), and an implicit scaling term r = 1, making it a common form model.
Lastly, Milosavljević’s algorithmic significance model is also of common form, with a kardis containing probability term p_0(x), specification term 2^(-I_A(x|s)), and implicit scaling constant r = 1. Through this mapping, the connection to algorithmic specified complexity becomes clear, and the model’s status as a form of specified complexity becomes indisputable.
Canonical Specified Complexity
For any common form model, adding the constraint that r be at least as large as the sum (or integral) of specification values over the entire domain of ν yields a canonical specified complexity model. The paper primarily works with canonical models, proving theorems related to them, although some results are also given for simple common form models. Tweaks to some common form models (such as Dembski’s semiotic model and Hazen et al.’s functional information model) allow them to become canonical model variants, to which the theorems derived in the paper apply. Canonical models represent a subset of common form models, and have several interesting properties. These properties include the scarcity of large values: under any fixed random or semi-random process, the probability of observing large values is strictly bounded (and exponentially small in the size of the value observed). The paper gives further detail, for those interested.
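The scarcity property can be checked by brute force on a toy domain. The sketch below uses a four-object space with made-up values for p and ν, verifies the canonical constraint, and computes the exact probability of observing a value of at least b bits. The bound it compares against, 2^(-b), follows from the constraint by a short summation argument (every x with SC(x) ≥ b has p(x) ≤ 2^(-b) ν(x)/r, and the ν(x)/r terms sum to at most one); the paper should be consulted for the precise theorem statements.

```python
import math


def is_canonical(nu: dict, r: float) -> bool:
    """Canonical constraint: r is at least the sum of nu over the domain."""
    return r >= sum(nu.values())


def tail_probability(p: dict, nu: dict, r: float, b: float) -> float:
    """Exact Pr[SC(X) >= b] when X is drawn from p over a finite domain."""
    total = 0.0
    for x, p_x in p.items():
        if p_x == 0 or nu[x] == 0:
            # p(x) = 0: x is never drawn. nu(x) = 0: SC(x) is -infinity.
            continue
        sc = -math.log2(r * p_x / nu[x])
        if sc >= b:
            total += p_x
    return total


# Hypothetical toy model (all values are illustrative assumptions).
p = {"a": 0.5, "b": 0.25, "c": 0.1875, "d": 0.0625}
nu = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 2.0}
r = 8.0

print("canonical:", is_canonical(nu, r))
for b in range(0, 4):
    print(b, tail_probability(p, nu, r, b), 2.0**-b)
```

Changing p to any other distribution on the same four objects leaves the bound intact, since the argument never uses anything about p beyond its being a probability distribution.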
The Power of a Good Abstraction
What does it mean for existing specified complexity models to all share a single underlying form? First, it allows us to reason about many specified complexity models simultaneously and prove theorems for them collectively, and to better understand each model by relating it to the others. Second, it hints strongly that any attempt to solve the problem of measuring anomalous events will converge on a similar solution, increasing our confidence that the common form represents the solution to the problem. Third, we can build from a simplified framework, clearing away incidental details to focus on the behavior of specified complexity models at their core.
Finally, having discovered the common form parameterization, we can establish that Milosavljević’s algorithmic significance model is not just like a specified complexity model, but is a specified complexity model, definitively refuting claims that specified complexity methods have no practical value, are unworkable, or have not been used in applied fields like machine learning or bioinformatics. We have now come to discover that they’ve been in use for at least 25 years. Milosavljević couldn’t access the vocabulary of common forms and canonical models, so what he saw as the difference between a surprisal term and its compressed relative encoding, we now more clearly see as a compression-based canonical specified complexity model.
Symbols in Steel and Stone
Returning to your winter retreat, mentioned in the last post, the symbols you discovered remain on your mind. A portion of the symbols, those on the metal pieces, you’ve been able to map to numbers coded in a base-7 number system. Your conviction is strengthened once you realize the numbers include a sequence of prime numbers, spanning from 2 to 31. You imagine that some ancient mathematician etched the symbols into the metal, someone who either had knowledge of the primes or did not. If they did not, there would be some probability p(x) that they produced the sequence x without intention.
Given that the sequence also matches an independent pattern (primes), you ask yourself, how many sequences using the first thirty-one positive integers would match any recognizable numeric pattern, of which the primes are just one example? The On-Line Encyclopedia of Integer Sequences (OEIS) has an estimated 300,000 sequences which could serve as a pattern (to someone more knowledgeable than yourself). You imagine that perhaps this number underestimates the number of interesting patterns, so you double it to be safe, and assume 600,000 possible matchable patterns, of which the prime sequence is just one instance.
You’ve spent time studying the manuscript on specified complexity you brought along with you, and are eager to understand your discovery in light of the framework it presents, that of canonical specified complexity. You let the space of possible sequences be all the 31^11 sequences of length 11 using the first 31 positive integers. You let ν(x) equal one whenever sequence x exists in the OEIS repository (representing an “interesting” number pattern) and zero otherwise, and you set r to 600,000, an upper bound on the number of interesting patterns that possible sequences could match. You know these are rough estimates which will undoubtedly need to be revised in the future, but you’d like to get a notion for just how anomalous the sequence you’ve discovered actually is. Your instinct tells you “very,” but mapping your quantities to a canonical model kardis gives you the first step along a more objective path, and you turn to the paper to see what you can infer about the origin of your sequence based on your model. You have much work ahead, but after many more hours of study and reflection, the darkening night compels you to set aside your workbook and get some rest.
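The quantities in this scenario can be plugged into the kardis as a rough back-of-the-envelope sketch. One ingredient is not given in the story and is assumed here purely for illustration: the chance hypothesis p(x) is taken to be uniform over all 31^11 sequences. A different choice of p would change the result.

```python
import math

ALPHABET_SIZE = 31    # the first 31 positive integers
R = 600_000           # assumed count of matchable "interesting" patterns

# The primes from 2 to 31, found by trial division.
primes = [n for n in range(2, ALPHABET_SIZE + 1)
          if all(n % d for d in range(2, int(n**0.5) + 1))]
sequence_length = len(primes)

# ASSUMPTION (not from the story): p(x) is uniform over 31^11 sequences.
log2_p = -sequence_length * math.log2(ALPHABET_SIZE)

nu_x = 1.0  # the prime sequence is listed in the OEIS, so nu(x) = 1

# SC(x) = -log2(r * p(x) / nu(x)), computed in log space.
sc = -(math.log2(R) + log2_p - math.log2(nu_x))

print(primes)
print(f"sequence length: {sequence_length}")
print(f"specified complexity: {sc:.1f} bits")
```

Under these assumptions the value comes out at a little over 35 bits. Whether that is large enough to reject the chance hypothesis depends on the theorems for canonical models, which is where the paper picks up.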
Bibliography
Dembski, William A. 2001. “Detecting Design by Eliminating Chance: A Response to Robin Collins.” Christian Scholar’s Review 30 (3): 343–58.
———. 2002. No Free Lunch: Why Specified Complexity Cannot Be Purchased Without Intelligence. Lanham: Rowman & Littlefield.
———. 2005. “Specification: The Pattern That Signifies Intelligence.” Philosophia Christi 7 (2): 299–343. https://doi.org/10.5840/pc20057230.
Ewert, Winston, William A. Dembski, and Robert J. Marks II. 2012. “Algorithmic Specified Complexity.” Engineering and Metaphysics. https://doi.org/10.33014/isbn.0975283863.7.
———. 2015. “Algorithmic Specified Complexity in the Game of Life.” IEEE Transactions on Systems, Man, and Cybernetics: Systems 45 (4): 584–94. https://doi.org/10.1109/TSMC.2014.2331917.
Milosavljević, Aleksandar. 1993. “Discovering Sequence Similarity by the Algorithmic Significance Method.” Proc Int Conf Intell Syst Mol Biol 1: 284–91.
———. 1995. “Discovering Dependencies via Algorithmic Mutual Information: A Case Study in DNA Sequence Comparisons.” Machine Learning 21 (1-2): 35–50. https://doi.org/10.1007/BF00993378.