Artificial General Intelligence: The Poverty of the Stimulus
In this series so far at Evolution News about Artificial General Intelligence, my references to AGI worshippers and idolaters will be off-putting to those who think the claim that AGI will someday arrive, whatever its ultimate ETA, is an intellectually credible and compelling position. Accordingly, I’m just being insulting by using pejorative religious language to describe AGI’s supporters, to say nothing of being a Luddite for not cheering on AGI’s ultimate triumph. I want therefore to spend some space here indicating why AGI does not deserve to be taken seriously. Let’s begin with a point on which the linguist Noam Chomsky built his career, which may be encapsulated in the phrase “the poverty of the stimulus.” His point with this phrase was that humans learn language with a minimum of input, and thus must be endowed with an in-built capacity (“hardwired”) to acquire and use language. Infants see and hear adults talk and pick up language easily and naturally. It doesn’t matter if the caregivers pay special attention to the infant and provide extra stimulation so that their child can be a “baby Einstein.” It doesn’t matter if the caregivers are neglectful or even abusive. It doesn’t even matter if the child is blind, deaf, or both. Barring developmental disorders (such as some forms of autism), the child can learn language.
But It’s Not Just the Ability to Learn Language
“The poverty of the stimulus” underscores that humans do so much more with so much less than would be expected unless humans have an innate ability to learn language with minimal inputs. And it’s not just that we learn language. We gain knowledge of the world, which we express through language. Our language is especially geared to express knowledge about an external reality. This “aboutness” of the propositions we express with language is remarkable, especially on the materialist and mechanistic grounds so widely accepted by AGI’s supporters.
As G. K. Chesterton noted in his book Orthodoxy, we have on materialist grounds no right “to assert that our thoughts have any relation to reality at all.” Matter has no way to guarantee that when matter thinks (if it can think), it will tell us true things about matter. On Darwinian materialist grounds, all we need is differential reproduction and survival. A good delusion that gets us to survive and reproduce is enough. Knowledge of truth is unnecessary and perhaps even undesirable.
The philosopher Willard Quine, who was a materialist, made essentially the same point in what he called “the indeterminacy of translation.” Quine’s thesis was that translation, meaning, and reference are all indeterminate, implying that there are always valid alternative translations of a given sentence. Quine presented a thought experiment to illustrate this indeterminacy. In it, a linguist tries to determine the meaning of the word “gavagai,” uttered by a speaker of a yet-unknown language, in response to a rabbit running by. Is the speaker referring to the rabbit, the rabbit running, some rabbit part, or something unrelated to the rabbit? All of these are legitimate possibilities according to Quine and render language fundamentally indeterminate.
Yet such arguments about linguistic indeterminacy are always self-referentially incoherent. When Quine writes of indeterminacy of translation in Word and Object (1960), and thus also embraces the inscrutability of reference, he is assuming that what he is writing on these topics is properly understood one way and not another. And just to be clear, everybody is at some point in the position of a linguist because, in learning our mother tongue, we all start with a yet-unknown language. So Quine is tacitly making Chomsky’s point, which is that with minimal input — which is to say with input that underdetermines how it might be interpreted — we nevertheless have a knack for finding the right interpretation and gaining real knowledge about the world.
Chomsky’s poverty of the stimulus is regarded as controversial by some because an argument can be made that the stimuli that lead to learning, especially language learning, may in fact be adequate without having to assume a massive contribution of innate capabilities. Chomsky came up with this notion in the debate over behaviorism, which needed to characterize all human capacities as a result of stimulus-response learning. Language, according to the behaviorists, was thus characterized as verbal behavior elicited through various reinforcement schedules of rewarded and discouraged behaviors. In fact, Chomsky made a name for himself in the 1950s by reviewing B. F. Skinner’s book Verbal Behavior. That review is justly famous for demolishing behaviorist approaches to language (the field never recovered after Chomsky’s demolition).
If Chomsky Is Right
But suppose we admit that the controversy about whether the stimuli by which humans learn language are impoverished has yet to be fully resolved. If Chomsky is right, those stimuli are in some sense impoverished. If his critics are right, they are adequate without needing to invoke extraordinary innate capacities. Yet if we leave aside the debate between Chomsky’s nativism and Skinner’s behaviorism, it’s nonetheless the case that such stimuli are vastly smaller in number than what artificial neural nets need to achieve human-level competence.
Consider LLMs, large language models, which are currently the rage, and of which ChatGPT is the best known and most widely used. ChatGPT4 reportedly uses 1.76 trillion parameters and its training set is based on hundreds of billions of words (perhaps a lot more, but that was the best lower-bound estimate I was able to find). Obviously, individual humans gain their language facility with nowhere near this scale of inputs. If a human child were able to process 200 words per minute and did so continuously, then by the age of ten the child would have processed 200 x 60 x 24 x 365 x 10, or roughly a billion, words. Of course, this is a vast overestimate of the child’s language exposure, ignoring sleep, repetitions, and lulls in conversation.
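For readers who want to check the arithmetic, the upper-bound estimate for the child can be reproduced in a few lines of Python. This is only a sketch of the calculation stated in the text; the function name and constants are illustrative labels, not anything from a library.

```python
# Upper-bound estimate from the text: a child processing 200 words per
# minute, continuously (no sleep, no pauses), from birth to age ten.
MINUTES_PER_HOUR = 60
DAYS_PER_YEAR = 365


def words_processed(words_per_minute, hours_per_day, years):
    """Total words processed at a constant rate (illustrative helper)."""
    return (words_per_minute * MINUTES_PER_HOUR * hours_per_day
            * DAYS_PER_YEAR * years)


# 200 x 60 x 24 x 365 x 10
upper_bound = words_processed(words_per_minute=200, hours_per_day=24, years=10)
print(f"{upper_bound:,} words")  # 1,051,200,000 words
```

The result, just over a billion words, is the ceiling; any realistic allowance for sleep and silence only lowers it.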
Or consider Tesla, which since 2015 has been promising fully autonomous vehicles as just on the horizon. Full autonomy keeps eluding the grasp of Tesla engineers, though the word on the street is that self-driving is getting better and better (as with a reported self-driving taxi in San Francisco, albeit by Waymo rather than Tesla). But consider: To aid in developing autonomous driving, Tesla processes 160 billion video frames each day from the cameras on its vehicles. This massive amount of data, used to train the neural network to achieve full self-driving, is obviously many orders of magnitude beyond what humans require to learn to drive effectively.
Erik Larson’s book The Myth of Artificial Intelligence (Harvard, 2021) is appropriately subtitled Why Computers Can’t Think the Way We Do. Whatever machines are doing when they exhibit intelligence comparable to humans, they are doing it in ways vastly different from what humans are doing. In particular, the neural networks in the news today require huge amounts of computing power and huge amounts of input data (generated, no less, from human intelligent behavior). It’s no accident that artificial intelligence’s major strides in recent years fall under Big Tech and Big Data. The “Big” here is far bigger than anything available to individual humans.
Domain Specificity
The sheer scale of efforts needed to make artificial intelligence impressive suggests human intelligence is fundamentally different from machine intelligence. But reasons to think the two are different don’t stop there. Domain specificity should raise additional doubts about the two being the same. When Elon Musk, for instance, strives to bring about fully autonomous (level 5) driving, it is by building neural nets that every week must sort through a trillion images taken from Tesla automobiles driving in real traffic under human control. Not only is the amount of data to be analyzed staggering, but it is also domain specific, focused entirely on developing self-driving automobiles.
Indeed, no one thinks that the image data being collected from Tesla automobiles and then analyzed by neural nets to facilitate full self-driving is also going to be used for automatically piloting a helicopter or helping a robot navigate a ski slope, to say nothing of playing chess or composing music. All our efforts in artificial intelligence are highly domain specific. What makes LLMs, and ChatGPT in particular, so impressive is that language is such a general instrument for expressing human intelligence. And yet, even the ability to use language in a contextually relevant way based on huge troves of humanly generated data is still domain specific.
The French philosopher René Descartes, even though he saw animal bodies, including human bodies, as machines, nonetheless thought that the human mind was non-mechanical. Hence, he posited a substance dualism in which a non-material mind interacted with a material body, at the pineal gland no less. How a non-material mind could interact with a material/mechanical body Descartes left unanswered (invoking the pineal gland did nothing to resolve that problem). And yet, Descartes regarded the mind as irreducible to matter/mechanism. As he noted in his Discourse on Method (1637, pt. 5, my translation):
Although machines can do many things as well as or even better than us, they fail in other ways, thereby revealing that they do not act from knowledge but solely from the arrangement of their parts. Intelligence is a universal instrument that can meet all contingencies. Machines, on the other hand, need a specific arrangement for every specific action. In consequence, it’s impossible for machines to exhibit the diversity needed to act effectively in all the contingencies of life as our intelligence enables us to act.
Descartes was here making exactly the point of domain specificity. We can get machines to do specific things — to be wildly successful in a given, well-defined domain. Chess playing is an outstanding example, with computer chess now vastly stronger than human chess (though, interestingly, having such strong chess programs has also vastly improved the quality of human play). But chess programs play chess. They don’t also play Minecraft or Polytopia. Sure, we could create additional artificial intelligence programs that also play Minecraft and Polytopia, and then we could kludge them together with a chess playing program so that we have a single program that plays all three games. But such a kludge offers no insight into how to create an AGI that can learn to play all games, to say nothing of being a general-purpose learner, or what Descartes called “a universal instrument that can meet all contingencies.” Descartes was describing AGI. Yet artificial intelligence in its present form, even given the latest developments, is not even close.
Elon Musk Appreciates the Problem
Musk is therefore building Optimus, also known as the Tesla Bot. The goal is for it to become a conceptual general-purpose robotic humanoid. By having to be fully interactive with the same environments and sensory inputs as humans, such a robot could serve as a proof of concept for Descartes’s universal instrument and thus AGI. What if such a robot could understand and speak English, drive a car safely, not just play chess but learn other board games, have facial features capable of expressing what in humans would be appropriate affect, play musical instruments, create sculptures and paintings, do plumbing and electrical work, and so on? That would be impressive and take us a long way toward AGI. And yet, Optimus is for now far more modest. For now, the robot is intended to be capable of performing tasks that are “unsafe, repetitive, or boring.” That is a far cry from AGI.
AGI is going to require a revolution in current artificial intelligence research, showing how to overcome domain specificity so that machines can learn novel skills and tasks for which they were not explicitly programmed. And just to be clear, reinforcement learning doesn’t meet this challenge. Take AlphaZero, a program developed by DeepMind to play chess, shogi, and Go, which improved its game by playing millions of games against itself using reinforcement learning (which is to say, it rewarded winning and penalized losing). This approach allows the program to learn and improve without ongoing human intervention, leading to significant advances in computer game playing ability. But it depends on the game being neatly represented in the state of a computer, along with clear metrics for what constitutes good and bad play.
The really challenging work of current artificial intelligence research is taking the messy real world and representing it in domain-specific ways so that the artificial intelligence created can emulate humans at particular tasks. The promise of AGI is somehow to put all these disparate artificial intelligence efforts together, coming up with a unified solution to computationalize all human tasks and capacities in one fell swoop. We have not done this, are nowhere close to doing this, and have no idea of how to approach doing this.