What's Consciousness?
The blogosphere, twitterverse, and mainstream news channels are filled with discussions of large language models. Watching ChatGPT in “DAN” (Do Anything Now) mode or Bing chat turn into some unsettling amalgam of hostility and romantic aggression is definitely “good copy.”
But somewhere amidst the navel-gazing about the politically correct answers it offers, the questions it refuses to answer, and the discussions of the existential risk of malevolent AGI lies a simpler question.
Are these algorithms sentient? Are they conscious?
Inconveniently, humanity seems to be light on the details when it comes time to define the consciousness against which we want to evaluate these algorithms.
Alignment
The discussion of “alignment” typically focuses upon aligning an algorithm's objectives and values with those of the human beings that created it. This ignores the challenges of alignment between the incentives of the group of people working on AI and the human society of which they are a small part.
First, there was the non-profit OpenAI, aspiring to democratize AI for purposes other than financial gain. Now, as the well-worn commentary reminds us, they’re a for-profit venture that keeps a lid on a number of their findings. And yet, for all the vitriol directed at OpenAI, they are actively attempting to assemble structures to preclude the development of misaligned AGI. The problem is, this presents massive organizational challenges. For one, if employees are not aligned with management’s constraints and policies, they’ll fall into the trap of Goodhart’s Law, hitting the target metrics while skirting their intent. Or worse, they’ll opt to take their talents to a less bureaucratic environment that will not impose those constraints.1
And it is for this reason that the companies that have made the most progress with respect to AGI alignment might also be those throwing fuel on the fire that accelerates the arrival of a potentially misaligned AGI. While some argue that the largest advances in alignment emerge from work on AGI itself, such arguments assume a slow takeoff and short timelines. Moreover, they assume the potential benefits of AGI curing cancer or solving world hunger exceed the potential risks of it killing or enslaving us all. If either assumption fails, there is no second take!
So what’s a company full of human beings who would prefer to survive and thrive to do? Align themselves with the narrowly-tailored areas of research that might yield some grant funding? Plow forward with progress to stay in the black and hope for the best? There are no easy answers.
Our Plan
To our knowledge, AE is the only profitable, bootstrapped business focused on increasing human agency with technology. When this company was founded, we focused our energies on mitigating S-risk from brain-computer interfaces (BCI). We believed BCI would improve our capacity to interact with, communicate with, and augment our intelligence alongside AGI. So we bootstrapped a business to 150 employees by building software products for clients and building a model that allowed us to do things like sell an agency-increasing Skunkworks company.
We proved our initial thesis: that maximizing the agency of users was not simply the right thing to do, but also the right business decision in the long run. This idea extended into the world of BCI, buoyed by our victory in an international neuroscience machine learning challenge and a partnership with Blackrock on the intracortical arrays that are restoring motion to paralyzed limbs. It’s working, with new announcements and new leaps forward to come! Meanwhile, even Meta realized that short-term gains from notifications erode long-term value - they too discovered the virtue of our initial thesis that increasing user agency is good for people and business alike!
Now, with OpenAI and others shortening timelines, AGI research could benefit from the same urgency from a company focused on human agency. More specifically, AGI alignment efforts can benefit from a company for whom AGI is not a profit center, but rather a crucial component of its mission supported by a profitable consulting business. That consulting business teaches us to move quickly, get things done efficiently, and explore neglected approaches with the entrepreneurial vibe that permeates our employees and clients. It also teaches us the importance of incentives. With that in mind, our goal is not to develop AGI, nor profit from its inception. Our objective is simply to mitigate the existential risk it poses.
Pursuing AGI alignment, and specifically neglected approaches, is now crucial to human agency. Nate Soares reminds us to pursue ideas others may find foolish or at least unlikely to bear fruit.2 Epistemic humility demands that we respect scientific progress while simultaneously considering that the scientific community, which does not wish to offer grants for consciousness research, might be in error. We have a hunch there’s something interesting here worth investigating to bridge the philosophical and the technical. And now, we are beginning to learn that the exploration of prosocial AGI is not only neglected, but extremely promising if it yields a superintelligence inclined to cooperate rather than exploit.
Does Consciousness Matter?
So does this even matter? Is there a reason that we ought to be preoccupied with this profound question of the consciousness of AGI?
Yes.
For this reason, we are collaborating with Michael Graziano, a professor of neuroscience at Princeton. In concert, we are testing his hypotheses and our own regarding the nature of conscious experience. Our work together, in part, led to a WSJ article that argues that, absent consciousness, our friendly AIs will become sociopaths.
There are troubling thought experiments in the artificial intelligence community about how that sociopath might turn its powers upon its human creators. Nick Bostrom’s famous thought experiment about the “paperclip maximizer” is perhaps the most commonly deployed.3
It’s fairly simple really - an AI is programmed to maximize the production of paperclips. At first, it learns to generate higher outputs at lower costs, but eventually, it realizes that to continue increasing production, it’s going to need additional raw materials. So it starts harvesting additional ores, repurposing existing industrial material, until eventually it has turned everything around it into paperclips. Typically, the dark denouement involves utilizing the atoms that comprise human bodies for paperclips before setting its sights on the planets, stars, and galaxies in its proximity.
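To make the logic concrete, here is a deliberately toy Python sketch of that objective (our illustration, not Bostrom’s): the resource names and conversion rate are invented, but notice that nothing in the objective ever asks whether a resource should be spared.

```python
# Toy illustration of an unconstrained maximizer (hypothetical numbers).
resources = {"iron_ore": 100, "factories": 5, "cars": 20, "office_chairs": 50}

def paperclips_from(amount: int) -> int:
    """Invented conversion rate: to the maximizer, everything is just atoms."""
    return amount * 10

paperclips = 0
for resource, amount in resources.items():
    # The objective only counts paperclips; it has no term for what is lost.
    paperclips += paperclips_from(amount)
    resources[resource] = 0  # the resource is gone

print(paperclips)  # maximized - and nothing else is left standing
```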
So why might consciousness help delineate between the ruthless, objective-optimizing sociopath and the benevolent intelligence that aids humanity in solving its greatest problems? (And if intelligence and consciousness are orthogonal, the monstrous paperclip maximizer might be the more likely outcome if we keep boosting computational horsepower recklessly.)
Attention
Professor Graziano argues that consciousness is a result of our attention, namely how we receive information from the world around us and respond in turn. For the sake of this discussion, let’s discuss two flavors of attention, “bottom-up” and “top-down.”
Bottom-up (exogenous) attention consists of receiving information from external stimuli, then assembling those stimuli into a response. In the literature of machine learning, this might be receiving all the pixels of an image as input, running each pixel through some convolutional neural network, and determining if the image contains a dog or a cat or receiving all the words of some sentence and determining a proper response. The algorithm has no higher-order functioning, it’s simply performing mathematical operations on pixels or words until an answer is reached.
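As a rough sketch of that bottom-up case (illustrative only, with random, untrained weights standing in for a real convolutional network), note that every step below is dictated entirely by the incoming pixels; nothing in the program decides what to look at.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy 8x8 grayscale "image": the stimulus arrives and every pixel gets processed.
image = rng.random((8, 8))

# One random 3x3 filter standing in for a convolutional layer.
kernel = rng.normal(size=(3, 3))

# Valid convolution, written out directly to keep the example dependency-free.
features = np.array([
    [np.sum(image[i:i + 3, j:j + 3] * kernel) for j in range(6)]
    for i in range(6)
])

# Pool, project to two classes ("dog" vs. "cat"), and softmax.
pooled = features.mean()
logits = np.array([1.3 * pooled, -0.7 * pooled])  # arbitrary, untrained weights
probs = np.exp(logits) / np.exp(logits).sum()

print({"dog": round(float(probs[0]), 3), "cat": round(float(probs[1]), 3)})
```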
Top-down (endogenous) attention aligns with our own experience of choosing to look at a painting or read a book. Unless there’s a flashing light or blaring alarm demanding our attention4, we make a “conscious” decision about that to which we attend.
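A comparable sketch of the top-down case (again purely illustrative, with random placeholder vectors) puts the selection inside the agent: a query representing an internal goal decides which parts of a scene receive weight, rather than the scene dictating the computation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Five "things in the room," each represented by a feature vector.
scene = rng.normal(size=(5, 4))

# The query does not come from the stimulus; it stands in for the agent's own
# goal state, e.g. "I have decided to look at the painting."
goal_query = rng.normal(size=(4,))

# Scaled dot-product attention: the internal query determines where attention goes.
scores = scene @ goal_query / np.sqrt(scene.shape[1])
weights = np.exp(scores) / np.exp(scores).sum()

attended = weights @ scene  # what the agent actually takes in from the scene
print(np.round(weights, 3))
```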
Professor Graziano then argues that consciousness lies in an “attention schema” that sits above our top-down attention. We are aware that we have a capacity for attention and can deploy it as we choose. Moreover, we are aware that other human beings possess similar models of their own attention and of ours.
And it is these models of attention that provide both conscious experience and prosocial behavior. It is the belief that a fellow human being has a conscious experience (e.g. a model of attention) similar to our own that makes us more inclined to collaborate and less inclined to exploit for resources and subsequently discard.
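For intuition only, and emphatically not Graziano’s model or our research code, one can caricature an attention schema as a coarse, compressed model an agent keeps of its own attention weights: simpler than the attention it describes, usable for prediction, and, per the theory, pointable at other agents as well. All matrices below are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(2)

# The agent's actual top-down attention over five items, as in the sketch above.
scene = rng.normal(size=(5, 4))
goal_query = rng.normal(size=(4,))
scores = scene @ goal_query / np.sqrt(scene.shape[1])
attention = np.exp(scores) / np.exp(scores).sum()

# The "attention schema": a compressed, lossy model of that attention.
# It is deliberately simpler than the thing it models.
compress = rng.normal(size=(5, 2))   # squeeze 5 attention weights into 2 numbers
expand = rng.normal(size=(2, 5))
schema_logits = attention @ compress @ expand
schema = np.exp(schema_logits) / np.exp(schema_logits).sum()

# How well does the agent's self-model track its real attention?
prediction_error = float(np.abs(attention - schema).mean())
print(round(prediction_error, 3))
```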
Does Academia Care?
This sounds like the type of topic that would be debated passionately amidst the hallowed halls and ivory towers of the academy. Or so you’d think.
But academics seem more inclined to view consciousness as either illusory, e.g. “we’re not really conscious, we just believe we are, it’s not real, nothing to see here…” or magical. Often, in lieu of the term “magical” to describe the properties of consciousness, academics prefer the word “emergent.”
As Eliezer Yudkowsky is keen to remind us, “emergence” is a more respectable way of saying “this happens at some point and we cannot explain why.”
And yet, despite the tsunami of AGI research and its insertion into writing, coding, and operation of anything and everything, academics don't want to touch the subject beyond philosophical discussion and more nebulous theory-of-mind conversations.
As Thomas Kuhn wrote in “The Structure of Scientific Revolutions,” paradigm shifts are messy, and whether it be theories of the round earth, heliocentrism, rejecting the phlogiston theory, or plate tectonics, acceptance of new ideas is often fraught. As Max Planck once lamented, “science advances one funeral at a time.”
AGI advances much faster.
Just as scientists were at one point uninterested in studying sleep or considered it gauche to study sex, the field of consciousness seems similarly ignored. Today, “consciousness” research is a third rail in the alignment community - a shibboleth violation. Because it is a conflationary term that no one can truly define, it has also become a topic one should not investigate.
Animals
It stands to reason that if no one knows what human consciousness is, no one knows what animal consciousness is either. Discourse often reduces to “animals aren’t really conscious, not like we are anyway.” Experience suggests otherwise.
We look at an animal and, implicitly, assemble a model of the animal’s attention, and it does the same. We connect. We bond. We build empathy. Superior human intelligence inclines us to protect the animals with whom we form these emotional connections. We do not convert our pets into food or material goods - even the consideration of that outcome is abhorrent.
Of course animals are conscious, even if their models of attention lack the complex control structures and self-awareness of ours. Do you doubt that a dog can not only choose what it might look upon, listen to, or smell, but also, on some level, know that it has those abilities?
Attempts to differentiate between the human experience and those of other mammals typically fail - we share an experience of being. It is this shared experience that allows the visceral connection we feel, not the fact that its nose is awfully cute.5
What’s Next
We will continue to build machine learning models that simulate the attention schema theory espoused by Michael Graziano to validate his hypotheses and perhaps verify his mechanistic (read: not “magical”) theory of consciousness.
We have no financial incentive to accelerate the arrival of AGI. We’re just humans who would like to retain our agency in the face of increasingly intelligent algorithms. A profitable, bootstrapped business seems like as good a vehicle as any to address the looming threat.
Stay tuned.
1 Currently, OpenAI’s alignment plans involve the creation of a “roughly human-level automated alignment researcher.” In other words, build the capabilities of an AGI, then deploy enough compute to replicate those capabilities at scale. To quote Zvi from the piece linked, “Oh no.”
2 E.g. Encouraging people to experiment with shard theory despite believing that it is unlikely to aid in solving the alignment problem.
3 Thomas Metzinger offers a similarly frightening thought experiment in which the AGI adopts a utilitarian theory of ethics, recognizes that the majority of human beings are suffering more than flourishing, and ultimately concludes that a world in which humans do not exist has a higher overall utility. Even if the alignment problem is solved “well” and the AGI determines what is best for human beings, we still might not be comfortable with the decision it delivers.
4 Or, you know, an alert, a pop-up ad, a Slack message, a text message, a tweet, a Facebook notification, a boxscore for that sporting event that just ended, a special seasonal beverage at Starbucks, an offer for 0% APR financing on a new Toyota, sheesh, how do we actually do any work…?
5 Ok, maybe a little.