At a recent White House press conference, a Fox News correspondent asked the Biden administration’s press secretary about AI safety researcher Eliezer Yudkowsky’s highly publicized claim that if we don’t pause or halt the development of artificial intelligence, then “literally everyone on earth will die.” The question was met with some laughter from the White House press corps. But as someone with a technical background who covers AI and talks regularly to researchers, developers, and investors in the field, I saw nothing to chuckle at.
No, I’m not an AI pessimist like Yudkowsky. Rather, I and other more optimistic AI watchers worry that overly dire warnings of imminent AI-driven destruction may cause us to pause or halt the development of a powerful technology with immense potential for improving our lives.
Insiders hold a truly wide range of opinions on the best way to approach AI—from Yudkowsky’s insistence that we immediately abandon all research in the area, to my own more moderate concern about large-scale industrial accidents arising from misuse of the technology, to an extreme optimism in some quarters about AI’s potential to turn humanity into an immortal, star-spanning species. Outsiders encountering the AI safety wars for the first time might find the sheer variety of insider positions baffling. Equally disorienting is that these debates over AI are not breaking down along well-established, “red” or “blue” culture-war lines.
The AI safety debate couldn’t have arrived at a worse time in our history. Both machine-learning researchers and our larger society are bitterly divided over what two of the discussion’s key terms—“intelligence” and “safety”—actually mean.
America’s post–George Floyd era “racial reckoning” has seen a rapid public rethink of what intelligence is and how it should or should not be measured. Colleges and professional schools are ditching standardized tests under pressure from equity advocates, who insist that these tests are slanted toward a narrow, racialized conception of intellectual competence that unfairly discounts what nonwhites have to offer universities and professional guilds.
But it’s not just our broader society that’s divided over the nature and meaning of intelligence. Researchers can’t agree on even a rough working definition of this elusive concept, which they would need in order to measure whether, and how, their increasingly sophisticated machine-learning models are exhibiting more of it. Machine-learning experts offer competing definitions of “intelligence,” along with a variety of benchmarks for assessing it. Market leader OpenAI has its own, more practical definition of “artificial general intelligence” (“highly autonomous systems that outperform humans at most economically valuable work”), but even this is slippery enough to be contested.
The field long ago abandoned the famous Turing Test as a real benchmark for intelligence, and even if it hadn’t, it seems undeniable that the latest generation of large language models (LLMs) would be able to pass it handily. In fact, for want of a better set of benchmarks, OpenAI and others have been using the aforementioned standardized tests meant for humans to measure the technology’s progress. But given that GPT-4, OpenAI’s latest model, scores above the 90th percentile on many of these, the tests are about to lose all utility in the face of more powerful models on the horizon.
The measurement situation is so serious that OpenAI recently open-sourced its suite of evaluation tools for its LLMs and asked the public to contribute new benchmarks that might help illuminate the capabilities emerging from these models.
Definitional disputes aside, at least one of these models does seem close not only to blowing the top off every test of cognitive capability we can formulate but also to outperforming our best mathematicians at inventing new mathematics, or our best scientists at discovering new science. The insider term for such a model would be “artificial superintelligence” (ASI). If and when that moment arrives, humanity will have encountered something completely new and outside the scope of its historical experience. Will such an entity be “smart” enough to find value in some vision of human flourishing that we’ll recognize and accept, or will it quickly conclude that our continued existence is more trouble than it’s worth? If humans had a rigorous, credible way of characterizing “superintelligence,” then we might have some hope of answering that question before booting up such a machine.
Our possible post-ASI future is the source of the extreme utopian and dystopian visions coming out of the AI safety discussion, from the optimism seen in OpenAI CEO Sam Altman’s many interviews to the doomsday worries given voice by Fox’s Peter Doocy in the aforementioned White House presser. But whether a new silicon god will bless humanity or end it isn’t the only question, or even the most urgent one, in the “safety” debates.
On a more prosaic level, the safety debates plaguing topics as diverse as gun violence, health care for transgender-identifying youth, the contents of school libraries, and campus speech codes have made their way into the AI safety wars. The results are as much of a mess as one might expect.
Some of our culture-war-coded safety fights play out directly in the domain of AI. For instance, should ChatGPT print a racial slur if the user asks for, say, the name of a certain character in Mark Twain’s Huckleberry Finn? And if the answer is “no,” then should the software feign ignorance of that character’s name, or should it admit that it knows but won’t say?
Other safety concerns can feel more urgent, but factors like model capability and user intent complicate the picture. If a user asks an LLM for a detailed explanation of how to rob a bank, should the model comply? What if the model’s answer is not at all viable as a real bank-robbery plan? What if the user is a fiction writer working on a heist novel?
Then come the big-picture safety concerns over AI’s broader impact on society. How much societal disruption from, say, AI-induced job losses is “safe,” and on what timetable? Is a modern language model’s potential to give everyone an instant writing-ability upgrade a threat because of the potential for more effective disinformation, or will it improve civic engagement by raising the quality of the arguments on all sides? Which scenario is more dangerous: giving everyone full, open-source access to the most powerful AIs, or confining these models only to a select, powerful few companies or institutions?
Historically, such questions have been worked out over the course of decades via the slow grind of legislation and litigation. But we don’t have decades. We don’t even have days.
The models presently stirring up these questions and more are already in users’ hands, and some are open-source and can be downloaded and run on a laptop or high-end gaming computer. Still more powerful models are on the near-term horizon, and not even the most plugged-in machine-learning researcher can accurately predict all the capabilities this class of software will have in just a year, much less in ten.
Even if all work on AI stopped tomorrow, researchers and programmers would still be nowhere close to unlocking the full potential of the previous generation of machine-learning models, much less the current generation. This so-called “capability overhang” means significant social disruption from AI is already baked in. We will not avoid it by pressing the pause button, assuming such a button exists (and it doesn’t).
But AI progress won’t stop, at least not globally. There exists no global governance regime that could hope to get buy-in from all the world’s governments. Indeed, international cooperation and trade are headed in the opposite direction from what would be necessary to coordinate a global pause or slowdown. So humanity will forge ahead toward whatever future awaits us at higher levels of machine-learning capability. The main safety question is simple: Who gets there first, us or our rivals?
If the AI safety debate slows America down enough to cause us to lose this race, it will have made humanity far less safe.