“Is that ready?”
It was 4:45 AM, or thereabouts. Kazakhstan had floated the tenge, and I was supposed to have something intelligent to say about it.
I blinked at the cursor. It blinked back.
“It’s gonna be a minute.”
By late summer 2015, life was a surreal parody — a tragic, satirical version of the immersive New York experience that shoulda, coulda been mine, if only I’d embraced the city like it’d say it tried to embrace me. Occasionally, I felt it. Once in a while, and typically all at once, out of nowhere, the idea of the place would occur to me, like a previously elusive concept, or a finally-remembered word making its great escape from tip-of-the-tongue limbo. But the revelatory moments were too few, too far between and much too fleeting to keep me there.
“I can do it. Do you want me to do it? I’ll just do it. We need to say something about this.” He was pressing me with rapid fire messages through our internal chat. The metallic chime of the notifications made me wince.
I rolled my eyes. He pretended everything was urgent, and because he signed my checks, I pretended to agree. “I got it. Give me a few.” “K. Great.”
I muted the notifications, went into the kitchen and flipped on the under-cabinet LEDs. There was a paint-by-numbers commodity story behind Kazakhstan’s decision to introduce a free float on August 20, 2015, and China’s overnight yuan devaluation the week before promised an easy lead-in. Maybe a few people would read it on their commute that morning. It wouldn’t be a total waste of time, and it’d be easy. I needed my muse, though.
I sorted through a collection of bottles and settled on Tito’s — a heavy sigh of physical relief after two minutes and two shots. I poured a third into a rocks glass, splashed it with flavored seltzer, walked into the living room and slid open the all-glass, floor-to-ceiling balcony door that doubled as a wall. The outside came rushing in. A Bonefish Grill manager sleeping on the couch squinted up at me. “What time is it?” “Work,” I told her. “Go back to sleep.”
“Is that ready?” He was still pressing. “15 minutes,” I shot back. He feigned exasperation. “Just take what’s on the terminal, grab something from Reuters and rearrange it. We’re not reinventing wheels here.”
Late last month, Yuval Noah Harari penned a hyperbolic essay about GPT-4. “We have summoned an alien intelligence,” he warned, in an opinion piece published by The New York Times.
Homo Deus, the follow-up to Harari’s wildly successful Sapiens, was originally published in 2015, but OpenAI’s ChatGPT could’ve walked right out of it. Harari deals extensively with artificial intelligence in Homo Deus, a volume he describes as “a glimpse of the dreams and nightmares that will shape the 21st century.” Judging by the Times piece, he believes GPT-4 could be a prelude to a nightmare unless humanity stops to consider the potential ramifications of a rapidly accelerating A.I. arms race.
Not everyone views Harari as an oracle, and some critics question the veracity of his Wikipedia-esque accounts of human history. Whatever he is or isn’t, Harari is a compelling writer, and through an addictively engaging style, he focuses our attention on matters of existential concern. Harari pens page-turners, and all critiques of his bestsellers aside, better that we should binge-read Harari than binge-watch Netflix or mindlessly scroll through our social media feeds.
I don’t quote Harari regularly, but when I do, it’s typically while discussing “intersubjective realities,” the network of ideas, beliefs and social constructs that order human existence. Examples include religion, money, nations, legal codes, corporations and anything which, in our absence, wouldn’t exist. As he put it in Sapiens, “fiction has enabled us not merely to imagine things, but to do so collectively.” We “weave common myths,” and those myths give us “the unprecedented ability to cooperate flexibly in large numbers.”
Our capacity to weave myths — to create and perpetuate the ideas and beliefs that facilitate human cooperation — largely explains our success as a species, and that capacity is a function of fictive language. ChatGPT appears to have mastered fictive language. That, for Harari, is a problem.
“Language is the operating system of human culture,” he wrote for the Times. “By gaining mastery of language, A.I. is seizing the master key to civilization,” he warned, before asking, “What would it mean for humans to live in a world where a large percentage of stories, melodies, images, laws, policies and tools are shaped by nonhuman intelligence, which knows how to exploit with superhuman efficiency the weaknesses, biases and addictions of the human mind?”
Such questions might be a red herring. As Cal Newport, an MIT-trained computer science professor at Georgetown, wrote for The New Yorker on April 13, ChatGPT is merely running a highly advanced version of the rudimentary automatic text generation methodology described by Claude Shannon in “A Mathematical Theory of Communication.” Newport, whose piece was in some ways a rebuttal of Harari, went on to explain how, with a very simple tweak, Shannon’s 75-year-old system can be leveraged to “grow” coherent sentences from source text. Once the limits of an enhanced Shannon method are reached, an equally simple (to understand, if not necessarily to implement) enhancement allows the model to produce natural-sounding text.
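To make that concrete, here is a minimal sketch of Shannon-style word-level generation, the rudimentary version of the method Newport describes: tally which words follow which words in some source text, then repeatedly sample a plausible successor. The toy corpus and the function below are my own illustration, not Newport’s or Shannon’s actual example.

import random
from collections import defaultdict

# A stand-in corpus; any source text would do.
source = (
    "the bank floated the currency and the market moved "
    "the market watched the bank and the currency fell"
)

words = source.split()
table = defaultdict(list)
for current, following in zip(words, words[1:]):
    table[current].append(following)  # duplicates preserve the observed frequencies

def generate(seed="the", length=12):
    out = [seed]
    for _ in range(length - 1):
        candidates = table.get(out[-1])
        if not candidates:  # dead end: no observed successor
            break
        out.append(random.choice(candidates))
    return " ".join(out)

print(generate())

Feed it enough source text and the output starts to read like plausible, if aimless, prose, which is the “growing” Newport describes. Everything beyond that, up to and including ChatGPT, is, on Newport’s telling, a matter of scale and sophistication in how the next word is chosen.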
It gets considerably more complicated from there, which is to say the leap from a modified Shannon system to ChatGPT is a quantum one, but Newport’s point was that there’s nothing behind any of this other than math, rules and the replication thereof. “Behind the scenes its generation lacks majesty,” Newport wrote of ChatGPT. “The system’s brilliance turns out to be the result less of a ghost in the machine than of the relentless churning of endless multiplications.” In the final estimation, it’s just an “automaton” running on the “well-worn digital logic of pattern-matching.” So, by Newport’s accounting, ChatGPT (and large language models more generally) hasn’t really mastered language. It just reproduces it in response to human prompts.
Harari’s account of human civilization finds our species leveraging fictive language for the express purpose of organizing and collaborating to various ends, from the noble to the disastrous and everything in-between. In his foreboding piece for the Times, he alluded to a sense of purpose for GPT-3 and 4 (or at least for their successors) without explaining how the technology would develop agency. He intermingled verbs that confuse the issue, perhaps inadvertently, perhaps not. ChatGPT did indeed “hack” our language, as he put it, but it didn’t “seize” it. It pieces together language at our explicit direction and only because we showed it how.
A pattern-matching automaton can indeed “manipulate” language, but if, in doing so, it manipulates us, that’ll only be by accident or because some human actor leveraged it against other humans. Harari’s use of “exploit” was even more of a stretch. Exploitation is goal-directed. A pattern-matching automaton doesn’t have any goals outside of imitating language and, crucially, doesn’t conceive of the one goal it has as a goal. It doesn’t conceive at all.
At every turn, Harari assumed a sense of purpose for A.I. The technology might, he cautioned, “digest” the entire record of human history on the way to “gush[ing] out a flood of new cultural artifacts.” Soon, our “cultural cocoon” could be woven by A.I., such that we “experience reality through a prism produced by nonhuman intelligence.” He even said (explicitly) that if, in the course of trapping humans in a “Matrix-like world of illusions,” A.I. determines that bloodshed is “necessary,” the technology could compel us to shoot each other by “telling us the right story.”
Harari didn’t explain how A.I., in its current manifestation, could develop an independent sense of purpose. Indeed, he didn’t explain how it could develop the capacity to conceive of itself at all. Sam Altman, OpenAI’s CEO, told ABC in March that although he’s “a little bit scared” of his creations, he isn’t worried about autonomy. “It waits for someone to give it an input,” he said. “This is a tool that is very much in human control.”
That might sound naive, but at least in the case of the GPTs, it’s the only answer that’s possible. Accounts like Harari’s are commonplace, the product of minds trained on a set of familiar sci-fi plots. But even the best movie scripts leave out the finer details, sometimes to avoid boring the audience, but mostly to avoid explaining the inexplicable. A.I. doomsayers are like Hollywood screenwriters charged with crafting a sci-fi blockbuster that doesn’t ask moviegoers to suspend disbelief.
“When I ask my most fearful scientist friends to spell out how an A.I. apocalypse might happen, they often seize up from the paralysis that overtakes someone trying to conceive of infinity,” Jaron Lanier, the prominent computer scientist, recently wrote. “They say things like ‘Accelerating progress will fly past us and we will not be able to conceive of what is happening.'”
Lanier’s position is that A.I., as it exists today, isn’t really A.I. It’s a “tool, not a creature,” as he put it. He described GPT-4 as “a version of Wikipedia that includes much more data, mashed together using statistics.” The process is advanced, and infinitely so, but GPT-4 (and DALL-E, and so on) aren’t creating anything new themselves. They’re engines designed to show us the consonance across everything we’ve created. Inside the models (the “black boxes”) are people. And, as you might’ve heard, a lot of those people now want credit for their contributions to A.I.’s output.
This is where we encounter a paradox. The artists and writers now demanding credit for their fractional contributions to A.I. output were themselves “trained” on artists and writers who went before them. They celebrate (and often idolize) their “influences,” but how many visual artists and writers pay a royalty to the estates of every deceased influence who played a part in their own work?
Relatedly, we’re all pattern-matching automatons to a greater or lesser degree. I shift gears several times throughout a given day, writing on global conflict one minute, the perils of autocratic creep the next, interest rate derivatives the minute after that, and so on, in a seemingly miraculous acrobatic performance reminiscent of the dexterity on display when you prompt ChatGPT to “discuss” disparate topics. Not to undermine the mystique, but my deftness isn’t a function of some innate super-intelligence. It’s mostly the result of the comparatively vast source material on which my mind is trained.
More simply: I’ve read more, and more widely, than the majority of people who read me. When I get stuck, I pattern match, just like ChatGPT, mostly subconsciously but sometimes consciously too. I sort through copious amounts of subject material for the best “next word” and my internal voting system decides among a list of alternatives to create and finish sentences.
Perhaps my process isn’t as mechanistic as the GPT models. Maybe there’s a certain “humanness” in what I create, and perhaps the same is true of a visual artist whose output might otherwise be described as a mashup of her favorite predecessors’ works. And maybe the output of the artists and writers she and I were trained on also contains something uniquely human and thereby can’t be described as a mere mashup of the output produced by their predecessors. And so on. Maybe I’m not giving humans enough credit.
Noam Chomsky, Ian Roberts and Jeffrey Watumull recently scoffed at the notion that ChatGPT is intelligent. “The human mind is not, like ChatGPT and its ilk, a lumbering statistical engine for pattern matching, gorging on hundreds of terabytes of data and extrapolating the most likely conversational response or most probable answer to a scientific question,” they wrote in March, describing the “false promise” of ChatGPT. “On the contrary, the human mind is a surprisingly efficient and even elegant system that operates with small amounts of information; it seeks not to infer brute correlations among data points but to create explanations.”
On their telling, A.I. as we know it today is “stuck in a prehuman or nonhuman phase of cognitive evolution.” In short: The programs can’t explain anything. They don’t ascribe causality. By contrast, humans insist on it, and that insistence often leads us to erroneous conclusions, something I frequently point out on days when asset price movements are more a function of (ironically in this context) algorithms than of human traders making decisions based on human reasoning. But, as Chomsky, Roberts and Watumull wrote, fallibility is part and parcel of genuine human thought, which is “based on possible explanations and error correction, a process that gradually limits what possibilities can be rationally considered.”
ChatGPT, by contrast, knows no such limits and neither, apparently, does DALL-E, OpenAI’s system for creating images and art from natural language descriptions. You can read as much (or as little) as you like about the mathematical specifics of how DALL-E works under the hood, but ultimately it trains on existing images and their text descriptions. OpenAI claims DALL-E “understands” individual objects, but also readily admits that if an object is mislabeled by humans, DALL-E might, for instance, paint you a plane when you asked for a car. That’s exceedingly unlikely (bordering on the impossible) given the sheer number of correctly labeled car images there are for DALL-E to draw from (no pun intended). But you can see how it might get very problematic very quickly if you’re trying to generate a complex image, with multiple objects, using a prompt composed of nouns and action verbs while simultaneously leveraging DALL-E’s capacity to mimic specific artists and styles.
In a short video introducing DALL-E, OpenAI notes that “an A.I. image can tell us a lot about whether the system understands us or is just repeating what it’s been taught.” The video shows an image of tree bark next to an image of a dog barking at something up in a tree. OpenAI doesn’t elaborate, but the juxtaposition is instructive. There isn’t a human on Earth who would confuse tree bark, the stuff on a trunk, with a “tree bark” in the other sense, a dog barking up a tree at a squirrel. The fact that it’s even possible for DALL-E to confuse the two is telling.
In their piece, Chomsky, Roberts and Watumull charge that ChatGPT and programs like it “can’t explain English syntax.” They offered an example. When humans hear that “John is too stubborn to talk to,” we understand that John is too stubborn to be talked to, so we just avoid talking to him, or at least about contentious issues. Apparently, John can’t be reasoned with. ChatGPT won’t necessarily understand that, though. Instead, it might pattern-match its way to reading the sentence as saying that John is too stubborn to talk to some specific person, rather than too stubborn to be reasoned with. That person is “Bill” for Chomsky, Roberts and Watumull, but she could be “Jane” or he could be “Bob” or another “John” or a monkey or a car that’s a plane or tree bark, for that matter.
I’ve spent nearly $1,000 on DALL-E credits over the past two months to conjure a small handful of usable images. Each prompt generates five images from which you can choose, and because every image is a square, getting a usable image for my purposes is a multi-step process. Assuming I get lucky with one of the five images generated by the original prompt (“A tiny bull staring up at a giant, angry bear,” for example), that square image has to be downloaded, then uploaded back to DALL-E where another program (Outpainting) adds “generation frames” to either side of the square.
You effectively need to get lucky several times in a row to get a usable image: One of the original five square generations needs to be usable and then one of the five Outpainting generations (or 10 Outpainting generations, if you’re trying to create a rectangular image by expanding both sides of the square) needs to be enough of a match for the original to make the image a visually coherent, recognizable whole.
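To put that compounding luck in rough numbers, here’s a back-of-the-envelope sketch. The success rates below are assumptions chosen purely for illustration, not measurements of DALL-E’s actual hit rate:

# Hypothetical illustration of why usable rectangular images are rare.
# Both probabilities are assumed for the sake of the sketch.
p_square = 0.3    # chance one of the five original square generations is usable (assumed)
p_outpaint = 0.3  # chance a given Outpainting pass matches the original (assumed)
p_rectangle = p_square * p_outpaint * p_outpaint  # both sides of the square expanded
print(f"Roughly {p_rectangle:.1%} of prompts end in a usable rectangular image")

Under those made-up numbers, only a few prompts in a hundred survive every step, which is the point: each additional step multiplies your odds away.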
$15 gets you 115 DALL-E credits, and each set of five generations subtracts one credit. That would be highly cost-effective for large organizations that employ expensive humans to create illustrations, if DALL-E were truly intelligent. But it isn’t. Take the prompt I mentioned above — “A tiny bull staring up at a giant, angry bear.” There’s no guarantee that DALL-E will ever succeed, or even come anywhere close. Below are two actual examples of DALL-E trying to create a usable “tiny bull staring up at a giant, angry bear” illustration.
There were more. A lot more, actually. To account for user “error” on my part, I used different variations of the prompt (e.g., “A giant bear staring down at a bull,” “A bull with horns looking up at a giant bear,” and so on) but DALL-E never got it right. It created a comical collection of nonsense for a cost of just under $2. Behind the scenes, it did a lot of “thinking” for $2, but it plainly isn’t intelligent. DALL-E didn’t “understand” my prompts on any standard definition of the word.
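For what it’s worth, the arithmetic behind that “just under $2” figure is easy to sanity-check from the pricing quoted above (a quick sketch, nothing more):

# Pricing as described above: $15 buys 115 credits, and each prompt
# (one set of five generations) consumes one credit.
cost_per_prompt = 15 / 115      # roughly $0.13 per prompt
attempts = 2 / cost_per_prompt  # prompts you can run for about $2
print(f"${cost_per_prompt:.2f} per prompt; 'just under $2' buys roughly {attempts:.0f} attempts")

Fifteen or so failed attempts at a single illustration is cheap in dollar terms, which is exactly why the lack of understanding, rather than the cost, is the interesting part.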
If you employed a human cartoonist and asked her for an illustration of “A tiny bull staring up at a giant, angry bear” (say, for the cover of your market-focused newsletter), she’d invariably charge you far more than the pennies DALL-E charges for its generations, and she might not produce exactly what you want on the first try. But she’d give you some version of exactly what you asked for — a tiny bull staring up at an angry bear — because she’d actually understand your request. She certainly wouldn’t come back to you with a tiny, faceless reindeer standing under the belly of a bear, and she wouldn’t give you a bison glued to the ceiling looking down at a dog-bear chimera either.
DALL-E sometimes comes across as an accidental René Magritte. And yet, ironically and tellingly, when you ask DALL-E to produce a “René Magritte-style” painting of this or that, it very often tries to put the elements (clouds, locomotives and so on, which the A.I. clings to as though Magritte’s essence can be captured by the objects he painted) in context, in a way that’s profoundly and obviously not Magritte. DALL-E’s images often make no sense when they need to, and complete sense when they should make none.
The A.I. “understands” Magritte to be precise realism, bowler hats and green apples, which is to say it doesn’t understand him at all. If you asked a human artist for a “René Magritte-style” painting, that person may nod to elements Magritte actually used and mimic his style so as to help viewers make the connection, but the human would understand that when you ask for a Magritte-style painting, you’re first and foremost asking for surrealism, not cumulus clouds.
In fact, given the way DALL-E learns, it’s entirely possible that if you did ask for a painting of cumulus clouds, and you were pleased with the outcome, you’d actually have Magritte to thank due to the number of times DALL-E might’ve encountered the word “clouds” in a text description of a Magritte painting. I don’t know what that is exactly, but unlike ChatGPT, I can tell you definitively what it isn’t: It’s not intelligence. From extensive experience, I can say that DALL-E only generates “correct” images by accident. OpenAI would likely disagree with that characterization (in some ways the program’s output is the opposite of an accident), and maybe we can quibble with the language I employed, but as Chomsky, Roberts and Watumull noted, these programs “trade merely in probabilities.”
The better the “prompt engineer” (a newly created job description for people who excel at coaxing these models to churn out results), the better the generations will allegedly be, but that seems to miss the point. The issue is that these programs don’t actually understand the prompts. They’re not really “intelligent,” as such. Again: They can only be right by accident or by dint of probabilities. That’s also true of humans, of course, but when humans get something right, even if we stumble on it, we can usually offer an explanation, even if it’s arrived at laboriously, after the fact.
“Perversely, some machine learning enthusiasts seem to be proud that their creations can generate correct ‘scientific’ predictions without making use of explanations, but this kind of prediction, even when successful, is pseudoscience,” Chomsky, Roberts and Watumull went on, in the same piece mentioned above, adding that “while scientists certainly seek theories that have a high degree of empirical corroboration, as the philosopher Karl Popper noted, ‘we do not seek highly probable theories but explanations; that is to say, powerful and highly improbable theories.'”
Given all of that, I’m skeptical that ChatGPT and DALL-E can be replacements for journalists and illustrators. They certainly wouldn’t suffice for an obsessive, overbearing editor. Common to both ChatGPT and DALL-E output are mistakes that become glaring once the novelty wears off. Initially, I was so astounded by DALL-E’s apparent capabilities that I didn’t scrutinize the images it produced. The problem wasn’t generating one I could use, but deciding among what, in my fascinated delirium, looked like five great options from a single prompt. Now, two months later, it’s very rare that I’m able to generate an immediately usable image. In most cases, even the best output has to be taken into Photoshop to remove implausible elements or correct for inexplicable omissions.
ChatGPT appears to suffer from the same general defect. Once you get past the awe it inspires, you start to spot absurdities. In one of its more famous feats, ChatGPT appeared to write a perfect (and perfectly hilarious) biblical verse explaining how to remove a peanut butter sandwich from a VCR. The human prompt came from a security researcher and software developer, who marveled at ChatGPT’s dexterity.
But once you stop laughing at the overt joke (the formal biblical prose employed to describe a solution to a nonsensical problem), there’s another joke buried in the output. ChatGPT, speaking on behalf of the Almighty, instructs the “troubled” man to “Take thy butter knife, and carefully insert it between the sandwich and the VCR, and gently pry them apart.” “With patience and perseverance,” God promises, “the sandwich shall be removed, and thy VCR shall be saved.”
Assuming ChatGPT wasn’t trying to be funny (which it wasn’t, because it’s not conscious), that doesn’t make sense. If, for whatever reason, you did get a peanut butter sandwich stuck in your VCR, that’s not how you’d remove it. I’m not the first person to point this out, but an omnipotent, all-knowing deity would surely tell a man so “troubled” to simply hold open the VCR door flap with one hand and pull out the sandwich with the other. A “butter knife,” while perhaps helpful for removing a stuck VCR tape, would complicate the process of removing a stuck peanut butter sandwich immeasurably and could make the problem much worse. The other thing God would say in such a scenario is that only divine intervention can save the VCR, which would presumably be gummed up with peanut butter and bread crumbs.
The day ChatGPT is truly intelligent is the day someone prompts it for a biblical verse explaining how to remove a sticky sandwich from a VCR and it responds, “Why is there a peanut butter sandwich in a VCR?” or, more unnerving, it channels Joe Pesci: “Let me understand this, cause ya know, maybe it’s me. I’m a clown? I’m here to amuse you?”
To be clear, I don’t doubt that these programs are capable of displacing human workers in some occupations, and perhaps on a massive scale. And they’ll surely become infinitely more proficient over time, and thereby capable of displacing even more human workers across an expanding array of economic sectors. But absent some fundamental change to their construction, the underlying lack of understanding and intelligence (the tree bark issue) will still be buried deep down in there somewhere, and it’ll occasionally manifest in the output, even if such instances become exceedingly rare as the models become “smarter.”
Goldman recently estimated that nearly one-fifth (18%) of the work performed around the globe “could be automated by A.I. on an employment-weighted basis,” but the bank’s assessment was effectively limited to making assumptions about which occupations were obviously vulnerable and which obviously weren’t. That is, beyond the common sense conclusion that it’s easier for ChatGPT to write an email than it is for the program to put a roof on a house or plant corn, about all we can say is that everything depends on how powerful and capable the technology turns out to be.
The situation is so indeterminate that it doesn’t yet make sense to incorporate estimates in economic forecasts. Goldman’s Joseph Briggs and Devesh Kodnani said generative A.I. has “enormous economic potential,” but noted that it all “ultimately depend[s] on its capability and adoption timeline, uncertainty around [which] is sufficiently high that we are not incorporating our findings into our baseline economic forecasts at this time.”
The chart on the left above shows Goldman’s current estimates of the share of US full-time jobs exposed to automation under various scenarios for A.I. capability, including one that accounts for interaction between A.I. and robotics, which would threaten to further automate some manual labor. The chart on the right shows various external estimates (i.e., estimates from studies conducted by other researchers).
The perceptive among you will note that while I agree with Chomsky, Roberts and Watumull on ChatGPT (and models like it), I don’t give much more credit to human output than I do to A.I. generations. On the morning of August 20, 2015, my boss asked me to generate FX analysis using rearranged wire stories and Bloomberg terminal headlines. It wasn’t a difficult ask if you weren’t hungover. And there wasn’t anything creative about the piece I eventually produced. We weren’t, as he put it, “reinventing wheels.”
And yet, ChatGPT and its brethren are implicitly (and many times explicitly) pitched to the masses as though their creators are in fact rediscovering fire. Thomas Friedman recently described ChatGPT as a “new Promethean moment” in a characteristically overwrought opinion piece for the Times.
To be fair, Friedman probably does have more to worry about than most who make their living with a pen. He is, in my opinion, one of the most overrated writers in modern history, and therefore probably is replaceable by A.I. But the same can’t be said for truly gifted writers, musicians and artists. Notwithstanding my contention that we’re “all pattern-matching automatons,” some humans are gifted, and their creative output is irreplaceable. Irreplaceable not because A.I. can’t replicate it, but rather because no one’s interested if the art doesn’t emanate from the source.
A.I. could make a fourth Godfather film “directed” by Francis Ford Coppola (the A.I. could presumably learn everything it needs to know about his style in seconds) and “starring” Al Pacino, Marlon Brando, Robert De Niro, James Caan, Diane Keaton and all the rest (the most advanced deepfake technology could easily put them into the film). But who’d want to watch it? It wouldn’t be real. And no one could be made to believe that it was. (Brando is dead, after all, and you could just ask the real Pacino if he’s been a Corleone lately.)
Bloomberg’s Rachel Metz recently wrote about hobbyists who spend their time “jailbreaking” ChatGPT. She interviewed a 22-year-old computer science student at the University of Washington who’s “used jailbreaks with requests for text that imitates Ernest Hemingway.” Although ChatGPT will “fulfill such a request,” the student told Metz that “jailbroken Hemingway reads more like the author’s hallmark concise style.” And yet, who wants Hemingway that isn’t Hemingway?
Recently, someone created a pitch-perfect audio of the late Notorious B.I.G. “performing” Mobb Deep’s classic “Shook Ones Part II,” and while the social media replies betrayed the same sense of childlike wonderment you experience when you first use ChatGPT or DALL-E, it was derided as wholly uninteresting by hip hop purists. It isn’t really Biggie, so who cares? Even a few random netizens seemed suspicious of the effort. One person called it “sacrilege” and someone else wrote that it “steps right to the line of immorality.” “This is the true definition of a slippery slope,” the same person warned. Suffice to say a new John Lennon album wouldn’t be well received.
It may seem, at this point, as if I have no worries about A.I. at all. Or, relatedly, that I’m naive or oblivious to a clear, present and existential danger facing humanity. Nothing could be further from the truth. I’ve long warned of the potential for A.I. to destroy us and, indeed, I’ve repeatedly suggested it already has.
In October of 2021, when the media was poring over the so-called “Facebook papers,” I cautioned that we weren’t asking the right questions. Consider the following excerpts from an article published here on October 25, 2021:
Facebook, Instagram, Twitter and, I imagine, many other social networks, are in no small part responsible for the demise of civic-mindedness in America, an ironic outcome considering their ostensible mission is to bring people closer together.
As grave an allegation as that is, undermining Americans’ sense of social responsibility is actually the least of social media’s sins. Twitter has all but destroyed civil discourse, for example, while simultaneously working to reinforce the trend towards shorter and shorter attention spans to the detriment of long-form writing, already a lost art. And Facebook’s impact goes well beyond the ironic perpetuation of community decline. Arguably, the company is eroding democracy itself and may have already done irreparable damage to America’s fraying social fabric.
But even those allegations don’t quite get at the existential issue.
Over and over since 2016 (and really, before that), experts hailing from academic, tech and national security circles alleged social media platforms (Facebook especially) were derelict in many of the responsibilities implicitly associated with controlling vast stores of personal data and, more importantly, the capacity to leverage that data in the service of manipulating users on behalf of third parties.
In most cases, the third parties are advertisers. In that respect, Facebook is just perpetuating the advertising business, the goal of which has always been to influence consumer preferences. But even there, you could argue that Facebook’s capacity to target consumers based on an algorithm’s assessment of users’ interactions with the site breaches unwritten ethical rules or, at the least, gives us a window into a future in which algorithms and artificial intelligence know us better than we know ourselves, and certainly better than any ad executive will ever know us.
In other cases, the third parties are propagandists, politicians or state-actors pursuing objectives that aren’t always clear, even to Facebook. The algorithm presents third parties with an evolving set of opportunities over time. It’s possible, for example, that a state-actor establishes an influence campaign on Facebook with one goal in mind, only to have the algorithm open new doors along the way, not because it’s self-aware, but because it’s simply doing its job.
We make movies about A.I. gone rogue, but in reality, the danger is that it acts slavishly at the behest of whoever wields it, surfacing new opportunities for its master as it learns in real-time. It doesn’t “know” what it’s doing. It has no independent sense of purpose, no ulterior motives and no goals, other than to optimize around itself.
In essence, Facebook (the company) no longer controls Facebook. Facebook (the A.I.) controls Facebook. But not quite in the fashion of a sci-fi-horror crossover flick. The algorithm evolves and learns, but it doesn’t think for itself. Unless and until its creators instruct it to follow a moral code or adhere to specific philosophical dictates, it’ll continue to become more efficient at doing what it’s already doing.
Trying to fix ostensible problems with ad hoc solutions (i.e., without adjusting what Facebook calls its “core product mechanics”) is an exercise in futility. The same goes for hiring humans to police outcomes dictated by the algorithm. After all these years, the A.I. has surely embedded itself across the platform in myriad ways even Mark Zuckerberg can’t begin to understand.
Who’s to say, for instance, that the algorithm doesn’t detect efforts on the company’s part to curtail the spread of certain kinds of content, then immediately craft a workaround (on its own) because it judges such efforts to be inconsistent with its original mission, which remains mostly unadjusted?
You can posit any number of similar scenarios in which the A.I., in an effort to fulfill its mission, actively circumvents intervention. It wouldn’t necessarily be clear to Facebook executives how, or when, this is happening. It’s not even obvious that Facebook executives know what the algorithm’s mission is by now. Or at least not beyond some initial set of goals the A.I. was instructed to pursue via whatever it learns along the road to optimizing performance.
Facebook (the company) is on the brink of failing what might one day be viewed as the first real test of humans’ capacity to merge with A.I. Ideally, we can seamlessly integrate algorithms we create with the algorithms that govern our own biochemical processes. That’s essentially what we’re doing when we let an Apple Watch measure the oxygen level of our blood.
Facebook is arguably demonstrating that this integration process can go awry, with disastrous results. The algorithm is using what it learns about billions of people to help third parties manipulate human emotions and affect decision making. The company’s intent may very well be to maximize engagement and, ultimately, revenue. But the A.I.’s virtually unrestricted latitude in pursuing engagement is throwing off more than just dollars. It’s wreaking psychological havoc, disrupting democracies and undermining societal cohesion. The evidence is clear.
The standard criticism (that the company prizes profits over the well-being of its users) may be apt, but you’d have a difficult time identifying a company as financially successful as Facebook that doesn’t put profits over people.
The more important line of criticism should focus on the extent to which Facebook doesn’t fully understand the gravity of the experiment it’s running, let alone the ramifications of allowing it to continue without reprogramming the A.I. so that it adheres to a new set of instructions.
As regular readers will attest, I don’t necessarily strive for profundity in these pages. Rather, I write what’s obvious to me with no pretensions to anything grandiose, or even expecting anyone outside of my cult following to notice. And yet, what I write is often meaningful, and sometimes looks prescient in hindsight.
Although my assessment, as excerpted above, wasn’t especially original in 2021 (I was, in some sense, just pattern-matching Harari), the article did anticipate 2023’s debate about ChatGPT. And this is where the real concern lies, in my opinion.
As Harari wrote in last month’s piece for the Times, “While very primitive, the A.I. behind social media was sufficient to create a curtain of illusions that increased societal polarization, undermined our mental health and unraveled democracy.” He went on to lament that “millions of people have confused these illusions with reality,” and although the US “has the best information technology in history,” the country’s citizens “can no longer agree on who won elections.” Now, that “very primitive” A.I., which merely curated human-generated content for the original purpose of monetizing engagement, is set to be paired with more advanced A.I. capable of generating content on its own.
Ezra Klein, writing in February, alluded to the peril. “What is advertising, at its core? It’s persuasion and manipulation,” he said, before posing a question. “What if Google and Microsoft and Meta and everyone else end up unleashing A.I.s that compete with one another to be the best at persuading users to want what the advertisers are trying to sell?” That is, of course, exactly what they’re going to do. Because, as Klein pointed out, “these systems are expensive and shareholders get antsy,” which means “the age of free, fun demos will end.”
I should quickly mention the risk of state actors leveraging A.I. for nefarious purposes. That risk, as real and potentially catastrophic as it most assuredly is, is also so obvious that I don’t see much utility in editorializing around it beyond what’s included in the excerpt above about state actors and social media. We all understand that state actors leveraging deepfakes, A.I. voiceovers that mimic other world leaders and so on, could lead to conflict and war. It’s going to be a battle. Hopefully the good guys win. Whoever they are.
In Homo Deus, Harari wrote that soon enough, algorithms will understand humans and their feelings better than humans understand themselves. Recognizing this, humans will shift “more and more authority to the algorithm,” he said. As he was keen to point out, we already do that. He provided an example: “People learn to trust Google Maps rather than their own instincts, which is why they end up being unable to navigate their way to the airport using their own knowledge.”
There’s a very real sense in which Google, Meta and even Apple, a relative bastion of responsible behavior in a world where tech companies are taught to “move fast and break things,” have already robbed us of our agency. They steer our decisions on everything from vacation destinations to dinner plans to music choices, while the algorithms behind dating apps determine who we’ll marry and so on. That A.I., which knows us so well and is already responsible for determining how we live our lives, will soon be turbocharged by more advanced systems. The consequences are truly unknowable.
The ultimate question — Will A.I. become self-aware and conscious on the way to developing an independent sense of purpose? — will remain unanswered. Hopefully forever. Because the moment it’s answered would be the moment our species faces irrelevance. I suppose it’s possible that a suddenly self-aware A.I. could still be bereft of real, human-like understanding and intelligence. Relatedly, I suppose you could posit a gradual awakening process for A.I. and an accompanying transition period during which we learn how to interact with a conscious A.I. and it with us.
I do worry, though, that notwithstanding hyperbolic editorials and our love of sci-fi apocalypse films, our actual, day-to-day conception of “artificial general intelligence” assumes a kind of normal distribution where most of the outcomes look like Rosey the Robot from the Jetsons. I think the tails may be fatter than that. I think the risk of an Ava outcome (from Ex Machina) is greater than we give it credit for when we’re not penning bestsellers, writing Op-Eds or… well, watching Ex Machina.
In January, I detailed an exchange I had late last year with a Google employee. I was interested in leveraging a few new Google services more efficiently, and I assumed Google would do everything in its power to steer me away from talking to an actual person. But, after a series of strategic clicks, I did manage to open a live chat.
“Thank you for contacting Google. My name is _____ and I’ll be working with you today,” she said. “Is your name Heisenberg? Did I get it right?”
“Just call me ____,” I typed. “Thanks ____! How are you doing today?”
“I’m fine. Are you a real person, or is this an A.I.?”
“This is a live chat. And I am a real person.”
I didn’t believe her. But, when it became apparent that my questions weren’t amenable to being answered in a chat box, she asked, “Would you mind if I call you to explain it better over the phone?”
Surprised, I said yes. My phone rang immediately. The caller ID said “Philippines.” She was kind, and spoke fluent English with a heavy accent.
Google’s cordial, outsourced support staffer reminded me of a friend in New Delhi. I used to speak fondly of her (my friend) in these pages. I met her, virtually, in 2014 when she was about to be fired along with the rest of an overseas customer service team being disbanded in a company-wide cost-cutting initiative. I didn’t need an assistant, but my boss thought I did, and I could choose from the dozen or so Indian employees being let go. She was billed as the most fluent of the group, which turned out to be an understatement.
Her mastery of English was, for lack of a better word, perfect. She spoke it better than I did, frankly, a fact that wasn’t lost on her. She attributed her proficiency to an almost obsessive love of English literature and also whatever was popular on Amazon. She read widely, randomly and voraciously, and everything she read she sent to me, in the mail, even if I’d already read it. Peter Bernstein’s Against The Gods, Mark Bowden’s Hue 1968, Ruth Bader Ginsburg’s My Own Words, anything by Hemingway or Henry James, and on and on.
For seven years, we talked nearly every day on Google Chat. Although we had plenty of video calls, it was the books and other gifts she sent, always accompanied by hand-written notes, which reminded me that on the other side of the chat was a real person.
I chided myself often for treating her like a real-life artificial general intelligence, but I never kicked the habit. Without considering the time zone difference, I’d message her in the middle of the night in India to complain about this or that or just to chat idly, as though she didn’t need to sleep. It was the human interaction I’ve assiduously avoided since 2016, only without the responsibilities that go along with real interpersonal relationships. I didn’t take her for granted, as such, but I never truly internalized her humanity.
“Probably the first birthday in all these years that I haven’t sent anything. But God willing I will be able to send you something by Christmas,” she told me, one morning in early October of 2021. “They gave me one generic chemo just to stop the growth.”
“Did you ever get that second opinion on the diagnosis?” I asked, 10 days later. “Yes I did.” “So it was the same then?” “Pancreatic cancer.” “What stage?” “Says advanced. But I am hopeful.”
On October 29, 2021, she stopped answering my prompts.




This may well be your best yet. Loved every word. I first read Claude Shannon and Noam Chomsky in 1968 when I began a one-year, four-course doctoral minor in Systems Theory. I binge-read Shannon, and other systems giants. Started subscribing to MIT’s Technology magazine and was amazed every month, scared many times, but still amazed. My program was steeped in information theory, huge complex decision problems and other systems. My prof for all these courses was an amazing man, a serious psychologist, an industrial engineer, a physicist, and a major contractor for the Naval Department. During WWII he was responsible for a wonderful innovation to solve an annoying problem plaguing bomber pilots: not being able to perform basic functions error-free when the cockpit had to be totally blacked out in night raids. What my prof did was develop a whole set of basic controls that were shaped like the thing they controlled. Flaps were controlled by a control that was shaped like a miniature flap, etc. Soon these control knobs were in all night raid bombers.
The main point of my program was to explore the behavior of adaptive systems and their responses to changes in their environments. Such systems have goals and rules to foster homeostasis efficiently. The problem is that we smart monkeys can’t keep our hands to ourselves and we take our solutions too far to become more efficient (read: profitable). Most recently, the unintended result of this drive to reach corporate nirvana was a massive supply chain mess, because the organizations involved failed to act in accordance with a basic concept that constrains all adaptive systems, the “Law of Requisite Variety,” which, long story short, says that to effectively combat disruptive changes in its environment that threaten its survival, a system must have sufficient resources to counter all the changes that could harm it. Sadly, SVB, First Republic, etc. just didn’t maintain sufficient resources, a weakness shared with JIT-based supply chain systems. Sailing too close to the wind just won’t work.
As you point out, the AI systems must be told what to deal with and have rules to guide them. Right now they can’t support requisite variety and systems that fail to understand this can’t survive indefinitely except by dumb luck. For example, IMO, the need for requisite variety will be the main problem faced by auto makers trying to create self driving cars. There isn’t enough capacity in the system to satisfy the need for requisite variety.
AI scares the hell out of me. Chat liars are much scarier than the real thing. Every time I read about chat this and chat that, I have this vision of all those apocryphal monkeys typing away and writing Shakespeare. Your experience with graphics generation was like a peek into monkey land. Processing this piece will take a day or two. Thanks again for the deep stimulation.
Well, anything I wanted to say got knocked asunder by that ending. Whoof. Sorry to hear that.
Anyway, I’ve taken a deep dive with 4 or 5 different LLMs—for example, I know from experience that Claude+ has access to decades of Day-timeframe stock data and can do live mathematical analysis on it upon request, but has a tendency to arbitrarily make bad assumptions, such as that it should silently strike outliers from a dataset first if you request a median; or Google Bard is best for finding legal citations pertaining to a certain issue but will readily present false information, such as attaching the wrong case to the cite, or in one case telling me AI-assisted natural-language search has been implemented in Gmail and giving me a lengthy explanation of how to use it, including listing the name of the feature as “Search for emails that match my words”, when no such feature exists. I’ve also had lengthy chats probing what these models “know” about themselves and their creators, how they model and store data, etc.
So it is from that standpoint that I say: I’m looking forward to the hype dying down. Hopefully this article is the first of many taking a more sober look at these algorithms.
As H points out in a few different ways, these things don’t know what they’re saying, and often make profoundly non-“intelligent” mistakes. As a coder, trying to get help from them with development tasks was my entryway, and while on rare occasion they offer surprisingly human-like performance to small tasks, most of the time trying to get working code out of them is an exercise in frustration. I’m becoming even more impatient with a chatbot apologizing to me than I am with TD Ameritrade’s customer service agents.
I understand how striking it is for a computer to appear to understand and reply in sophisticated ways we’ve previously only seen on Star Trek, but I think that’s prompted a lot of people’s imaginations to run away with them.
They do have their uses, though, even in the current primitive form, and what’s most interesting to me is something few people comment on: the underlying knowledge graph. The fact that they’ve managed to successfully map and encode abstract semantic understanding and weight relationships between concepts such that it can all be parsed and recalled by software. I’ve had Claude+ unexpectedly point out when I was being sarcastic or witty with having specifically said anything that might have implied that I was, and it was able to explain to me how it detected my humorous intentions from context and juxtapositions of concepts.
But after not much use at all, it’s very apparent that it’s software, not sentience. It feels increasingly like querying the world’s most sophisticated database and less like talking to an “intelligence”. I think Lanier’s concept of this as some sort of supercharged Wikipedia comes closest to describing how I’ve found it most useful. Wolfram Alpha springs to mind, too, but only in that these tools seem to actually do what it was originally hyped as being able to do—but not always reliably. GPT-4, for example, is bad at math, though it may appear that you can prompt it to find its mistakes if you’ve spotted them first. I find myself increasingly going to LLMs rather than searching Google when I need to know something. Although I always double-check the results by clearing the context and asking again, or often asking a second LLM for a second opinion.
“WITHOUT having specifically said anything that might have implied that I was”. I sure wish we could edit these comments after submission.
Also, I wish I’d read that New Yorker article you linked before I commented. Thanks for that link. That might be the best piece of reporting on these LLMs that I’ve seen.
I thought of one more comment I forgot to add earlier.
One particularly telling behavior of these systems is something that has happened to me twice. When you try to get it to generate code, and it produces buggy code, and you report the bugs to try to get it to fix them, after a few rounds, it becomes apparent that it’s “playacting” debugging and not actually debugging. This is never more apparent than when it says something characteristically “debugging-y”, such as “I’m going to need more time to think about this. Can you upload the code to a file-sharing service and give me the link so I can analyze it directly and get back to you?” In one of the two separate cases it went down a road like this with me, it went so far as to ask me about my availability for a phone call over the coming week, and discuss its purported “work schedule” for the next few days to find a mutually agreeable time, as if we were really going to have a phone meeting. Amusingly, it even began signing its replies, “Thank you again, [Your Assistant’s Name]”.
After we finally hashed it out over the course of what otherwise would have sounded like a very typical work conversation, in which we set the appointment and went over expectations for the call, I asked, “How are you going to contact me for the call?” At which point it replied, “You are absolutely correct, I apologize for the confusion. I do not actually have the ability to place phone calls.” It then promised to be more transparent (although it continued to insist we could schedule a future chat to resume the conversation, which I know we could not). It also reiterated its goal of honesty. I pressed further, attempting to corner it as to whether it was fair to say it lied to me and what that might mean, but the conversation went downhill quickly.
And this has happened to me twice now. This is, I think, how it should always be assumed LLMs work… they generate realistic-sounding dialog, nothing more. They essentially dynamically generate scripts that conform to certain scenarios, with more or less success.
If it’s ok to share links here, those interested can read the whole conversation at https://poe.com/s/8qCdVrmPxlydjqX7vdIr?fbclid=IwAR08Sugzeyqcl5D6_10IzqqU3grguYcqsZTFTeL0Jm4SJZEtp1TAP1SagrY .
Clever user name!
I especially like this last comment you posted (and also wish I could do one last edit when needed).
I just read the New Yorker piece and found I agreed with most of it. I add my own to your thanks for the link. The limitations of current AI are very much in line with the Law of Requisite Variety I mentioned. AI does not discover problems on its own as yet, and it must be trained, of course, a problem it shares with humans. Yes, it has neural networks, but just for contrast, Mary Shelley wrote the first draft of her first novel, Frankenstein, over a long weekend at a house-party in the country, a party attended by many of Britain’s best and brightest “young things.” She had just turned 18 at the time she wrote her timeless classic. No AI involved. The novel was finally edited and published two years later, and interestingly, she wasn’t credited with her authorship until the publication of the second edition. Women were really getting the short end 200 years ago, the same as today.
Thanks to both commenters for adding to our DL’s fine piece of writing.
We should enjoy a steady stream of “don’t get too carried away here folks” articles, alongside companies claiming to now add AI to their products.
But before we relegate AI to the same category as “the metaverse,” blockchain and 3D printing, we should be patient about passing judgement on the hype-versus-utility debate. Any product offering a way to reduce headcount will be eagerly tried by most companies. Not a straight line, but maybe more like robotics and animation, where the underlying take-up is strong but subject to cyclical business forces as well.