Faux ScarJo and the Descent of the A.I. Vultures
On May 13th, during a live event, the artificial-intelligence company OpenAI unveiled the next generation of its technology, GPT-4o, the successor to GPT-4. When OpenAI first released its product to the public in late 2022, as the text-based tool ChatGPT, it nearly single-handedly ushered in the A.I. era. The latest version is far more powerful still. The “o” in the name stands for “omni”; the model can communicate seamlessly across various forms of media at once, including text, audio, and video, receiving prompts in one medium and responding in another. It can maintain a memory of everything you tell it. Most strikingly, it can talk out loud to you in real time. The voice assistant featured in the demo, making up bedtime stories and analyzing facial expressions, sounded, as many observers noted, a lot like the A.I. companion in the 2013 film “Her,” played by Scarlett Johansson. After the event, Sam Altman, the C.E.O. of OpenAI, gnomically posted the film’s title on X, the Web site previously known as Twitter. Then, without much explanation, the company removed the voice from its app. We found out why on Monday, when the actual Scarlett Johansson released a statement explaining that OpenAI had approached her about licensing her voice, and that she had turned the company down. “When I heard the released demo, I was shocked, angered and in disbelief that Mr. Altman would pursue a voice that sounded so eerily similar to mine,” Johansson wrote. (In a response released Monday night, Altman maintained that “the voice of Sky is not Scarlett Johansson’s” and “was never intended to resemble hers.”)
The snafu might seem funny if it didn’t portend a larger crisis for the integrity of digital information in the age of A.I. Google also hosted an event last week, its annual developer conference on May 14th, to introduce its latest round of A.I. additions under the label of Gemini. Watching both tech brands show off their new tools, I felt only a sense of dread. The two companies are racing to put into place a future of the Internet in which A.I. plays the role of an eager but not entirely expert intern, collating research and presenting an only semi-trustworthy overview of content based on users’ inquiries. You can determine the quality of any given answer only if you check its work. What leaves me so depressed is the fact that Google and OpenAI are training their machines using the Internet’s decades-old trove of material with no apparent concern for the sources of that material—that is, the people who did the work of putting it online in the first place, the minds and faces and voices which generated it.
OpenAI is leaning into the idea of the Internet as a rounded, humanoid personality. The faux-ScarJo voice the company demoed, nicknamed Sky, is low, warm, a little flirtatious, and prone to breaking into giggles. It jokes, pauses, hmms, and can increase the drama of its delivery on demand. (Alternative persona options named Breeze, Cove, Ember, and Juniper sound less blatantly feminine.) By comparison, the original text-based ChatGPT is as charming as a calculator. The vocal element places OpenAI in a territory currently occupied by startups such as Replika and Character.AI, which offer A.I. companions. But, whereas those other companies are selling the semblance of emotional connection, OpenAI is using the same approach with the promise of delivering reliable information. The problem is that A.I. is adept at the former but still mediocre at the latter. What we are left with is a tool that sounds far more convincingly intelligent than it is.
To bring about its hypothetical future, OpenAI must build a new digital ecosystem, pushing users toward the ChatGPT app or toward preëxisting products that integrate its technology, such as Bing, the search engine run by OpenAI’s major investor, Microsoft. Google, by contrast, already controls the technology that undergirds many of our online experiences, from search and e-mail to Android smartphone operating systems. At its conference, the company showed how it plans to make A.I. central to all of the above. Some Google searches now yield A.I.-generated “Overview” summaries, which appear in tinted boxes above any links to external Web sites. Liz Reid, Google’s head of search, described the generated results with the ominously tautological tagline “Google will do the Googling for you.” (The company envisions that you will rely on the same search mechanism to trawl your own digital archive, using its Gemini assistant to, say, pull up photos of your child swimming over the years or summarize e-mail threads in your in-box.)
Nilay Patel, the editor-in-chief of the tech publication the Verge, has been using the phrase “Google Zero” to describe the point at which Google will stop driving any traffic to external Web sites and answer every query on its own with A.I. The recent presentations made clear that such a point is rapidly approaching. One of Google’s demonstrations showed a user asking the A.I. a question about a YouTube video on pickleball: “What is the two-bounce rule?” The A.I. then extracted the answer from the footage and displayed it in writing, thus allowing the user to avoid watching either the video or any advertising that would have provided revenue to its creator. When I Google “how to decorate a bathroom with no windows” (my personal litmus test for A.I. creativity), I am now presented with an Overview that looks a lot like an authoritative blog post, theoretically obviating my need to interact directly with any content authored by a human being. Google Search was once seen as the best path for getting to what’s on the Web. Now, ironically, its goal is to avoid sending us anywhere. The only way to use the search function without seeing A.I.-generated content is to click a small “More” tab and select “Web” search. Then Google will do what it was always supposed to do: crawl the Internet looking for URLs that are relevant to your queries, and then display them to you. The Internet is still out there; it’s just increasingly hard to find.
If A.I. is to be our primary guide to the world’s information, if it is to be our 24/7 assistant-librarian-companion as the tech companies propose, then it must constantly be adding new information to its data sets. That information cannot be generated by A.I., because A.I. tools are not capable of even one iota of original thought or analysis, nor can they report live from the field. (An information model that is continuously updated, using human labor, to inform us about what’s going on right now—we might call it a newspaper.) For a decade or more, social media was a great way to motivate billions of human beings to constantly upload new information to the Internet. Users were driven by the possibilities of fame and profit and mundane connection. Many media companies were motivated by the possibility of selling digital ads, often with Google itself as a middle man. In the A.I. era, in which Google can simply digest a segment of your post or video and serve it up to a viewer, perhaps not even acknowledging you as the original author, those incentives for creating and sharing disappear. In other words, Google and OpenAI seem poised to cause the erosion of the very ecosystem their tools depend on.
There are possible solutions to this problem. OpenAI has negotiated licensing deals with several media companies which will provide journalists with some amount of funding—likely far too little—to keep creating the grist that gets fed into the A.I. mills. In interviews, Altman has suggested that A.I. might eventually become a form of universal basic income, in which “everybody gets a slice.” Perhaps all of Internet-using humanity will one day receive micro-royalties for our small contributions to the digital data trove. Wouldn’t it be wonderful if Google Zero ushered in an era of shared prosperity? More realistically, A.I. companies will keep taking and replicating whatever they can for free, in a rush to create new user habits that might become profitable at some point down the line. In a way, we are all Scarlett Johansson, waiting to be confronted with an uncanny reflection of ourselves that was created without our permission and from which we will reap no benefit. ♦