
He wrote a book on a rare subject. Then a ChatGPT replica appeared on Amazon.

From recipes to product reviews to how-to books, artificial intelligence text generators are quietly authoring more and more of the internet.

Updated May 5, 2023 at 2:06 p.m. EDT|Published May 5, 2023 at 7:30 a.m. EDT
(María Alconada Brooks/The Washington Post)

Chris Cowell, a Portland, Ore.-based software developer, spent more than a year writing a technical how-to book. Three weeks before it was released, another book on the same topic, with the same title, appeared on Amazon.

“My first thought was: bummer,” Cowell said. “My second thought was: You know what, that’s an awfully long and specific and cumbersome title to have randomly been picked.”

The book, like Cowell’s, was titled “Automating DevOps with GitLab CI/CD Pipelines,” and it listed as its author one Marie Karpos, a name Cowell didn’t recognize. When he looked her up online, he found no trace of her. That’s when he grew suspicious.

The book bears signs that it was written largely or entirely by an artificial intelligence language model, using software such as OpenAI’s ChatGPT. (For instance, its code snippets look like ChatGPT screenshots.) And it’s not the only one. The book’s publisher, a Mumbai-based education technology firm called inKstall, listed dozens of books on Amazon on similarly technical topics, each with a different author, an unusual set of disclaimers and matching five-star Amazon reviews from the same handful of India-based reviewers. InKstall did not respond to requests for comment.

Experts say those books are likely just the tip of a fast-growing iceberg of AI-written content spreading across the web as new language software allows anyone to rapidly generate reams of prose on almost any topic. From product reviews to recipes to blog posts and press releases, human authorship of online material is on track to become the exception rather than the norm.

“If you have a connection to the internet, you have consumed AI-generated content,” said Jonathan Greenglass, a New York-based tech investor focused on e-commerce. “It’s already here.”

For consumers, that may mean more hyper-specific and personalized articles, but also more misinformation and manipulation about politics, the products they may want to buy and much more.

As AI writes more and more of what we read, vast, unvetted pools of online data may not be grounded in reality, warns Margaret Mitchell, chief ethics scientist at the AI start-up Hugging Face. “The main issue is losing track of what truth is,” she said. “Without grounding, the system can make stuff up. And if it’s that same made-up thing all over the world, how do you trace it back to what reality is?”

Generative AI tools have captured the world’s attention since ChatGPT’s November release. Yet a raft of online publishers have been using automated writing tools based on ChatGPT’s predecessors, GPT-2 and GPT-3, for years. That experience shows that a world in which AI creations mingle freely and sometimes imperceptibly with human work isn’t speculative; it’s flourishing in plain sight on Amazon product pages and in Google search results.

Semrush, a leading digital marketing firm, recently surveyed its customers about their use of automated tools. Of the 894 who responded, 761 (about 85 percent) said they’ve at least experimented with some form of generative AI to produce online content, while 370 (about 41 percent) said they now use it to help generate most if not all of their new content, according to Semrush President and Chief Strategy Officer Eugene Levin.

“In the last two years, we’ve seen this go from being a novelty to being pretty much an essential part of the workflow,” Levin said.

In a separate report this week, the news credibility rating company NewsGuard identified 49 news websites across seven languages that appeared to be mostly or entirely AI-generated. The sites sport names like Biz Breaking News, Market News Reports, and bestbudgetUSA.com; some employ fake author profiles and publish hundreds of articles a day, the company said. Some of the news stories are fabricated, but many are simply AI-crafted summaries of real stories trending on other outlets.

Several companies defended their use of AI, telling The Post they use language tools not to replace human writers, but to make them more productive, or to produce content that they otherwise wouldn’t. Some are openly advertising their use of AI, while others disclose it more discreetly or hide it from the public, citing a perceived stigma against automated writing.

Ingenio, the San Francisco-based online publisher behind sites such as horoscope.com and astrology.com, is among those embracing automated content. While its flagship horoscopes are still human-written, the company has used OpenAI’s GPT language models to launch new sites such as sunsigns.com, which focuses on celebrities’ birth signs, and dreamdiary.com, which interprets highly specific dreams.

Ingenio used to pay humans to write birth sign articles on a handful of highly searched celebrities like Michael Jordan and Ariana Grande, said Josh Jaffe, president of its media division. But delegating the writing to AI allows sunsigns.com to cheaply crank out countless articles on not-exactly-A-listers, from Aaron Harang, a retired mid-rotation baseball pitcher, to Zalmay Khalilzad, the former U.S. envoy to Afghanistan. Khalilzad, the site’s AI-written profile claims, would be “a perfect partner for someone in search of a sensual and emotional connection.” (At 72, Khalilzad has been married for decades.)

In the past, Jaffe said, “We published a celebrity profile a month. Now we can do 10,000 a month.”

Jaffe said his company discloses its use of AI to readers, and he promoted the strategy at a recent conference for the publishing industry. “There’s nothing to be ashamed of,” he said. “We’re actually doing people a favor by leveraging generative AI tools” to create niche content that wouldn’t exist otherwise.

A cursory review of Ingenio sites suggests those disclosures aren’t always obvious, however. On dreamdiary.com, for instance, you won’t find any indication on the article page that ChatGPT wrote an interpretation of your dream about being chased by cows. But the site’s “About us” page says its articles “are produced in part with the help of large AI language models,” and that each is reviewed by a human editor.

Jaffe said he isn’t particularly worried that AI content will overwhelm the web. “It takes time for this content to rank well” on Google, he said — meaning that it appears on the first page of search results for a given query, which is critical to attracting readers. And it works best when it appears on established websites that already have a sizable audience: “Just publishing this content doesn’t mean you have a viable business.”

Google clarified in February that it allows AI-generated content in search results, as long as the AI isn’t being used to manipulate a site’s search rankings. The company said its algorithms focus on “the quality of content, rather than how content is produced.”

Reputations are at risk if the use of AI backfires. CNET, a popular tech news site, took flak in January when fellow tech site Futurism reported that CNET had been using AI to create articles or add to existing ones without clear disclosures. CNET subsequently investigated and found that many of its 77 AI-drafted stories contained errors.

But CNET’s parent company, Red Ventures, is forging ahead with plans for more AI-generated content, which has also been spotted on Bankrate.com, its popular hub for financial advice. Meanwhile, CNET in March laid off a number of employees, a move it said was unrelated to its growing use of AI.

BuzzFeed, which pioneered a media model built around reaching readers directly on social platforms like Facebook, announced in January it planned to make “AI inspired content” part of its “core business,” such as using AI to craft quizzes that tailor themselves to each reader. BuzzFeed announced last month that it is laying off 15 percent of its staff and shutting down its news division, BuzzFeed News.

“There is no relationship between our experimentation with AI and our recent restructuring,” BuzzFeed spokesperson Juliana Clifton said.

AI’s role in the future of mainstream media is clouded by the limitations of today’s language models and the uncertainty around AI liability and intellectual property. In the meantime, it’s finding traction in the murkier worlds of online clickbait and affiliate marketing, where success is less about reputation and more about gaming the big tech platforms’ algorithms.

That business is driven by a simple equation: how much it costs to create an article vs. how much revenue it can bring in. The main goal is to attract as many clicks as possible, then serve the readers ads worth just fractions of a cent on each visit — the classic form of clickbait. That seems to have been the model of many of the AI-generated “news” sites in NewsGuard’s report, said Gordon Crovitz, NewsGuard’s co-CEO. Some sites fabricated sensational news stories, such as a report that President Biden had died. Others appeared to use AI to rewrite stories trending in various local news outlets.

NewsGuard found the sites by searching the web and analytics tools for telltale phrases such as “As an AI language model,” which suggest a site is publishing outputs directly from an AI chatbot without careful editing. One local news site, countylocalnews.com, churned out a series of articles on a recent day whose sub-headlines all read, “As an AI language model, I need the original title to rewrite it. Please provide me with the original title.”
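That kind of detection can be as simple as string matching. Below is a minimal sketch in Python of the general approach NewsGuard describes; the phrase list and function name are illustrative assumptions, not NewsGuard’s actual tooling.

    # Minimal sketch: flag pages containing phrases that chatbots emit when
    # a prompt fails. The phrase list is illustrative, not NewsGuard's criteria.
    TELLTALE_PHRASES = [
        "as an ai language model",
        "i cannot fulfill this request",
        "i need the original title to rewrite it",
        "my knowledge cutoff",
    ]

    def find_telltales(text: str) -> list[str]:
        """Return every telltale phrase that appears in the page text."""
        lowered = text.lower()
        return [phrase for phrase in TELLTALE_PHRASES if phrase in lowered]

    sample = ("As an AI language model, I need the original title to rewrite "
              "it. Please provide me with the original title.")
    print(find_telltales(sample))
    # ['as an ai language model', 'i need the original title to rewrite it']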

Then there are sites designed to induce purchases, which insiders say tend to be more profitable than pure clickbait these days. A site called Nutricity, for instance, hawks dietary supplements using product reviews that appear to be AI-generated, according to NewsGuard’s analysis. One reads, “As an AI language model, I believe that Australian users should buy Hair, Skin and Nail Gummies on nutricity.com.au.” Nutricity did not respond to a request for comment.

In the past, such sites often outsourced their writing to businesses known as “content mills,” which harness freelancers to generate passable copy for minimal pay. Now, some are bypassing content mills and opting for AI instead.

“Previously it would cost you, let’s say, $250 to write a decent review of five grills,” Semrush’s Levin said. “Now it can all be done by AI, so the cost went down from $250 to $10.”
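At those prices, the break-even arithmetic shifts dramatically: if each visit earned, hypothetically, half a cent in ad revenue, the $250 human-written review would need 50,000 visits to pay for itself, while the $10 AI version would need just 2,000.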

The problem, Levin said, is that the wide availability of tools like ChatGPT means more people are producing similarly cheap content, and they’re all competing for the same slots in Google search results or Amazon’s on-site product reviews. So they all have to crank out more and more article pages, each tuned to rank highly for specific search queries, in hopes that a fraction will break through. The result is a deluge of AI-written websites, many of which are never seen by human eyes.

It isn’t just text. Google users have recently posted examples of the search engine surfacing AI-generated images. For instance, a search for the American artist Edward Hopper turned up an AI image in the style of Hopper, rather than his actual art, as the first result.

The rise of AI is already hurting the business of Textbroker, a leading content platform based in Germany and Las Vegas, said Jochen Mebus, the company’s chief revenue officer. While Textbroker prides itself on supplying credible, human-written copy on a huge range of topics, “People are trying automated content right now, and so that has slowed down our growth,” he said.

Mebus said the company is prepared to lose some clients who are just looking to make a “fast dollar” on generic AI-written content. But it’s hoping to retain those who want the assurance of a human touch, while it also trains some of its writers to become more productive by employing AI tools themselves. He said a recent survey of the company’s customers found that 30 to 40 percent still want exclusively “manual” content, while a similar-size chunk is looking for content that might be AI-generated but human-edited to check for tone, errors and plagiarism.

“I don’t think anyone should trust 100 percent what comes out of the machine,” Mebus said.

Levin said Semrush’s clients have also generally found that AI is better used as a writing assistant than a sole author. “We’ve seen people who even try to fully automate the content creation process,” he said. “I don’t think they’ve had really good results with that. At this stage, you need to have a human in the loop.”

For Cowell, whose book title appears to have inspired an AI-written copycat, the experience has dampened his enthusiasm for writing.

“My concern is less that I’m losing sales to fake books, and more that this low-quality, low-priced, low-effort writing is going to have a chilling effect on humans considering writing niche technical books in the future,” he said. It doesn’t help, he added, knowing that “any text I write will inevitably be fed into an AI system that will generate even more competition.”

Amazon removed the impostor book, along with numerous others by the same publisher, after The Post contacted the company for comment. Spokesperson Lindsay Hamilton said Amazon doesn’t comment on individual accounts and declined to say why the listings were taken down. AI-written books aren’t against Amazon’s rules, per se, and some authors have been open about using ChatGPT to write books sold on the site. (Amazon founder and executive chairman Jeff Bezos owns The Washington Post.)

“Amazon is constantly evaluating emerging technologies and innovating to provide a trustworthy shopping experience for our customers,” Hamilton said in a statement. She added that all books must adhere to Amazon’s content guidelines, and that the company has policies against fake reviews or other forms of abuse.

Correction

A previous version of this story misidentified the job title of Eugene Levin. He is Semrush’s president and chief strategy officer, not its CEO.