This story is syndicated from the Substack newsletter Big Technology; subscribe for free here.
Recently, a new Substack called The Rationalist lifted analysis and writing directly from Big Technology. Its plagiarized post on the ‘Creator Economy’—which we’d covered days prior—went viral, hitting the front page of Hacker News and sparking a conversation with more than 80 comments. It would’ve been a terrific debut for any publication, had it been authentic.
What made the case of The Rationalist particularly striking, though, was that its author, an avatar by the name of “PETRA,” admitted they’d used AI tools to produce the story, including those from OpenAI, Jasper, and Hugging Face. The speed at which they were able to copy, remix, publish, and distribute their inauthentic story was impressive. It outpaced the platforms’ ability, and perhaps willingness, to stop it, signaling that generative AI’s darker side will be difficult to tame.
“It’s really hard to predict all the maleficent uses,” said Giada Pistilli, principal ethicist at Hugging Face. “We try to anticipate all the risks, but it’s always super hard.”
The Rationalist is an odd publication. It has no mission statement and no named authors other than PETRA. It had been live for only a week, and within two days of launching it was lifting passages directly from Big Technology.
Here, for instance, is a Big Technology clause from last week’s story:
With the days of zero-interest-rate froth ending, the investments are becoming more difficult to justify.
And here’s The Rationalist, two days later:
With the end of zero-interest-rate froth, these investments are becoming more difficult to justify.
Here’s another clause from Big Technology:
Online content creation is still mostly viable for the very top echelon of online creators
And here again The Rationalist, two days later:
Only the top echelon of creators are able to make a viable income
A flashy headline—“The creator economy: the top 1% and everyone else”—helped propel The Rationalist’s story to the Hacker News front page, a position typically worth thousands of views. The core of the story, lifted from Big Technology, was good enough to spark a discussion.
Yet as Hacker News users read through, they noticed something was off. “The whole article feels to me like it’s generated by GPT-3 based on a few prompts,” wrote one user. “This wasn’t written by a person,” said another. Then PETRA confessed. “If you are from hacker news, here are the tools I used to improve the readability,” they said before listing OpenAI, Jasper, and Hugging Face. The tools enable AI writing and likely helped remix the original article. PETRA, who did not respond to a request for comment, didn’t mention that the content originated with another publication.
As the story circulated, the tech platforms assisting The Rationalist stood still. OpenAI shared a generic statement that included the line, “Our policies require that users be up-front with their audience.” Hugging Face admitted it had no way of finding the offending user, though Pistilli seemed grateful to be alerted. And Substack promised to investigate.
Substack said it has a policy against plagiarism, which Merriam-Webster defines as “to steal and pass off (the ideas or words of another) as one’s own.” Yet while this case fits the definition, Substack decided to let The Rationalist’s post stand. “At this time we’re unable to conclude with certainty that the post violates our plagiarism policy,” said Substack spokesperson Helen Tobin.
Given The Rationalist’s success, more advanced efforts to copy and remix others’ work with AI will likely follow. The method should be easy to improve: The Rationalist was sloppy, lifting clauses word for word, but publications with similar intent could refine their systems to remove all traces of the original writing and pass along only the ideas. It shouldn’t be hard to automate either.
Imagine AI remixing the Financial Times’ ten most-read stories of the day—or The Information’s VC coverage—and making the reporting available sans paywall. AI is already writing non-plagiarized stories for publications like CNET. At a certain point, some publishers will cut corners.
There’s no quick technological fix to these issues. As has been the case for nearly all instances of bad information spreading online, readers and editors will again have to figure this out themselves. “Our competitors rip us off all the time, essentially remixing stuff and sharing,” said The Information CEO Jessica Lessin. “The Information subscribers are smart to get it from the source. But I am watching all this with fascination of course.”