Earlier this month, Ben Affleck made headlines while discussing why he believes A.I. lacks the “taste” to pose a threat to art forms like film and poetry. “A.I. can write you excellent imitative verse that sounds Elizabethan; it cannot write you Shakespeare,” said the actor while speaking at CNBC’s 2024 Delivering Alpha.
A new study from researchers at the University of Pittsburgh begs to differ. The report found that participants were not only unable to differentiate between A.I.-generated and human-authored poems, but actually preferred those created by A.I. “As A.I.-generated text continues to evolve, distinguishing it from human-authored content has become increasingly difficult,” wrote the study’s authors, Brian Porter and Edouard Machery.
After gathering work from ten well-known poets, including William Shakespeare, Walt Whitman, Emily Dickinson and Allen Ginsberg, the researchers used OpenAI’s ChatGPT 3.5 to generate poems “in the style of” each writer. They then randomly presented five A.I.-generated and five authentic works to more than 1,600 participants and asked them to judge whether each work was created by a human or a large language model (LLM).
Participants were not only largely unable to correctly identify A.I.-generated works, but were more likely to judge them as human-authored than those penned by real-life poets. The five poems with the lowest “human” ratings were all written by humans, while four out of the five poems with the highest “human” ratings were created by A.I.
Why were A.I.-generated poems rated more favorably?
The study also asked more than 600 participants to rate the works on more than a dozen qualitative dimensions, ranging from rhythm to originality and wittiness. A.I. verses were rated more favorably overall, although ratings went down when participants were told ahead of time that poems were A.I.-generated. “Non-poetry readers prefer the more accessible A.I.-generated poetry, which communicate emotions, ideas and themes in more direct and easy-to-understand language, but expect A.I.-generated poetry to be worse; they therefore mistakenly interpret their own preference for a poem as evidence that it is human-written,” according to Porter and Machery.
The authors noted that increasingly positive interpretations of A.I.-generated poems are a recent phenomenon spurred on by advances in LLMs. Previous research using poetry created with OpenAI’s GPT-2, for example, saw participants able to easily differentiate A.I. and human-authored works.
The impacts of A.I. developments aren’t limited to poetry. Recent studies have found that A.I.-generated paintings are increasingly rated higher than human-created ones, while A.I. jokes have been received as just as funny as, or even funnier than, human responses. Even faces produced by the new technology are becoming more and more difficult to tell apart from those of humans.
Now, similar progress is being witnessed with the written word. “These findings signal a leap forward in the power of generative A.I.,” wrote the study’s authors, adding that “poetry had previously been one of the few domains” in which A.I. models had not reached such levels of indistinguishability.