Wikisonnet Composes Poetry With Your Wikipedia Contributions

The 2015 Nobel literature laureate Svetlana Alexievich of Belarus receives the award from King Carl Gustaf of Sweden. Now just imagine a computer in her place. JONAS EKSTROMER/AFP/Getty Images

If you’re like me, when the other kids in high school were experimenting with illegal drugs, you were experimenting with poetry. It’s embarrassing, but it’s true. I spent most of one semester revising a villanelle. To make matters worse, I never wrote a proper sonnet, because I just couldn’t think in terms of iambic pentameter.

Anyone who covered Shakespeare during their school years has learned about iambic pentameter. It is a rhythm of alternating unstressed and stressed syllables favored by the Bard and poets of his time. Today, it is a concept students memorize for the test and forget forever, but a proper sonnet is written entirely in that rhythm. I couldn’t do it.

In a phone call with the Observer, Cassie Tarakajian said that it’s tremendously difficult to cram your words into that da-DUM da-DUM pattern “when you sit down as a human to write an Elizabethan sonnet.” Nevertheless, she and her collaborators, Ana Giraldo-Wingler and Sam Tarakajian, built a website that constructs sonnets from sentences or parts of sentences found on Wikipedia that happen to have been written in iambic pentameter. It scans the encyclopedia and uses natural language processing to identify ten-syllable strings of words that fit the scheme.
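To make the idea concrete, here is a minimal Python sketch of how a ten-syllable string might be tested against the da-DUM pattern. It is not Wikisonnet's code: the tiny hand-written stress table (`STRESS`) and both function names are assumptions made for illustration, standing in for whatever pronunciation data the team actually pulled in through its libraries.

```python
# Hypothetical stress table for illustration only:
# 0 = unstressed, 1 = stressed, None = could go either way
# (one-syllable function words are flexible in practice).
STRESS = {
    "consumers": [0, 1, 0],
    "wanting": [1, 0],
    "sweetened": [1, 0],
    "yogurt": [1, 0],
    "are": [None],
}


def syllable_stresses(line):
    """Return the line's stress pattern, or None if any word is unknown."""
    stresses = []
    for word in line.lower().split():
        word = word.strip(".,;:!?\"'()")
        if word not in STRESS:
            return None
        stresses.extend(STRESS[word])
    return stresses


def is_iambic_pentameter(line):
    """True if the line has ten syllables alternating unstressed/stressed."""
    stresses = syllable_stresses(line)
    if stresses is None or len(stresses) != 10:
        return False
    # Even positions should be unstressed (0), odd positions stressed (1).
    return all(s is None or s == i % 2 for i, s in enumerate(stresses))


print(is_iambic_pentameter("Consumers wanting sweetened yogurt are"))
print(is_iambic_pentameter("Sweetened yogurt consumers wanting are"))
```

The first call checks a line from the “Yogurt” sonnet. The second shuffles the same words so that the stresses land in the wrong places. A real system would need a full pronouncing dictionary and some tolerance for words with more than one accepted stress pattern.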

“Yogurt” by Wikisonnet, tweeted from @Wikison.

“It’s crazy that so many of these authors unintentionally compose it,” Giraldo-Wingler said.

(Unless there are any Wikipedia contributors who intentionally write in iambic pentameter, and then we definitely want to hear from you.)

Their project is called Wikisonnet. As of this writing, it has written 821 sonnets. One of its most recent, “Yogurt,” was written after this reporter, inspired by the decade-long battle over how to spell the word on its main Wikipedia page, tweeted at its Twitter account to request a poem on the topic. As computer-generated sonnets go, it’s tough to beat.

“It tries to write a whole poem using the lines that are on that page, first,” Sam explained. “I don’t think there’s a single page for which that is actually possible.”

So, when it has exhausted that first page, the software looks for lines in iambic pentameter on related pages. Natural language processing also helped to fit pages into similar categories. This thematic connection jumps out at visitors to the website, because hovering over any particular line reveals context for that passage, showing which Wikipedia page it originally appears on.
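That fallback can be sketched in a few lines of Python. This is a guess at the general shape, not the team's implementation: the breadth-first walk, the `gather_lines` name, the `pages` and `related` structures and the placeholder lines are all assumptions, and the sketch ignores rhyme entirely. Each line is stored with the title of the page it came from, which is the kind of record the hover feature would need.

```python
from collections import deque


def gather_lines(start, pages, related, needed=14):
    """Collect (line, source_page) pairs, starting page first, then related pages."""
    found = []
    seen = {start}
    queue = deque([start])
    while queue and len(found) < needed:
        title = queue.popleft()
        for line in pages.get(title, []):
            found.append((line, title))
            if len(found) == needed:
                break
        # Queue up related pages we have not visited yet.
        for neighbor in related.get(title, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return found


# Placeholder data: the strings stand in for iambic lines found on each page.
pages = {
    "Captain America": ["line A1", "line A2"],
    "Manhunter": ["line M1"],
    "Sun Girl": ["line S1", "line S2"],
}
related = {
    "Captain America": ["Manhunter", "Sun Girl"],
    "Manhunter": ["Captain America"],
}

for line, source in gather_lines("Captain America", pages, related, needed=4):
    print(f"{line}  (from {source})")
```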

For the 14 lines based on the superhero “Captain America,” for example, the software found material on the pages of the superheroes Manhunter, Sun Girl and Human Torch.

Open source libraries in Python helped them solve many of the project’s other problems. The Natural Language Toolkit (NLTK) was the workhorse, handling tasks like finding rhymes and stressed syllables. Other tools they used included TextBlob, Pattern and mwparserfromhell.

One thing readers may notice is that, from line to line, the poems make more sense than one might expect from chunks of sentences that were never meant to go together and are joined only by a rhyme scheme. There is a reason.

“We made up this rule that has no basis in fact,” Sam explained. Basically, the system looks at the next word following any passage it selects and identifies that next word’s part of speech. Then it tries to choose a chunk for the next line that begins with a word that is the same part of speech as that unseen word.

Using the first two lines of the second stanza of “Yogurt,” here’s an example:

Consumers wanting sweetened yogurt are
arisen to provide complete protein

On the page where that first line originally appeared, the next word is “advised,” a verb, like “arisen,” which begins the following line. It works surprisingly well, especially since modern readers don’t expect poetry to entirely make sense anyway.
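The rule is simple enough to sketch in Python. What follows is an illustration under stated assumptions rather than the project's code: the small `POS` lookup table stands in for a real part-of-speech tagger, the function names are hypothetical, and the first candidate line is a made-up placeholder.

```python
# Hypothetical part-of-speech lookup, standing in for a real tagger.
POS = {
    "advised": "VERB",
    "arisen": "VERB",
    "the": "DET",
}


def pos_of(word):
    """Look up a word's part of speech, or None if it is not in the table."""
    return POS.get(word.lower())


def pick_following_line(unseen_next_word, candidates):
    """Pick the first candidate whose opening word shares the part of speech
    of the word that originally followed the previous line."""
    wanted = pos_of(unseen_next_word)
    if wanted is None:
        return None
    for line in candidates:
        words = line.split()
        if words and pos_of(words[0]) == wanted:
            return line
    return None


candidates = [
    "the placeholder line about something else",  # made up for illustration
    "arisen to provide complete protein",
]
print(pick_following_line("advised", candidates))
```

Here “advised” is the word the reader never sees, and the sketch skips the candidate that opens with a determiner in favor of the one that opens with a verb.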

The Girlfriends team, Cassie and Sam Tarakajian and Ana Giraldo-Wingler. Courtesy of Girlfriends

The tension between art and experiment was a conversation topic for the team. “I remember there was this question of whether we were creating legitimate art or was it a gimmicky funny thing,” Giraldo-Wingler said.

Some of the poems are artistic. Some are funny. “And sometimes they can be bad,” Cassie said.

“The idea is to further some continually evolving conversation about what poetry is,” Sam concluded.

There are a lot of projects out there now where computers write, as we’ve previously reported. Wikisonnet stands out because “you can look and see where the sentence came from,” Cassie said. It isn’t writing the lines. It’s writing by arranging excerpts.

We met the three creators at Internet Yami-Ichi, in Queens, which we have visited for the last two years. At that event, they were selling copies of poems from the project handwritten on paper. “People are excited by the junction of human creativity and computer generated things,” Giraldo-Wingler said.

The brother-sister-and-friend team has formed a new company, called Girlfriends, aimed at collaborative work for art and for clients. When we asked about the bot’s chances for enduring literary fame, they were pretty skeptical, but then it hadn’t composed the “Yogurt” poem yet. Just wait till that gets in front of the King of Sweden.
