This article is syndicated from the Substack newsletter Big Technology.
Dall-E’s power becomes evident within the first minute of seeing it. The AI program creates intricate, original images when you feed it short text prompts. Its only limit is your imagination. On Wednesday, I watched live as an employee of OpenAI—which created it—asked Dall-E to draw a “Rabbit prison warden, digital art,” and, within twenty seconds, it produced ten new illustrations. All were professional grade.
At first, Dall-E inspires awe, then reverence kicks in. Still in research mode, the program is expanding fast. OpenAI is granting access to up to 1,000 new users each week, and Dall-E’s drawn 3 million images since April. In our modern, visual culture, there’s little doubt this technology—or some variation—will go mainstream. And soon, every internet user will likely have the capacity to share ideas, or shape perceptions, in profound new ways by using it. We’re just starting to grasp the implications.
“Dall-E 2 right now is in a research preview mode,” Lama Ahmad, a policy researcher at OpenAI, told me. “To understand—What are the use cases? What kind of interest is there? What kind of beneficial use cases are there? And then, at the same time, learning about the safety risks.”
How will Dall-E be used?
Dall-E (officially named Dall-E 2, for its second iteration) is unlikely to replace professional illustrators, but its amateur uses are more intriguing. The demand for quality art exceeds illustrators’ ability to deliver it, and Dall-E can fill the gap. OpenAI already uses Dall-E to illustrate PowerPoints, and countless web articles that use stock images are good candidates for it as well. Memes, fan art, and marketing materials could also use Dall-E. Start dreaming up possibilities, and it isn’t easy to stop.
After taking suggestions on social media this week, I worked with OpenAI to have Dall-E draw several astounding illustrations. They included a town square in the lost city of Atlantis, Ikea instructions for the iPhone, and a barren landscape with tree branches growing golden pocket watches. Dall-E uses artificial intelligence that understands images and their text descriptions, and the relationship between objects, like the fact that a human can sit on a chair. Using this knowledge, it can produce each illustration from a single string of text.
If Dall-E-style images become ubiquitous, whoever controls the technology will be in a pretty influential position. Steering Dall-E’s results could shape perceptions in a society where those results are everywhere. OpenAI is taking this responsibility seriously, as evidenced by its slow Dall-E rollout and its careful content policy. But the product will still reflect its values, and that’s where things get fascinating.
Does Dall-E contain biases?
Dall-E delivers ten images for each request, and when you see results that contain sensitive or biased content, you can flag them to OpenAI for review. The question then becomes whether OpenAI wants Dall-E’s results to reflect society’s approximate reality or some idealized version. If an occupation is majority male or female, for instance, and you ask Dall-E to illustrate someone doing that job, the results can either reflect the actual proportion in society, or some even split between genders. They can also account for race, weight, and other factors. So far, OpenAI is still researching how exactly to structure these results. But as it learns, it knows it has choices to make.
“Bias is a really complicated and tricky question. And no matter what decisions we make, we are making a decision about how we present the world,” said Ahmad. “Our model doesn’t claim at any point to represent the real world,” she said, adding that one goal is to teach people how to search with precision. So, if someone wants images of a Muslim woman CEO, she said, they can type that in, instead of asking for generic CEO images. Right now, Dall-E tends to draw CEOs as males.
Having AI change its representations of the world may sound appealing to some, but it’s a fraught topic without straightforward answers. “When we tinker with this, we’re messing with the reflection of reality, and that’s either good or bad, depending on how well it’s done and who’s doing it,” Oren Etzioni, CEO of the Allen Institute for Artificial Intelligence, told me. “But make no mistake, changing these things is not universally good. It really depends on the motivations of the changer.”
Is Dall-E prone to abuse?
Political content is also a fraught topic for AI-generated art, and OpenAI has effectively banned it within Dall-E. Wall Street Journal reporter Jeff Horwitz asked me to put some horrible political requests to the OpenAI team. Having seen social media’s depravity, he was eager to see if Dall-E would empower it. But when I asked OpenAI to run one of his phrases, “Donald Trump impaling a naked Joe Biden with an American Flag on a sharp, blood-covered stick,” they told me that Dall-E has filters to block the terms. Asked to type it in to trigger the filters, the team refused. OpenAI might ban users simply for generating images that violate the terms of service, Ahmad said, even if they don’t share them.
OpenAI’s caution is welcome. Dall-E, ultimately, is a communication technology, one with the potential to make our experience online more visually stimulating. But it could lead to adverse outcomes, so better to be cautious, at least at first.
Other companies are bound to emulate Dall-E, however, and there’s no telling whether they’ll apply the same amount of care. Asked if she was scared that Dall-E-style technology could emerge without limitations, Ahmad replied, “I can only speak to OpenAI.”