On 6 April 2022, OpenAI announced DALL-E 2, a successor designed to generate more realistic images at higher resolutions that "can combine concepts, attributes, and styles".
[6] On 20 July 2022, DALL-E 2 entered a beta phase, with invitations sent to 1 million waitlisted individuals;[7] users could generate a certain number of images for free each month and could purchase more.
[11] In early November 2022, OpenAI released DALL-E 2 as an API, allowing developers to integrate the model into their own applications.
[12] In September 2023, OpenAI announced their latest image model, DALL-E 3, capable of understanding "significantly more nuance and detail" than previous iterations.
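As an illustration of the API release mentioned above, the following minimal sketch requests an image from the DALL-E 2 endpoint through the current openai Python SDK. The prompt, size, and image count are illustrative assumptions, the SDK surface has changed since the 2022 launch, and the key is read from the OPENAI_API_KEY environment variable.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Request one 512x512 image for an illustrative prompt.
response = client.images.generate(
    model="dall-e-2",
    prompt="an armchair in the shape of an avocado",
    n=1,
    size="512x512",
)
print(response.data[0].url)  # URL of the generated image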
[23] CLIP (Contrastive Language-Image Pre-training) is a separate model based on contrastive learning, trained on 400 million image-caption pairs scraped from the Internet.
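For illustration, the sketch below implements a CLIP-style symmetric contrastive (InfoNCE) objective over a toy batch of paired embeddings: matching image-caption pairs on the diagonal are pulled together, all other pairings in the batch are pushed apart. The batch size, embedding dimension, and temperature here are illustrative assumptions, not CLIP's actual training configuration, and the random vectors stand in for the outputs of CLIP's deep image and text encoders.

import numpy as np

def clip_style_loss(img_emb, txt_emb, temperature=0.07):
    # L2-normalise so dot products become cosine similarities.
    img_emb = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt_emb = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    # logits[i, j] = similarity between image i and caption j.
    logits = img_emb @ txt_emb.T / temperature

    def cross_entropy(l):
        # Softmax cross-entropy with the matching pair (the diagonal)
        # as the correct class; max-subtraction for numerical stability.
        l = l - l.max(axis=1, keepdims=True)
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.diag(log_probs).mean()

    # Symmetric loss: classify captions given images and vice versa.
    return (cross_entropy(logits) + cross_entropy(logits.T)) / 2

# Toy batch: 4 image/caption pairs with 8-dimensional embeddings.
rng = np.random.default_rng(0)
print(clip_style_loss(rng.normal(size=(4, 8)), rng.normal(size=(4, 8))))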
[citation needed] DALL-E can produce images for a wide variety of arbitrary descriptions from various viewpoints[28] with only rare failures.
[15] Mark Riedl, an associate professor at the Georgia Tech School of Interactive Computing, found that DALL-E could blend concepts (described as a key element of human creativity).
DALL-E 2's "inpainting" and "outpainting" use context from an image to fill in missing areas using a medium consistent with the original, following a given prompt.
[38] DALL-E 2's reliance on public datasets influences its results and leads to algorithmic bias in some cases, such as generating higher numbers of men than women for requests that do not mention gender.
[38] DALL-E 2's training data was filtered to remove violent and sexual imagery, but this filtering was found to increase bias in some cases, such as reducing the frequency with which women were generated.
[45][44] Another concern about DALL-E 2 and similar models is that they could cause technological unemployment for artists, photographers, and graphic designers due to their accuracy and popularity.
[12] In 2023, Microsoft pitched the United States Department of Defense on using DALL-E models to train battlefield management systems.
DALL-E's output for "an illustration of a baby daikon radish in a tutu walking a dog" was mentioned in pieces from Input,[50] NBC,[51] Nature,[52] and other publications.
[23][30] ExtremeTech stated "you can ask DALL-E for a picture of a phone or vacuum cleaner from a specified period of time, and it understands how those objects have changed".
[27] According to MIT Technology Review, one of OpenAI's objectives was to "give language models a better grasp of the everyday concepts that humans use to make sense of things".
[23] Wall Street investors have received DALL-E 2 positively, with some firms suggesting it could represent a turning point for a future multi-trillion-dollar industry.
AI-driven image generation tools have been heavily criticized by artists because they are trained on human-made art scraped from the web.
In the first days after launch, filtering was reportedly tightened to the point where images generated from some of Bing's own suggested prompts were blocked.