DreamBooth

Pretrained text-to-image diffusion models, while often capable of offering a diverse range of image output types, lack the specificity required to generate images of lesser-known subjects, and are limited in their ability to render known subjects in new situations and contexts.[1][2][3]

The methodology used in implementations of DreamBooth involves fine-tuning the full UNet component of the diffusion model using a few images (usually 3–5) depicting a specific subject.[1]

Each training image is paired with a text prompt containing a unique identifier and the name of the subject's class (as an example, a photograph of a [Nissan R34 GTR] car, with car being the class); a class-specific prior preservation loss is applied in parallel to encourage the model to keep generating diverse instances of the broader class, based on what it was already trained on for the original class.
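To make the two-part objective concrete, the following is a minimal, illustrative sketch of the combined instance and prior-preservation loss in PyTorch. It is not code from the paper or from any particular implementation; the tensors stand in for UNet noise predictions, the batch layout and the `prior_loss_weight` value are assumptions for illustration.

```python
# Hypothetical sketch of DreamBooth's training objective: a denoising loss on
# the subject ("instance") images plus a weighted prior-preservation loss on
# images the frozen model generated for the bare class prompt.
import torch
import torch.nn.functional as F

def dreambooth_loss(noise_pred: torch.Tensor,
                    noise: torch.Tensor,
                    prior_loss_weight: float = 1.0) -> torch.Tensor:
    """Combined instance + class-prior denoising loss.

    Assumes the batch concatenates instance examples (the 3-5 subject photos)
    with class examples (images generated from a prompt like "a photo of a
    car"), in equal halves.
    """
    # Split the batch back into its instance and prior halves.
    pred_instance, pred_prior = torch.chunk(noise_pred, 2, dim=0)
    target_instance, target_prior = torch.chunk(noise, 2, dim=0)

    # Standard denoising loss on the subject images: ties the unique
    # identifier token to the subject's appearance.
    instance_loss = F.mse_loss(pred_instance, target_instance)

    # Prior-preservation loss on model-generated class images: discourages
    # the fine-tuned model from collapsing every member of the class onto
    # the new subject.
    prior_loss = F.mse_loss(pred_prior, target_prior)

    return instance_loss + prior_loss_weight * prior_loss

# Toy usage with random tensors standing in for real UNet outputs.
if __name__ == "__main__":
    pred = torch.randn(4, 4, 64, 64)    # predicted noise (2 instance + 2 prior latents)
    target = torch.randn(4, 4, 64, 64)  # noise actually added by the scheduler
    print(dreambooth_loss(pred, target).item())
```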

The Stable Diffusion adaptation of DreamBooth in particular is released as a free and open-source project based on the technology outlined in the original paper published by Ruiz et al.[4]
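As one illustration of how such an open-source adaptation is consumed, a checkpoint produced by DreamBooth fine-tuning can be loaded and prompted with Hugging Face's diffusers library. The checkpoint directory and the identifier token "sks" below are placeholders, not names from the article.

```python
# Hedged sketch: generating images from a hypothetical DreamBooth-fine-tuned
# Stable Diffusion checkpoint using the diffusers library.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "./dreambooth-output",   # placeholder path to a fine-tuned checkpoint
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# The prompt reuses the unique-identifier-plus-class phrasing from training,
# placing the learned subject in a novel context.
image = pipe("a photo of a sks car on a mountain road").images[0]
image.save("subject_in_new_context.png")
```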

In addition, artists have expressed apprehension about the ethics of using DreamBooth to train model checkpoints aimed specifically at imitating the art styles of individual human artists. One such critic is Hollie Mengert, an illustrator for Disney and Penguin Random House, whose art style was trained into a checkpoint model via DreamBooth and shared online without her consent.[6]

Image caption: Demonstration of the use of DreamBooth to fine-tune the Stable Diffusion v1.5 diffusion model, using training data obtained from Category:Jimmy Wales on Wikimedia Commons. Depicted are algorithmically generated images of Jimmy Wales, co-founder of Wikipedia, performing bench press exercises at a fitness gym.