The Style Generative Adversarial Network, or StyleGAN for short, is an extension to the GAN architecture introduced by Nvidia researchers in December 2018,[1] and made source available in February 2019.[2][3]
StyleGAN depends on Nvidia's CUDA software, GPUs, and Google's TensorFlow,[4] or Meta AI's PyTorch, which supersedes TensorFlow as the official implementation library in later StyleGAN versions.[6][7]
Nvidia introduced StyleGAN3, described as an "alias-free" version, on June 23, 2021, and made source available on October 12, 2021.[8]
A direct predecessor of the StyleGAN series is the Progressive GAN, published in 2017.[9]
In December 2018, Nvidia researchers distributed a preprint with accompanying software introducing StyleGAN, a GAN for producing an unlimited number of (often convincing) portraits of fake human faces.
In February 2019, Uber engineer Phillip Wang used the software to create the website This Person Does Not Exist, which displayed a new face on each web page reload.[13]
The collection was made using a private dataset shot in a controlled environment with similar light and angles.[14]
Similarly, two faculty at the University of Washington's Information School used StyleGAN to create Which Face is Real?, which challenged visitors to differentiate between a fake and a real face shown side by side.[11]
The faculty stated the intention was to "educate the public" about the existence of this technology so they could be wary of it, "just like eventually most people were made aware that you can Photoshop an image".[6][7]
In 2021, a third version was released, improving consistency between fine and coarse details in the generator.[16]
In December 2019, Facebook took down a network of accounts with false identities, and mentioned that some of them had used profile pictures created with machine learning techniques.
To avoid discontinuity between stages of the GAN game, each new layer is "blended in" (Figure 2 of the paper[9]).
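The fade-in can be sketched as a linear crossfade between the upsampled output of the previous stage and the output of the newly added layer. This is a minimal sketch assuming a simple linear schedule; the helper name and toy shapes are illustrative, not taken from the released code:

```python
import numpy as np

def blend_in(prev_upsampled, new_output, alpha):
    """Crossfade between the previous stage's upsampled output and the
    newly added layer's output. alpha ramps from 0 to 1 over training,
    so the new layer is introduced gradually rather than abruptly."""
    return (1.0 - alpha) * prev_upsampled + alpha * new_output

# Toy 4x4 "images": at alpha=0 only the old path contributes,
# at alpha=1 only the new layer does.
old = np.zeros((4, 4))
new = np.ones((4, 4))
mid = blend_in(old, new, 0.5)
```

Because the blend is linear in `alpha`, the network's output changes continuously as the schedule advances, which is what prevents a discontinuity when a new resolution stage is added.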
StyleGAN2 makes two main changes. One, it applies the style latent vector to transform the convolution layer's weights instead, thus solving the "blob" problem.[19]
Roughly speaking, the "blob" problem arises because using the style latent vector to normalize the generated image destroys useful information.
Two, it uses residual connections, which help it avoid the phenomenon where certain features become stuck at fixed intervals of pixels.
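The first change, often described as weight modulation and demodulation, can be sketched as follows. The function name and array shapes are illustrative assumptions, not the official implementation:

```python
import numpy as np

def modulate_demodulate(weights, style, eps=1e-8):
    """Apply a style vector to conv weights instead of normalizing activations.
    weights: (out_ch, in_ch, kh, kw); style: (in_ch,) per-channel scales."""
    # Modulate: scale each input channel of the weights by the style.
    w = weights * style[None, :, None, None]
    # Demodulate: rescale each output filter to unit L2 norm, restoring the
    # statistics that the removed normalization layer used to enforce --
    # without touching the generated image itself.
    norms = np.sqrt((w ** 2).sum(axis=(1, 2, 3), keepdims=True) + eps)
    return w / norms

rng = np.random.default_rng(0)
w = modulate_demodulate(rng.normal(size=(8, 4, 3, 3)), rng.normal(size=4))
```

Because the style only rescales weights, no per-image normalization of the activations is needed, which is what removes the information-destroying step blamed for the "blob" artifacts.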
This was updated by StyleGAN2-ADA ("ADA" stands for "adaptive"),[20] which uses invertible data augmentation.
It also tunes the amount of data augmentation applied by starting at zero and gradually increasing it until an "overfitting heuristic" reaches a target level, hence the name "adaptive".
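The adaptive control loop can be sketched as below. This is a simplified sketch: the names, target value, and fixed step size are illustrative, and the actual controller ties its heuristic (e.g. the sign of the discriminator's outputs on real images) and step size to training statistics and batch size:

```python
def update_augment_p(p, heuristic, target=0.6, step=0.01):
    """Raise the augmentation probability while the overfitting heuristic
    is above its target, lower it otherwise; clamp p to [0, 1]."""
    p = p + step if heuristic > target else p - step
    return min(max(p, 0.0), 1.0)

# Starting from p = 0, sustained overfitting (heuristic above target)
# drives the augmentation probability upward step by step.
p = 0.0
for _ in range(10):
    p = update_augment_p(p, heuristic=0.9)
```

The feedback loop means the strength of augmentation tracks how much the discriminator is overfitting, rather than being a fixed hyperparameter.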
The StyleGAN3 authors analyzed the texture-sticking problem through the Nyquist–Shannon sampling theorem, and argued that the layers in the generator learned to exploit the high-frequency signal in the pixels they operate upon.[22]
To solve this, they proposed imposing strict lowpass filters between each of the generator's layers, so that the generator is forced to operate on the pixels in a way faithful to the continuous signals they represent, rather than as merely discrete signals.
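The idea can be sketched in one dimension: upsampling by zero insertion creates spurious high-frequency copies of the signal's spectrum, and a lowpass filter suppresses them so the result again looks like samples of a smooth continuous signal. This is a toy sketch with a crude binomial kernel; StyleGAN3 designs its filters far more carefully, in 2D:

```python
import numpy as np

def upsample_lowpass(x, kernel=(0.25, 0.5, 0.25)):
    """2x upsample a 1D signal by zero insertion, then lowpass filter.
    The gain of 2 compensates for the inserted zeros."""
    up = np.zeros(2 * len(x))
    up[::2] = x
    k = 2.0 * np.asarray(kernel)
    return np.convolve(up, k, mode="same")

# A constant signal stays constant away from the boundary, instead of
# alternating between its values and the inserted zeros.
y = upsample_lowpass(np.ones(4))
```

Without the filter, the zero-stuffed signal oscillates at the highest representable frequency; with it, the layer sees a smooth signal, which is the behavior the strict lowpass constraint is meant to enforce throughout the generator.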
The resulting StyleGAN3 is able to generate images that rotate and translate smoothly and without texture sticking.