Mode collapse

Mode collapse occurs when a generative model produces outputs that are less diverse than expected, effectively "collapsing" to generate only a few modes of the data distribution while ignoring others.

This phenomenon undermines the goal of generative models to capture the full diversity of the training data.
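The idea can be made concrete with a toy diversity check. The sketch below is a hypothetical illustration (the mode labels and the collapsed generator are made up): the true data distribution has ten modes, but the generator only ever emits two of them, so its mode coverage is low.

```python
import collections
import random

random.seed(0)

# Hypothetical collapsed generator: the true data distribution has
# 10 modes (labeled 0-9), but this generator only emits modes 3 and 7.
def collapsed_generator():
    return random.choice([3, 7])

samples = [collapsed_generator() for _ in range(1000)]
counts = collections.Counter(samples)

# Mode coverage: fraction of the 10 true modes present in the samples.
coverage = len(counts) / 10
print(f"modes covered: {sorted(counts)} -> coverage = {coverage:.0%}")
```

A well-behaved generator would cover all ten modes in rough proportion to the data; here coverage stalls at 20% no matter how many samples are drawn.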

[1][2] Several GAN-specific strategies have been developed to mitigate mode collapse.[3]

Mode collapse also occurs in large language models, which are usually trained in two steps.

In the first step ("pretraining"), the model is trained simply to generate text resembling samples from a large dataset.

In the second step ("finetuning"), the model is further trained to follow instructions or produce outputs preferred by humans. More finetuning typically results in higher average task performance, but less diverse outputs.
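The diversity loss from finetuning can be quantified with a simple lexical metric. The sketch below is a hypothetical illustration (the sample outputs are invented) using a distinct-1 score, the fraction of unique tokens among all generated tokens, as a rough proxy for output diversity.

```python
# Distinct-1 score: unique tokens divided by total tokens across a set
# of generated outputs. Lower values indicate more repetitive output.
def distinct_1(outputs):
    tokens = [tok for text in outputs for tok in text.split()]
    return len(set(tokens)) / len(tokens)

# Invented samples: a heavily finetuned model that reuses one phrasing ...
finetuned = ["the answer is yes", "the answer is yes", "the answer is no"]
# ... versus a pretrained model whose samples vary more.
pretrained = ["probably yes", "it depends on context", "no idea honestly"]

print(distinct_1(finetuned))   # lower: repeated stock phrasing
print(distinct_1(pretrained))  # higher: more varied vocabulary
```

In practice, diversity is measured on many samples per prompt, and metrics such as distinct-n or self-BLEU are used alongside task accuracy to expose this tradeoff.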