Mode Collapse

The generator in a GAN learns to produce only a small subset of outputs that fool the discriminator, ignoring most of the true data distribution. Instead of generating diverse faces, it generates the same three faces with minor variations — it has “collapsed” onto a few modes of the distribution.

Imagine a counterfeiter and a detective. The counterfeiter discovers that a specific type of $20 bill fools the detective reliably. Why would they bother learning to forge $50s and $100s? The counterfeiter’s incentive is to minimise detection, not to cover all denominations. Once they find a reliable trick, they exploit it.

That’s mode collapse. The generator’s loss only measures “does the discriminator think this is real?” — it doesn’t measure “are you covering the full diversity of real data?” If producing faces of 30-year-old white women fools the discriminator, the generator has no reason to also produce faces of elderly men or children. The discriminator will eventually catch on and reject those specific faces, but the generator just shifts to a different narrow subset rather than broadening its coverage.
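A toy numpy sketch (illustrative only, not from any particular GAN implementation) makes the incentive concrete: against a frozen discriminator that rates all three data modes as realistic, a generator that collapses onto the single best-scoring mode achieves a *lower* loss than one that covers all three.

```python
import numpy as np

def discriminator(x):
    # Frozen toy discriminator: rates samples near any of the three data
    # modes (-2, 0, +2) as likely real, slightly preferring the centre mode.
    score = (0.9 * np.exp(-(x - 0.0) ** 2)
             + 0.8 * np.exp(-(x + 2.0) ** 2)
             + 0.8 * np.exp(-(x - 2.0) ** 2))
    return np.clip(score, 1e-6, 1.0 - 1e-6)

def generator_loss(samples):
    # Non-saturating generator loss: -E[log D(G(z))]. It only scores
    # per-sample realism; nothing rewards covering all three modes.
    return -np.mean(np.log(discriminator(samples)))

rng = np.random.default_rng(0)
diverse = rng.choice([-2.0, 0.0, 2.0], size=10_000) + 0.05 * rng.standard_normal(10_000)
collapsed = 0.05 * rng.standard_normal(10_000)  # everything near the centre mode

# The collapsed generator achieves the lower (better) loss:
print(generator_loss(diverse), generator_loss(collapsed))
```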

The deeper issue is that the GAN objective has no explicit diversity term. The reverse KL divergence \(\text{KL}(p_g \| p_\text{data})\) that the vanilla GAN approximately minimises is mode-seeking: it prefers to place all probability mass on the highest-density modes of the target rather than spread it across all of them. This is the opposite of the mode-covering forward KL, \(\text{KL}(p_\text{data} \| p_g)\), that maximum-likelihood models such as VAEs approximately minimise.
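Writing out both directions of the divergence (standard definitions, with \(p_g\) the generator distribution) makes the asymmetry explicit:

```latex
% Reverse KL (mode-seeking): the expectation is over p_g, so regions where
% p_g(x) ~ 0 contribute nothing -- ignoring a data mode costs nothing.
\text{KL}(p_g \| p_\text{data})
  = \mathbb{E}_{x \sim p_g}\!\left[\log \frac{p_g(x)}{p_\text{data}(x)}\right]

% Forward KL (mode-covering): the expectation is over p_data, so any real
% x with p_g(x) ~ 0 makes the integrand blow up -- every data mode must
% receive mass.
\text{KL}(p_\text{data} \| p_g)
  = \mathbb{E}_{x \sim p_\text{data}}\!\left[\log \frac{p_\text{data}(x)}{p_g(x)}\right]
```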

  • Generated samples lack diversity — visually, outputs look nearly identical or cycle through a small set of templates
  • The discriminator’s loss oscillates rather than converging — it catches the current mode, the generator shifts to another narrow mode, repeat
  • FID/IS metrics plateau or worsen even as generator loss decreases — the generator is “winning” against the discriminator without actually improving
  • Interpolation in latent space produces no smooth variation — different z vectors map to nearly the same output
  • Histogram of generated features shows sharp peaks where data should be spread (e.g., generated digits are 80% “1”s and “7”s)
  • GANs (gans/): the defining failure mode — vanilla GAN and WGAN are both susceptible; Wasserstein distance helps by providing smoother gradients but doesn’t eliminate collapse entirely
  • Diffusion (diffusion/): mode collapse is largely why diffusion models replaced GANs for image generation — the denoising objective naturally covers all modes since every training example contributes a denoising target
  • Variational inference (variational-inference-vae/): the reverse phenomenon (posterior collapse) is structurally similar — the model finds a degenerate solution that satisfies the objective without using all its capacity
  • Contrastive learning (contrastive-self-supervising/): representation collapse is the self-supervised analogue — the encoder maps everything to the same point
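The histogram symptom above is straightforward to check programmatically. A minimal numpy sketch, assuming you already have per-sample class predictions for the generated images (e.g. from a pretrained digit classifier; the labels below are synthetic stand-ins):

```python
import numpy as np

def mode_coverage(labels, n_classes, threshold=0.01):
    # Fraction of classes receiving at least `threshold` of all samples.
    freq = np.bincount(labels, minlength=n_classes) / len(labels)
    return float(np.mean(freq >= threshold))

def label_entropy(labels, n_classes):
    # Shannon entropy of the label histogram, in nats (max = log n_classes).
    freq = np.bincount(labels, minlength=n_classes) / len(labels)
    freq = freq[freq > 0]
    return float(-np.sum(freq * np.log(freq)))

collapsed = np.concatenate([np.full(800, 1), np.full(200, 7)])  # 80% "1"s, 20% "7"s
healthy = np.random.default_rng(0).integers(0, 10, size=1000)

print(mode_coverage(collapsed, 10), label_entropy(collapsed, 10))  # 0.2, ~0.50
print(mode_coverage(healthy, 10), label_entropy(healthy, 10))      # 1.0, ~2.30
```

A sharp drop in either statistic over training is a cheap early-warning signal before FID degrades.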
| Solution | Mechanism | Where documented |
|---|---|---|
| WGAN / WGAN-GP | Wasserstein distance provides gradients even when supports don’t overlap | gans/ |
| Spectral normalisation | Constrains the discriminator’s Lipschitz constant, stabilising training dynamics | atomic-concepts/regularisation/spectral-normalisation.md |
| Minibatch discrimination | Discriminator sees statistics across the batch, detecting low diversity | gans/ (concept, not a variant) |
| Conditional GAN | Conditioning on class labels forces the generator to cover all classes | gans/ (cGAN variant) |
| Diffusion models | Replace adversarial training entirely with a per-example denoising objective | diffusion/ |
| Unrolled GANs | Generator optimises against future discriminator states, not just the current one | (Metz et al., 2017) |

Mode collapse was recognised as a major GAN failure mode almost immediately after Goodfellow et al. introduced GANs in 2014. Early GAN training was notoriously unreliable — practitioners shared folklore recipes (careful learning rate ratios, progressive growing, etc.) more than principled solutions. The theoretical understanding came later: Arjovsky & Bottou (2017) showed that the Jensen-Shannon divergence used by vanilla GANs is flat when the generator and data distributions have non-overlapping supports (which is almost always the case in high dimensions), providing no useful gradient signal and pushing the generator toward mode-seeking behaviour. This analysis motivated the Wasserstein distance (WGAN), which remains one of the cleanest theoretical contributions to GAN training. Ultimately, mode collapse was a major motivation for the field’s shift toward diffusion models starting around 2020.