• Algorithms
  • Also: gradient vanishing in quantum circuits
  • Also: exponentially vanishing gradients

Barren Plateau

A barren plateau is a phenomenon in variational quantum algorithms where gradients of the cost function vanish exponentially with system size, making optimization with gradient-based methods infeasible.

Variational quantum algorithms work by adjusting a set of circuit parameters to minimize a cost function, using gradient-based optimization similar to training a classical neural network. The barren plateau problem is a fundamental obstacle: for many natural circuit architectures, the gradient of the cost function with respect to any single parameter becomes exponentially small as the number of qubits grows. In a barren plateau, the loss landscape is almost perfectly flat everywhere, and gradient-based optimizers stall completely because they cannot detect which direction leads downhill.

Why gradients vanish

The mathematical source of barren plateaus is related to how random quantum circuits explore the space of all unitary operations. When a circuit has enough layers and entanglement to approximate a random unitary (a property called forming a unitary 2-design, or more generally an approximate unitary t-design), the expectation value of any local observable becomes nearly the same everywhere in parameter space. The variance of the gradient with respect to a single parameter scales as O(2^-n) for n qubits. With 50 qubits, that variance is roughly 2^-50 ≈ 10^-15, so typical gradient magnitudes are on the order of 10^-8, far too small to resolve with a realistic number of measurement shots.
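This concentration can be illustrated numerically without simulating circuits at all. The sketch below (plain NumPy; using Haar-random state vectors as a stand-in for the output of deep random circuits, which is an assumption, not the full circuit picture) samples random states and shows that the variance of the single-qubit observable <Z_0> shrinks by roughly half per added qubit; the exact value for Haar-random states is 1/(2^n + 1).

```python
import numpy as np

rng = np.random.default_rng(0)

def haar_random_state(n_qubits):
    """Sample a Haar-random pure state on n_qubits qubits."""
    dim = 2 ** n_qubits
    psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return psi / np.linalg.norm(psi)

def z0_expectation(psi):
    """<Z_0>: +1 weight on basis states whose first qubit is 0, -1 otherwise."""
    probs = np.abs(psi) ** 2
    half = len(probs) // 2
    return probs[:half].sum() - probs[half:].sum()

variances = {}
for n in (2, 4, 6, 8):
    samples = [z0_expectation(haar_random_state(n)) for _ in range(4000)]
    variances[n] = np.var(samples)
    print(f"n={n}: Var[<Z_0>] = {variances[n]:.4f}  (exact: {1 / (2**n + 1):.4f})")
```

The same concentration applies to the cost gradient, since each partial derivative is itself a difference of two such expectation values.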

The original barren plateau result (McClean et al., 2018) applied to random circuit structures. Subsequent work showed that barren plateaus also arise from:

  • Global cost functions. Cost functions defined on all qubits simultaneously (such as overlap with a target state) produce gradients that vanish even faster than those of local cost functions, which measure only a few qubits at a time.
  • Noise-induced barren plateaus. Even for structured, non-random circuits, hardware noise exponentially contracts the effective state toward the maximally mixed state, flattening the landscape independently of circuit structure.
  • High entanglement. Circuits that rapidly generate high entanglement tend to produce barren plateaus even when not fully random.

Diagnosing and mitigating barren plateaus

Detecting a barren plateau requires estimating the variance of the gradient across random parameter initializations. If that variance decays exponentially with qubit count, the landscape is a barren plateau. The check is computationally expensive, but worthwhile before investing optimization effort in a potentially untrainable landscape.
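The diagnostic can be sketched with a small statevector simulation. The code below is a NumPy-only illustration under assumed choices: a hardware-efficient ansatz of RY rotations plus CNOT chains, the cost <Z_0>, and depth growing with qubit count so the circuit behaves approximately randomly. It estimates the variance of one parameter-shift gradient over random initializations and shows it shrinking as qubits are added.

```python
import numpy as np

def ry(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def apply_1q(gate, psi, qubit, n):
    """Apply a single-qubit gate to one qubit of an n-qubit statevector."""
    psi = np.tensordot(gate, psi.reshape([2] * n), axes=([1], [qubit]))
    return np.moveaxis(psi, 0, qubit).reshape(-1)

def apply_cnot(psi, ctrl, tgt, n):
    """CNOT: flip the target axis on the ctrl=1 slice of the state tensor."""
    psi = psi.reshape([2] * n).copy()
    sl = [slice(None)] * n
    sl[ctrl] = 1
    axis = tgt - 1 if tgt > ctrl else tgt  # ctrl axis is removed by the slice
    psi[tuple(sl)] = np.flip(psi[tuple(sl)], axis=axis)
    return psi.reshape(-1)

def cost(thetas, n, layers):
    """<Z_0> after a hardware-efficient RY + CNOT-chain circuit."""
    psi = np.zeros(2 ** n, dtype=complex)
    psi[0] = 1.0
    k = 0
    for _ in range(layers):
        for q in range(n):
            psi = apply_1q(ry(thetas[k]), psi, q, n)
            k += 1
        for q in range(n - 1):
            psi = apply_cnot(psi, q, q + 1, n)
    probs = np.abs(psi) ** 2
    half = len(probs) // 2
    return probs[:half].sum() - probs[half:].sum()

def grad0(thetas, n, layers):
    """Exact parameter-shift gradient with respect to the first parameter."""
    plus, minus = thetas.copy(), thetas.copy()
    plus[0] += np.pi / 2
    minus[0] -= np.pi / 2
    return 0.5 * (cost(plus, n, layers) - cost(minus, n, layers))

rng = np.random.default_rng(1)
grad_var = {}
for n in (2, 4, 6):
    layers = 2 * n  # deep enough to behave approximately randomly
    grads = [grad0(rng.uniform(0, 2 * np.pi, n * layers), n, layers)
             for _ in range(500)]
    grad_var[n] = np.var(grads)
    print(f"n={n}: Var[dC/dtheta_0] = {grad_var[n]:.5f}")
```

On real hardware each cost evaluation is itself a noisy estimate from finitely many shots, so this diagnostic is considerably more expensive there than in simulation.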

Several mitigation strategies have been proposed:

  • Local cost functions. Replacing global cost functions with sums of terms acting on small subsystems keeps gradients at worst polynomially small for shallow circuits, rather than exponentially small. The trade-off is that local cost functions may have spurious local minima or fail to encode the target problem faithfully.
  • Structured initialization. Starting parameters near the identity (so the circuit begins as a near-trivial operation) can preserve gradient signal in early training. This is sometimes called identity initialization or layerwise training.
  • Layer-by-layer training. Training one layer at a time and freezing it before adding the next prevents the full circuit from immediately behaving like a random unitary.
  • Quantum natural gradient. Using the quantum Fisher information matrix to precondition gradient updates can improve optimization in shallow regimes, though it does not fundamentally eliminate barren plateaus.
  • Alternative optimizers. Gradient-free methods (Nelder-Mead, COBYLA, evolutionary strategies) can sometimes navigate flat landscapes by searching more broadly, but they scale poorly with parameter count.
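The effect of the first mitigation, trading a global cost for a local one, can be seen in a small simulation. The sketch below (NumPy only; the shallow RY-plus-CNOT circuit, the global cost 1 - |<0...0|psi>|^2, and the minimal local cost (1 - <Z_0>)/2 are illustrative assumptions) compares parameter-shift gradients of the two costs at a fixed shallow depth: the global gradient variance collapses with qubit count while the local one stays roughly constant.

```python
import numpy as np

def ry(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def apply_1q(gate, psi, qubit, n):
    psi = np.tensordot(gate, psi.reshape([2] * n), axes=([1], [qubit]))
    return np.moveaxis(psi, 0, qubit).reshape(-1)

def apply_cnot(psi, ctrl, tgt, n):
    psi = psi.reshape([2] * n).copy()
    sl = [slice(None)] * n
    sl[ctrl] = 1
    axis = tgt - 1 if tgt > ctrl else tgt  # ctrl axis is removed by the slice
    psi[tuple(sl)] = np.flip(psi[tuple(sl)], axis=axis)
    return psi.reshape(-1)

def circuit_state(thetas, n, layers):
    """Shallow hardware-efficient circuit: RY layers + CNOT chains."""
    psi = np.zeros(2 ** n, dtype=complex)
    psi[0] = 1.0
    k = 0
    for _ in range(layers):
        for q in range(n):
            psi = apply_1q(ry(thetas[k]), psi, q, n)
            k += 1
        for q in range(n - 1):
            psi = apply_cnot(psi, q, q + 1, n)
    return psi

def global_cost(psi):
    """1 - |<0...0|psi>|^2: overlap with a target state on all qubits."""
    return 1.0 - np.abs(psi[0]) ** 2

def local_cost(psi):
    """(1 - <Z_0>)/2: reads out a single qubit only."""
    probs = np.abs(psi) ** 2
    half = len(probs) // 2
    return 0.5 * (1.0 - (probs[:half].sum() - probs[half:].sum()))

def grad0(thetas, n, layers, cost_fn):
    """Parameter-shift gradient with respect to the first parameter."""
    plus, minus = thetas.copy(), thetas.copy()
    plus[0] += np.pi / 2
    minus[0] -= np.pi / 2
    return 0.5 * (cost_fn(circuit_state(plus, n, layers))
                  - cost_fn(circuit_state(minus, n, layers)))

rng = np.random.default_rng(2)
layers = 2  # fixed shallow depth
var_global, var_local = {}, {}
for n in (2, 4, 8):
    thetas = [rng.uniform(0, 2 * np.pi, n * layers) for _ in range(400)]
    var_global[n] = np.var([grad0(t, n, layers, global_cost) for t in thetas])
    var_local[n] = np.var([grad0(t, n, layers, local_cost) for t in thetas])
    print(f"n={n}: global Var = {var_global[n]:.5f}  local Var = {var_local[n]:.5f}")
```

As noted above, the local cost here buys trainability at a price: it constrains only one qubit, so a low local cost need not mean the circuit has prepared the intended global state.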

Implications for quantum machine learning

Barren plateaus are a serious challenge for quantum neural networks and quantum kernel methods that rely on deep, expressive circuits. Many proposed quantum machine learning models require circuits deep enough to represent useful functions, but those same depths push the landscape toward a barren plateau. This has prompted debate about whether quantum machine learning can offer genuine advantage on classical data, or whether its most expressive models are fundamentally untrainable.

See also