Zero-Noise Extrapolation: Qiskit vs Mitiq
Compare zero-noise extrapolation implementations in Qiskit Runtime and Mitiq. Learn gate folding, pulse stretching, Richardson extrapolation, and when each approach is preferable.
Circuit diagrams
What Is Zero-Noise Extrapolation?
Zero-noise extrapolation (ZNE) is an error mitigation technique that intentionally runs quantum circuits at several controlled noise levels, then extrapolates the results back to the zero-noise limit. Unlike quantum error correction, ZNE requires no ancilla qubits, no encoding overhead, and no detailed noise model. It works entirely at the level of classical post-processing.
The core insight: if you measure the expectation value of an observable at multiple noise strengths, you can fit a curve through those measurements and read off the y-intercept, which corresponds to zero noise.
ZNE addresses two distinct challenges:
- Noise scaling: how to amplify noise in a controlled way without changing the logical action of the circuit.
- Extrapolation: how to infer the zero-noise value from a small number of noisy data points, each with statistical uncertainty.
Both the Mitiq library (from Unitary Fund) and Qiskit Runtime (from IBM) implement ZNE, but they differ in control granularity, backend integration, and extensibility. This tutorial covers both in detail and helps you choose the right tool for your workflow.
Mathematical Foundation of ZNE
Understanding the math behind ZNE clarifies why the technique works, when it breaks, and how to choose extrapolation methods.
The Noise-Scaled Expectation Value
Let O be the observable you want to measure (for example, the Pauli operator ZZ). On a noiseless quantum computer, the expectation value is:
E_ideal = Tr[rho_ideal * O]
where rho_ideal = |psi><psi| is the pure state produced by the circuit. On real hardware, the state is mixed due to noise. We model the noise strength with a scale factor lambda, where lambda = 1 represents the natural hardware noise. The noisy expectation value is:
E(lambda) = Tr[rho(lambda) * O]
Our goal is to estimate E(0) = E_ideal, the expectation value at zero noise.
Taylor Series Expansion
Assume E(lambda) is analytic (smooth and differentiable). We can expand it as a Taylor series around lambda = 0:
E(lambda) = E_0 + c_1 * lambda + c_2 * lambda^2 + c_3 * lambda^3 + ...
where E_0 = E(0) is the noiseless value we want, and c_1, c_2, c_3, ... are unknown coefficients that depend on the noise channel and observable.
If we measure E(lambda) at several scale factors, we can set up a system of equations and solve for E_0.
Worked Example: Richardson Extrapolation with Three Scale Factors
Suppose we measure at scale factors lambda = 1, 3, 5. Truncating the Taylor series at second order:
E(1) = E_0 + c_1 * 1 + c_2 * 1 ... (equation 1)
E(3) = E_0 + c_1 * 3 + c_2 * 9 ... (equation 2)
E(5) = E_0 + c_1 * 5 + c_2 * 25 ... (equation 3)
This is a system of three equations in three unknowns (E_0, c_1, c_2). Solving by elimination:
Subtract equation 1 from equation 2:
E(3) - E(1) = 2*c_1 + 8*c_2 ... (equation 4)
Subtract equation 2 from equation 3:
E(5) - E(3) = 2*c_1 + 16*c_2 ... (equation 5)
Subtract equation 4 from equation 5:
E(5) - 2*E(3) + E(1) = 8*c_2
=> c_2 = [E(5) - 2*E(3) + E(1)] / 8
From equation 4:
c_1 = [E(3) - E(1) - 8*c_2] / 2
= [E(3) - E(1) - E(5) + 2*E(3) - E(1)] / 2
= [3*E(3) - 2*E(1) - E(5)] / 2
Substituting back into equation 1:
E_0 = E(1) - c_1 - c_2
= E(1) - [3*E(3) - 2*E(1) - E(5)] / 2 - [E(5) - 2*E(3) + E(1)] / 8
Simplifying:
E_0 = [15*E(1) - 10*E(3) + 3*E(5)] / 8
These are the Richardson extrapolation coefficients for scale factors [1, 3, 5]. The coefficients (15/8, -10/8, 3/8) sum to 1, as required for consistency (if noise were zero at all scale factors, E_0 = E).
Numerical Verification
Let’s verify with a concrete example. Suppose the true noise curve is:
E(lambda) = 0.95 - 0.05*lambda - 0.01*lambda^2
so that E_0 = 0.95. Then:
E(1) = 0.95 - 0.05 - 0.01 = 0.89
E(3) = 0.95 - 0.15 - 0.09 = 0.71
E(5) = 0.95 - 0.25 - 0.25 = 0.45
Applying the Richardson formula:
E_0 = (15 * 0.89 - 10 * 0.71 + 3 * 0.45) / 8
= (13.35 - 7.10 + 1.35) / 8
= 7.60 / 8
= 0.95 ✓
The result is exact because the true curve is a second-order polynomial and we used three data points (enough to cancel terms up to order 2). If the true curve has higher-order terms, the Richardson result contains residual bias proportional to those terms.
import numpy as np
# Verify Richardson extrapolation coefficients numerically
scale_factors = np.array([1.0, 3.0, 5.0])
# True noise curve: E(lambda) = 0.95 - 0.05*lambda - 0.01*lambda^2
E_true = lambda lam: 0.95 - 0.05 * lam - 0.01 * lam**2
measurements = np.array([E_true(s) for s in scale_factors])
print(f"E(1) = {measurements[0]:.4f}")
print(f"E(3) = {measurements[1]:.4f}")
print(f"E(5) = {measurements[2]:.4f}")
# Richardson coefficients for [1, 3, 5]
coeffs = np.array([15, -10, 3]) / 8.0
E_0_richardson = np.dot(coeffs, measurements)
print(f"\nRichardson extrapolation: E_0 = {E_0_richardson:.4f}")
print(f"Coefficients sum to: {np.sum(coeffs):.4f}")
# General Richardson coefficients via Vandermonde solve
V = np.vander(scale_factors, increasing=True)
e_0_vec = np.zeros(len(scale_factors))
e_0_vec[0] = 1.0 # We want the constant term
general_coeffs = np.linalg.solve(V.T, e_0_vec)
print(f"General coefficients: {general_coeffs}")
print(f"General E_0 = {np.dot(general_coeffs, measurements):.4f}")
Noise Scaling Methods
Gate Folding
Gate folding replaces each gate G with the sequence G * G† * G. Because G * G† = I (identity), the logical action is unchanged, but the physical noise triples: each of the three gate executions introduces its own error. One fold gives scale factor lambda = 3, two folds give lambda = 5, and so on.
For a circuit with gates G_1, G_2, ..., G_n, global folding at scale factor 3 produces:
G_1 G_1† G_1 G_2 G_2† G_2 ... G_n G_n† G_n
The total circuit depth triples, but the logical output is identical to the original circuit.
Partial Folding: Global, Left, Right, and Random
Global folding folds every gate in the circuit by the same amount. This is straightforward for odd integer scale factors (1, 3, 5, 7, …), but what about non-integer scale factors like 1.5 or 2.0?
Partial folding solves this by selecting a subset of gates to fold. For scale factor 1.5 on a 10-gate circuit, you fold 5 of the 10 gates once each. The total gate count goes from 10 to 15, giving an effective scale factor of 15/10 = 1.5.
Mitiq provides three partial folding strategies:
-
fold_global: Folds the entire circuit uniformly. For integer scale factors, every gate is folded the same number of times. For non-integer factors, the circuit is folded globally as many full times as possible, then the remaining gates are folded starting from the end of the circuit. -
fold_gates_from_left: Selects gates to fold starting from the beginning of the circuit. Useful when you know the early gates in your circuit are the noisiest (for example, state preparation gates on hardware with warm-up effects). -
fold_gates_from_right: Selects gates from the end of the circuit. Since later gates have less time for decoherence to wash out their errors, folding them can be more effective on T1/T2-limited hardware. -
fold_gates_at_random: Randomly selects which gates to fold. This avoids systematic bias from always folding the same gates and tends to produce more uniform noise amplification across the circuit. Recommended as a default unless you have reason to prefer a specific ordering.
from mitiq.zne.scaling import (
fold_global,
fold_gates_at_random,
fold_gates_from_left,
fold_gates_from_right,
)
import cirq
# Build a simple 2-qubit circuit
q0, q1 = cirq.LineQubit.range(2)
circuit = cirq.Circuit([
cirq.H(q0),
cirq.CNOT(q0, q1),
cirq.rz(0.5)(q0),
cirq.CNOT(q0, q1),
])
print("Original circuit:")
print(circuit)
print(f"Gate count: {len(list(circuit.all_operations()))}")
# Global fold at scale factor 3 (every gate folded once)
folded_global = fold_global(circuit, scale_factor=3.0)
print(f"\nGlobal fold (3x): {len(list(folded_global.all_operations()))} gates")
# Partial fold at scale factor 1.5 (half the gates folded)
folded_random = fold_gates_at_random(circuit, scale_factor=1.5)
print(f"Random fold (1.5x): {len(list(folded_random.all_operations()))} gates")
folded_left = fold_gates_from_left(circuit, scale_factor=1.5)
print(f"Left fold (1.5x): {len(list(folded_left.all_operations()))} gates")
folded_right = fold_gates_from_right(circuit, scale_factor=1.5)
print(f"Right fold (1.5x): {len(list(folded_right.all_operations()))} gates")
When to use each folding strategy:
| Strategy | Best for | Drawback |
|---|---|---|
fold_global | Uniform noise models | Cannot produce all fractional scale factors cleanly |
fold_gates_at_random | General-purpose (recommended default) | Non-deterministic, slight variance between runs |
fold_gates_from_left | Circuits where early gates are noisiest | Biased amplification |
fold_gates_from_right | Circuits where late gates are noisiest | Biased amplification |
Pulse Stretching
On backends that expose pulse-level control (such as IBM backends with the Qiskit Pulse API), you can stretch the duration of microwave pulses that implement gates. A 2x-stretched CNOT pulse takes twice as long, accumulating roughly twice the decoherence error. Pulse stretching has two advantages over gate folding:
- It produces continuous scale factors (not limited to odd integers or partial folding approximations).
- It amplifies the actual decoherence noise (T1, T2 processes) rather than replicating the full gate error.
The disadvantage is that pulse stretching requires backend-level access and calibration data. Not all hardware platforms support it.
ZNE in Mitiq with Qiskit Circuits
Mitiq supports Qiskit circuits natively through the mitiq.interface.mitiq_qiskit module. This section shows the complete workflow for Qiskit users.
Basic Mitiq-Qiskit Integration
The key requirement is an executor function: a callable that takes a quantum circuit and returns a single float (the expectation value). Mitiq handles the noise scaling and extrapolation; you handle the circuit execution.
import numpy as np
from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator
from qiskit_aer.noise import NoiseModel, depolarizing_error
from qiskit.quantum_info import SparsePauliOp, Statevector
from mitiq import zne
from mitiq.zne.inference import RichardsonFactory, LinearFactory
from mitiq.zne.scaling import fold_gates_at_random
def make_ghz_circuit(n_qubits: int) -> QuantumCircuit:
"""Create an n-qubit GHZ state circuit."""
qc = QuantumCircuit(n_qubits)
qc.h(0)
for i in range(n_qubits - 1):
qc.cx(i, i + 1)
return qc
def make_noisy_executor(noise_level: float = 0.01, shots: int = 8192):
"""Return an executor that simulates a Qiskit circuit with depolarizing noise.
The executor measures the ZZZZ expectation value (all-Z correlator) for a
4-qubit circuit. Mitiq calls this function at each noise scale factor.
"""
# Set up noise model: depolarizing error on all 2-qubit gates
noise_model = NoiseModel()
error_2q = depolarizing_error(noise_level, 2)
noise_model.add_all_qubit_quantum_error(error_2q, ["cx"])
simulator = AerSimulator(noise_model=noise_model)
def executor(circuit: QuantumCircuit) -> float:
# Add measurements to the circuit
circ = circuit.copy()
circ.measure_all()
# Transpile and run
t_circ = transpile(circ, simulator)
result = simulator.run(t_circ, shots=shots).result()
counts = result.get_counts()
# Compute <ZZZZ> expectation value from counts
n_qubits = circuit.num_qubits
exp_val = 0.0
for bitstring, count in counts.items():
# Parity of the bitstring: +1 if even number of 1s, -1 if odd
parity = (-1) ** bitstring.count("1")
exp_val += parity * count
exp_val /= shots
return exp_val
return executor
# Build the circuit
circuit = make_ghz_circuit(4)
print(circuit)
# Create a noisy executor (1% depolarizing noise on CX gates)
executor = make_noisy_executor(noise_level=0.01, shots=8192)
# Run without mitigation
unmitigated = executor(circuit)
print(f"Unmitigated <ZZZZ>: {unmitigated:.4f}")
# Run with ZNE (Richardson extrapolation)
factory = RichardsonFactory(scale_factors=[1.0, 3.0, 5.0])
mitigated = zne.execute_with_zne(
circuit=circuit,
executor=executor,
factory=factory,
scale_noise=fold_gates_at_random,
)
print(f"ZNE mitigated <ZZZZ>: {mitigated:.4f}")
print(f"Ideal <ZZZZ>: 1.0000")
Important Notes on the Executor
- The executor must accept a single
QuantumCircuitargument and return a singlefloat. If you need additional parameters (noise level, shots, backend), use a closure orfunctools.partial. - The circuit Mitiq passes to the executor already has gates folded. Do not add extra folding inside the executor.
- The executor should include measurement and classical post-processing. Mitiq does not handle measurements for you.
- For Qiskit circuits, Mitiq converts them internally to Cirq for folding, then converts back. This conversion supports standard gates but may fail on custom gates or pulse-level instructions.
ZNE in Qiskit Runtime
Qiskit Runtime integrates ZNE directly into the Estimator primitive. No circuit modification code is needed; Runtime handles noise scaling internally using backend calibration data.
Basic Usage with EstimatorV2
from qiskit_ibm_runtime import QiskitRuntimeService, Session, EstimatorV2 as Estimator
from qiskit_ibm_runtime.options import EstimatorOptions
from qiskit import QuantumCircuit
from qiskit.quantum_info import SparsePauliOp
# Build a test circuit
qc = QuantumCircuit(2)
qc.h(0)
qc.cx(0, 1)
observable = SparsePauliOp("ZZ")
service = QiskitRuntimeService()
backend = service.least_busy(operational=True, simulator=False, min_num_qubits=2)
with Session(backend=backend) as session:
# Resilience level 2 activates ZNE automatically
options = EstimatorOptions()
options.resilience_level = 2
estimator = Estimator(mode=session, options=options)
job = estimator.run([(qc, observable)])
result = job.result()
print(f"Qiskit Runtime ZNE result: {result[0].data.evs:.4f}")
Full ZNE Customization in Qiskit Runtime
The resilience_level = 2 setting activates ZNE with sensible defaults. You can override every ZNE parameter individually for full control.
from qiskit_ibm_runtime import QiskitRuntimeService, Session, EstimatorV2 as Estimator
from qiskit_ibm_runtime.options import EstimatorOptions
from qiskit import QuantumCircuit
from qiskit.quantum_info import SparsePauliOp
qc = QuantumCircuit(4)
qc.h(0)
for i in range(3):
qc.cx(i, i + 1)
observable = SparsePauliOp("ZZZZ")
service = QiskitRuntimeService()
backend = service.least_busy(operational=True, simulator=False, min_num_qubits=4)
with Session(backend=backend) as session:
options = EstimatorOptions()
# Resilience level 2 enables ZNE; you can then fine-tune below
options.resilience_level = 2
# ZNE-specific settings
options.resilience.zne_mitigation = True
options.resilience.zne.noise_factors = [1, 3, 5]
# Extrapolator options:
# "linear" - first-order polynomial fit
# "quadratic" - second-order polynomial fit
# "cubic" - third-order polynomial fit
# "quartic" - fourth-order polynomial fit
# "exponential" - single exponential decay fit
# "double_exponential" - sum of two exponential decays
options.resilience.zne.extrapolator = "exponential"
# TREX (Twirled Readout Error eXtinction) is bundled with resilience_level >= 1
# It handles measurement errors so ZNE only needs to correct gate errors
options.resilience.measure_mitigation = True
estimator = Estimator(mode=session, options=options)
job = estimator.run([(qc, observable)])
result = job.result()
print(f"ZNE mitigated <ZZZZ>: {result[0].data.evs:.4f}")
print(f"Standard error: {result[0].data.stds:.4f}")
Resilience Levels and Their Relationship to ZNE
| Level | What it enables | ZNE active? |
|---|---|---|
| 0 | No mitigation | No |
| 1 | TREX (readout error mitigation) | No |
| 2 | TREX + ZNE | Yes |
At level 2, Runtime applies TREX first to correct measurement errors, then applies ZNE to correct gate errors. This layered approach is more effective than either technique alone.
ZNE with SamplerV2: Why It Does Not Apply
SamplerV2 returns probability distributions (raw counts or quasi-probabilities), not expectation values. ZNE is fundamentally an expectation-value technique: it extrapolates a scalar E(lambda) to E(0). There is no well-defined way to extrapolate an entire probability distribution to zero noise because each probability would need independent extrapolation and the results might not sum to 1.
If you need error-mitigated probability distributions, consider:
- M3 (Matrix-free Measurement Mitigation): corrects readout errors on the full distribution. Available via
mthreeor Qiskit Runtime’s TREX. - PEC (Probabilistic Error Cancellation): provides unbiased error mitigation for distributions, but requires a full noise model and has exponential sampling overhead.
For expectation values, use EstimatorV2 with ZNE. For distributions, use SamplerV2 with readout mitigation only.
Extrapolation Methods Compared
Different extrapolators make different assumptions about how noise scales. Choosing the right one matters.
Linear Extrapolation
Assumes E(lambda) = a + b * lambda. Requires only two data points. Has the lowest variance among all extrapolators because it uses the fewest parameters, but it introduces systematic bias if the true noise curve is nonlinear. Best for shallow circuits where the noise is weak and the quadratic correction is negligible.
Richardson Extrapolation
Polynomial interpolation that passes exactly through all data points. With n data points, it cancels noise terms up to order n - 1. Richardson extrapolation is exact when the true E(lambda) is a polynomial of degree n - 1 or less. For higher-order terms, the residual bias shrinks as you add more scale factors, but the variance grows (the coefficients alternate in sign and grow in magnitude).
Polynomial Fit (PolyFactory)
Fits a polynomial of specified degree to possibly more data points than the polynomial order. This is a least-squares fit rather than interpolation, so it smooths out shot noise better than Richardson. Use PolyFactory(scale_factors=[1, 2, 3, 4, 5], order=2) to fit a quadratic through five data points.
Exponential Fit (ExpFactory)
Assumes E(lambda) = A + B * exp(-C * lambda). This functional form is physically motivated: for circuits dominated by T2 decoherence, the expectation value decays exponentially with circuit duration (and therefore with noise scale). Often outperforms polynomial extrapolators on real hardware. Requires an estimate of the asymptote A (the value E(lambda) approaches at infinite noise). For Pauli observables, this is typically 0.
from mitiq.zne.inference import (
LinearFactory,
RichardsonFactory,
PolyFactory,
ExpFactory,
)
# Linear: 2 points, lowest variance, potential bias
linear = LinearFactory(scale_factors=[1.0, 3.0])
# Richardson: exact polynomial cancellation, n points cancel n-1 orders
richardson = RichardsonFactory(scale_factors=[1.0, 3.0, 5.0])
# Polynomial least-squares: more points than order, smooths shot noise
poly2 = PolyFactory(scale_factors=[1.0, 2.0, 3.0, 4.0, 5.0], order=2)
# Exponential: physically motivated for decoherence-dominated noise
exp_factory = ExpFactory(scale_factors=[1.0, 3.0, 5.0], asymptote=0.0)
Summary Table
| Method | Assumption | Min. points | Variance | Bias | Best for |
|---|---|---|---|---|---|
| Linear | Linear in lambda | 2 | Lowest | High if nonlinear | Shallow circuits, quick estimates |
| Richardson | Polynomial of degree n-1 | n | Grows with n | Exact for degree <= n-1 | Moderate-depth circuits, when polynomial model holds |
| PolyFactory | Polynomial least-squares | > order | Medium | Smoothed | Many data points available |
| ExpFactory | Exponential decay | 3 | Medium | Low for T2-dominated noise | Real hardware, decoherence-limited circuits |
Choosing Scale Factors
The choice of scale factors significantly affects ZNE quality. Here are the key considerations.
Odd Integers for Gate Folding
Each gate fold replaces G with G G† G, tripling the gate count. This means one fold gives scale factor 3, two folds give scale factor 5, and so on. The achievable integer scale factors for gate folding are 1, 3, 5, 7, 9, ... (odd integers). Even integers like 2 or 4 are not achievable with global folding but can be approximated with partial folding.
For simplicity and reproducibility, prefer odd integer scale factors: [1, 3, 5] is a standard choice.
Spanning the Right Range
Scale factors should span a range where the expectation value visibly changes but has not hit the noise floor. If the signal is already zero at scale factor 3, using [1, 3, 5] gives you two noise-floor measurements that carry no useful information.
Diagnostic procedure: Run at [1, 2, 3, 4, 5] (using partial folding for even values) and plot E vs. scale factor. Identify the scale factor where the signal approaches zero or the noise floor. Only use scale factors below that threshold.
import numpy as np
import matplotlib.pyplot as plt
from mitiq import zne
from mitiq.zne.scaling import fold_gates_at_random
# Diagnostic: sweep scale factors to find useful range
test_scale_factors = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
expectations = []
for sf in test_scale_factors:
# Fold the circuit and measure (using your executor)
folded = fold_gates_at_random(circuit, scale_factor=sf)
exp_val = executor(folded) # your noisy executor
expectations.append(exp_val)
plt.figure(figsize=(8, 5))
plt.plot(test_scale_factors, expectations, "o-", label="Measured E(lambda)")
plt.axhline(y=0, color="gray", linestyle="--", alpha=0.5, label="Noise floor")
plt.xlabel("Scale factor (lambda)")
plt.ylabel("Expectation value")
plt.title("ZNE Diagnostic: Signal vs. Noise Scale")
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
# Rule of thumb: use scale factors where E(lambda) is still above
# ~20% of the unmitigated value E(1)
threshold = 0.2 * expectations[0]
usable = [sf for sf, e in zip(test_scale_factors, expectations) if abs(e) > threshold]
print(f"Usable scale factors: {usable}")
Pulse Stretching: Continuous Values
For pulse stretching, you are not limited to odd integers. Any positive real number works as a scale factor, including [1.0, 1.5, 2.0, 2.5, 3.0]. Continuous scale factors allow finer sampling of the noise curve and better fits.
Statistical Variance of ZNE
ZNE amplifies both the signal and the statistical noise. Understanding this trade-off is essential for setting shot budgets.
Shot Noise at Each Scale Factor
For a Pauli observable with expectation value E, the standard deviation of the sample mean from N shots is:
sigma = sqrt((1 - E^2) / N)
At higher noise levels, |E| decreases, so 1 - E^2 increases, and the shot noise grows. For example, with N = 1000 shots:
| Scale factor | E(lambda) | sigma |
|---|---|---|
| 1 (unmitigated) | 0.90 | sqrt(1 - 0.81) / sqrt(1000) = 0.0138 |
| 3 | 0.70 | sqrt(1 - 0.49) / sqrt(1000) = 0.0226 |
| 5 | 0.45 | sqrt(1 - 0.2025) / sqrt(1000) = 0.0282 |
The shot noise at scale factor 5 is roughly twice the shot noise at scale factor 1.
Uncertainty Propagation for Linear ZNE
For linear extrapolation with two data points at scale factors lambda_1 = 1 and lambda_2 = 3:
E_0 = (lambda_2 * E(lambda_1) - lambda_1 * E(lambda_2)) / (lambda_2 - lambda_1)
= (3 * E(1) - 1 * E(3)) / 2
The variance of the extrapolated value is:
Var(E_0) = (lambda_2 / (lambda_2 - lambda_1))^2 * Var(E(lambda_1))
+ (lambda_1 / (lambda_2 - lambda_1))^2 * Var(E(lambda_2))
= (3/2)^2 * sigma_1^2 + (1/2)^2 * sigma_2^2
= 2.25 * sigma_1^2 + 0.25 * sigma_2^2
Plugging in the values from the table above:
Var(E_0) = 2.25 * (0.0138)^2 + 0.25 * (0.0226)^2
= 2.25 * 0.000190 + 0.25 * 0.000511
= 0.000428 + 0.000128
= 0.000556
sigma(E_0) = sqrt(0.000556) = 0.0236
Compare this to the unmitigated standard deviation of 0.0138. ZNE reduces the bias (from ~0.10 to ~0.01) but increases the variance by a factor of ~1.7. This is the fundamental bias-variance trade-off of ZNE.
Practical Implication: Shot Budgets
To maintain the same statistical precision after ZNE, you need more shots. A rough rule: if the extrapolation coefficients amplify the variance by factor k, you need k times as many shots per circuit execution. For Richardson extrapolation with [1, 3, 5], the variance amplification is typically 3x to 10x, depending on the coefficient magnitudes.
import numpy as np
def zne_variance_amplification(scale_factors, noise_curve, n_shots):
"""Compute the variance amplification factor for Richardson ZNE.
Args:
scale_factors: List of noise scale factors.
noise_curve: Callable, E(lambda).
n_shots: Number of shots per circuit.
Returns:
Ratio of ZNE variance to unmitigated variance.
"""
# Richardson coefficients via Vandermonde system
n = len(scale_factors)
V = np.vander(scale_factors, increasing=True)
e_0_vec = np.zeros(n)
e_0_vec[0] = 1.0
coeffs = np.linalg.solve(V.T, e_0_vec)
# Variance at each scale factor
variances = []
for sf in scale_factors:
E = noise_curve(sf)
var = (1 - E**2) / n_shots
variances.append(var)
# Propagated variance
zne_var = sum(c**2 * v for c, v in zip(coeffs, variances))
# Unmitigated variance
E1 = noise_curve(scale_factors[0])
unmitigated_var = (1 - E1**2) / n_shots
return zne_var / unmitigated_var
# Example
noise_curve = lambda lam: 0.95 * np.exp(-0.1 * lam)
amplification = zne_variance_amplification([1, 3, 5], noise_curve, 1000)
print(f"Variance amplification factor: {amplification:.1f}x")
print(f"Need ~{amplification:.0f}x more shots to match unmitigated precision")
Benchmark: ZNE on a 4-Qubit GHZ State
This section provides a complete, runnable benchmark comparing unmitigated results, linear ZNE, and Richardson ZNE on a concrete circuit.
Setup
- Circuit: 4-qubit GHZ state (H on qubit 0, then CNOT chain)
- Observable: ZZZZ (all-Z correlator)
- Ideal value: +1.0 (the GHZ state
(|0000> + |1111>) / sqrt(2)has<ZZZZ> = 1) - Noise model: 1% depolarizing noise on every CX gate
- Shots: 8192 per circuit execution
import numpy as np
from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator
from qiskit_aer.noise import NoiseModel, depolarizing_error
from mitiq import zne
from mitiq.zne.inference import RichardsonFactory, LinearFactory
from mitiq.zne.scaling import fold_gates_at_random
def make_ghz4() -> QuantumCircuit:
"""4-qubit GHZ state circuit."""
qc = QuantumCircuit(4)
qc.h(0)
qc.cx(0, 1)
qc.cx(1, 2)
qc.cx(2, 3)
return qc
def make_executor(noise_level: float, shots: int):
"""Create an executor that measures <ZZZZ> on a noisy simulator."""
noise_model = NoiseModel()
noise_model.add_all_qubit_quantum_error(
depolarizing_error(noise_level, 2), ["cx"]
)
sim = AerSimulator(noise_model=noise_model)
def executor(circuit: QuantumCircuit) -> float:
circ = circuit.copy()
circ.measure_all()
t_circ = transpile(circ, sim)
result = sim.run(t_circ, shots=shots).result()
counts = result.get_counts()
exp_val = 0.0
for bitstring, count in counts.items():
parity = (-1) ** bitstring.count("1")
exp_val += parity * count
return exp_val / shots
return executor
# Configuration
noise_level = 0.01
shots = 8192
circuit = make_ghz4()
executor = make_executor(noise_level, shots)
# 1. Unmitigated
unmitigated = executor(circuit)
print(f"Unmitigated <ZZZZ>: {unmitigated:.4f} (error: {abs(1.0 - unmitigated):.4f})")
# 2. ZNE with linear extrapolation
linear_factory = LinearFactory(scale_factors=[1.0, 3.0])
mitigated_linear = zne.execute_with_zne(
circuit=circuit,
executor=executor,
factory=linear_factory,
scale_noise=fold_gates_at_random,
)
print(f"ZNE (linear) <ZZZZ>: {mitigated_linear:.4f} (error: {abs(1.0 - mitigated_linear):.4f})")
# 3. ZNE with Richardson extrapolation
richardson_factory = RichardsonFactory(scale_factors=[1.0, 3.0, 5.0])
mitigated_richardson = zne.execute_with_zne(
circuit=circuit,
executor=executor,
factory=richardson_factory,
scale_noise=fold_gates_at_random,
)
print(f"ZNE (Richardson) <ZZZZ>: {mitigated_richardson:.4f} (error: {abs(1.0 - mitigated_richardson):.4f})")
print(f"\nIdeal <ZZZZ>: 1.0000")
Expected Results
With 1% depolarizing noise on each of the three CX gates, typical results are:
| Method | <ZZZZ> | Residual error |
|---|---|---|
| Unmitigated | ~0.72 | ~0.28 |
| Linear ZNE | ~0.94 | ~0.06 |
| Richardson ZNE | ~0.97 | ~0.03 |
Richardson outperforms linear because the true noise curve has significant quadratic contribution at 1% per CX. The residual error in Richardson comes from higher-order terms and shot noise.
Clifford Data Regression (CDR): A Complement to ZNE
Mitiq implements Clifford Data Regression (CDR), a technique that complements ZNE by learning a noise model from “training” circuits whose ideal outputs are efficiently computable.
How CDR Works
- Generate training circuits: Replace non-Clifford gates (like T and Rz(theta)) in your target circuit with nearby Clifford gates. The resulting “near-Clifford” circuits can be simulated classically in polynomial time.
- Collect training data: Run each training circuit on the noisy hardware to get noisy expectation values. Also compute the ideal expectation values classically.
- Fit a correction model: Learn a linear (or nonlinear) mapping from noisy to ideal expectation values using the training data.
- Apply to target: Run the original circuit on noisy hardware, then apply the learned correction.
CDR in Mitiq
from mitiq import cdr
from qiskit import QuantumCircuit
import numpy as np
def ideal_executor(circuit: QuantumCircuit) -> float:
"""Compute the ideal (noiseless) expectation value via statevector simulation."""
from qiskit_aer import AerSimulator
sim = AerSimulator(method="statevector")
circ = circuit.copy()
circ.save_statevector()
result = sim.run(circ).result()
sv = result.get_statevector()
# Compute <ZZZZ> from statevector
n_qubits = circuit.num_qubits
probs = np.abs(sv.data) ** 2
exp_val = 0.0
for i, p in enumerate(probs):
bitstring = format(i, f"0{n_qubits}b")
parity = (-1) ** bitstring.count("1")
exp_val += parity * p
return exp_val
# noisy_executor is the same as in the ZNE example above
# (accepts a QuantumCircuit, returns a float)
# Run CDR
mitigated_cdr = cdr.execute_with_cdr(
circuit=circuit,
executor=executor, # noisy executor from earlier
simulator=ideal_executor, # noiseless simulation for training data
num_training_circuits=20,
fraction_non_clifford=0.1, # Replace 10% of non-Clifford gates
)
print(f"CDR mitigated <ZZZZ>: {mitigated_cdr:.4f}")
When CDR Beats ZNE
| Scenario | ZNE | CDR |
|---|---|---|
| Noise is predominantly incoherent (depolarizing, T1/T2) | Good | Good |
| Noise has coherent components (systematic over/under-rotation) | Poor (non-monotonic) | Better (learns the systematic pattern) |
| Circuit has few non-Clifford gates | Overkill | Ideal (many training circuits available) |
| Circuit is all Clifford gates | Works but unnecessary | Cannot generate training data (no gates to replace) |
| No classical simulator available | Works | Cannot work (needs ideal executor) |
CDR requires a classical simulator for the training circuits. For circuits with many non-Clifford gates, generating good training circuits becomes harder because the replacement changes the circuit significantly.
When ZNE Fails: Concrete Failure Modes
ZNE is not a universal fix. Knowing when it fails saves you from silently getting wrong answers.
Failure Mode 1: Non-Monotonic Noise
ZNE assumes that increasing the noise scale factor always pushes E(lambda) closer to the noise floor (typically 0 for Pauli observables). This assumption breaks when coherent errors are present.
Coherent errors (systematic gate over-rotations, for example) accumulate deterministically. Folding a gate three times triples the coherent error angle, which can cause the expectation value to oscillate rather than decay monotonically.
Example: Suppose a CX gate has a coherent Z-rotation error of angle theta. After folding to scale factor 3, the accumulated angle is 3*theta. If 3*theta crosses pi, the expectation value flips sign, and the extrapolation curve is non-monotonic.
Diagnosis: Plot E(lambda) vs. lambda for several scale factors. If the curve is not monotonically decreasing (in absolute value), ZNE extrapolation will be unreliable.
import numpy as np
# Simulated non-monotonic noise curve (coherent error)
def coherent_noise_curve(lam):
"""E(lambda) with both coherent and incoherent components."""
incoherent = 0.95 * np.exp(-0.05 * lam) # Decays monotonically
coherent = np.cos(0.3 * lam) # Oscillates
return incoherent * coherent
lambdas = np.linspace(1, 7, 50)
values = [coherent_noise_curve(l) for l in lambdas]
# Check for non-monotonicity
diffs = np.diff(values)
sign_changes = np.sum(np.diff(np.sign(diffs)) != 0)
print(f"Sign changes in E(lambda): {sign_changes}")
if sign_changes > 0:
print("WARNING: Non-monotonic noise curve detected. ZNE may be unreliable.")
Failure Mode 2: Overfolding
At high scale factors (7x, 9x, or above), the circuit noise may be so large that the expectation value is indistinguishable from zero within shot noise. These data points contribute only noise to the extrapolation, inflating the variance without improving the bias correction.
Rule of thumb: If |E(lambda)| < 2 * sigma(lambda) (the signal-to-noise ratio drops below 2), that scale factor is not useful. Drop it.
Failure Mode 3: Wrong Extrapolator
Using linear extrapolation when the true curve is exponential introduces a systematic bias that does not decrease with more shots. The linear fit overshoots or undershoots the zero-noise intercept depending on the curvature.
Diagnosis: Run at four or more scale factors. Fit both linear and exponential models. Compare the reduced chi-squared (chi^2 / dof) of each fit. If the linear fit has chi^2/dof >> 1, the model is wrong.
import numpy as np
from scipy.optimize import curve_fit
# Suppose you have measurements at these scale factors
scale_factors = np.array([1.0, 3.0, 5.0, 7.0])
measurements = np.array([0.85, 0.62, 0.41, 0.25])
uncertainties = np.array([0.015, 0.020, 0.025, 0.030])
# Linear fit
def linear_model(x, a, b):
return a + b * x
popt_lin, pcov_lin = curve_fit(linear_model, scale_factors, measurements,
sigma=uncertainties, absolute_sigma=True)
residuals_lin = measurements - linear_model(scale_factors, *popt_lin)
chi2_lin = np.sum((residuals_lin / uncertainties) ** 2)
dof_lin = len(measurements) - 2
print(f"Linear fit: E(0) = {popt_lin[0]:.4f}, chi2/dof = {chi2_lin/dof_lin:.2f}")
# Exponential fit
def exp_model(x, a, b, c):
return a + b * np.exp(-c * x)
popt_exp, pcov_exp = curve_fit(exp_model, scale_factors, measurements,
sigma=uncertainties, absolute_sigma=True,
p0=[0, 1, 0.1])
residuals_exp = measurements - exp_model(scale_factors, *popt_exp)
chi2_exp = np.sum((residuals_exp / uncertainties) ** 2)
dof_exp = len(measurements) - 3
print(f"Exp fit: E(0) = {popt_exp[0] + popt_exp[1]:.4f}, chi2/dof = {chi2_exp/dof_exp:.2f}")
# If chi2/dof >> 1 for linear but ~1 for exponential, use exponential
Failure Mode 4: Readout-Dominated Errors
ZNE mitigates gate errors by amplifying gate noise through folding. It does not amplify or correct readout (measurement) errors. If your circuit has high readout error but low gate error, ZNE will not help. In fact, the folded circuits have the same readout error as the original, so the extrapolation “corrects” a noise component (readout) that does not scale with lambda, leading to incorrect results.
Solution: Apply readout error mitigation (TREX, M3, or matrix inversion) before or alongside ZNE. Qiskit Runtime does this automatically at resilience level 2.
Common Mistakes
Mistake 1: ZNE on Exact Simulation
Running mitiq.zne.execute_with_zne with a noiseless simulator (shots=0 or statevector simulation) is pointless. ZNE estimates the zero-noise limit from noisy data. If the data is already noiseless, ZNE returns the same value (at best) or introduces numerical artifacts (at worst). Always test ZNE with a noisy simulator or real hardware.
Mistake 2: Even Scale Factors with Global Gate Folding
Gate folding produces scale factors of 1, 3, 5, 7, … (odd integers). Requesting scale_factors=[1, 2, 4] with fold_global will fail or silently round to achievable values. Use partial folding (fold_gates_at_random) for non-odd-integer scale factors, or stick to odd integers with fold_global.
# Wrong: even scale factors with global folding
# factory = RichardsonFactory(scale_factors=[1.0, 2.0, 4.0])
# zne.execute_with_zne(circuit, executor, factory, scale_noise=fold_global) # Error!
# Correct option 1: odd integers with global folding
factory = RichardsonFactory(scale_factors=[1.0, 3.0, 5.0])
result = zne.execute_with_zne(circuit, executor, factory, scale_noise=fold_global)
# Correct option 2: any scale factors with partial folding
factory = RichardsonFactory(scale_factors=[1.0, 2.0, 3.0])
result = zne.execute_with_zne(circuit, executor, factory, scale_noise=fold_gates_at_random)
Mistake 3: Ignoring Readout Error
Applying ZNE to a circuit where readout error dominates (common on superconducting hardware with 1-5% readout error per qubit) gives misleading results. ZNE cannot correct readout errors because folding does not amplify them.
Fix: Apply TREX or M3 readout mitigation first, then apply ZNE to the readout-corrected expectation values. Qiskit Runtime’s resilience level 2 does this automatically. In Mitiq, you must handle readout correction yourself inside the executor function.
Mistake 4: Not Clipping Extrapolated Values
After extrapolation, the estimated E_0 may fall outside the physical range [-1, +1] for Pauli observables due to shot noise. For example, Richardson extrapolation with noisy data might return E_0 = 1.05. Always clip to physical bounds:
mitigated = zne.execute_with_zne(circuit, executor, factory, scale_noise=fold_gates_at_random)
# Clip to physical range for a Pauli observable
mitigated_clipped = np.clip(mitigated, -1.0, 1.0)
This is especially important when the mitigated value feeds into downstream computations (like VQE energy estimation) that assume physical bounds.
Mitiq vs. Qiskit Runtime ZNE: Comparison Table
| Feature | Mitiq | Qiskit Runtime |
|---|---|---|
| Gate selection control | Full: fold specific gates, specific qubits, or random subsets | Limited: Runtime chooses folding strategy |
| Supported backends | Any (Cirq, Qiskit, Braket, pyQuil via converters) | IBM Quantum hardware only |
| Scale factor flexibility | Arbitrary real numbers via partial folding | Odd integers (gate folding internally) |
| Extrapolation methods | Linear, Richardson, Polynomial, Exponential, custom | Linear, quadratic, cubic, quartic, exponential, double exponential |
| Custom extrapolators | Yes (subclass Factory) | No |
| Integration with other mitigation | PEC, CDR, DDD composable in pipelines | TREX bundled automatically at resilience >= 1 |
| Readout mitigation | Manual (implement in executor) | Automatic (TREX at resilience >= 1) |
| Access to intermediate results | Yes (factory stores all scale factor results) | No (only final mitigated value returned) |
| Ease of use | Moderate (write executor, choose factory and folding) | Low (set resilience_level = 2) |
| Cost model | Pay for each circuit execution (3 scale factors = 3x cost) | Same (3 scale factors = 3x cost), but Runtime may optimize batching |
| Classical overhead | Minimal | Minimal |
| Open source | Yes (Apache 2.0) | Runtime client is open source; server is proprietary |
| Documentation | Extensive tutorials, API docs, and examples | Good API docs, fewer tutorials on ZNE internals |
When to Choose Mitiq
- You need fine-grained control over which gates are folded.
- You work with non-IBM hardware (IonQ, Rigetti, neutral atoms).
- You want to compose ZNE with PEC or CDR in a single pipeline.
- You need access to the intermediate expectation values at each scale factor for diagnostics.
- You are prototyping mitigation strategies in simulation before deploying to hardware.
When to Choose Qiskit Runtime
- You are already using IBM Quantum hardware through Qiskit Runtime.
- You want the simplest possible integration: one option flag activates ZNE with sensible defaults.
- You benefit from Runtime’s backend characterization data for more accurate noise scaling.
- You want TREX readout mitigation bundled automatically with ZNE.
Using Both Together
It is possible to use Mitiq’s ZNE on circuits that you eventually run on IBM hardware. You write the executor to submit jobs through Qiskit Runtime (with resilience_level = 0 to avoid double-mitigation), and let Mitiq handle the folding and extrapolation. This gives you Mitiq’s flexibility with IBM’s hardware access:
from qiskit_ibm_runtime import QiskitRuntimeService, Session, EstimatorV2
from qiskit_ibm_runtime.options import EstimatorOptions
from qiskit.quantum_info import SparsePauliOp
from mitiq import zne
from mitiq.zne.inference import RichardsonFactory
from mitiq.zne.scaling import fold_gates_at_random
def make_ibm_executor(session, observable):
"""Create an executor that runs on IBM hardware via Qiskit Runtime."""
options = EstimatorOptions()
options.resilience_level = 0 # No Runtime mitigation; Mitiq handles ZNE
estimator = EstimatorV2(mode=session, options=options)
def executor(circuit):
job = estimator.run([(circuit, observable)])
result = job.result()
return float(result[0].data.evs)
return executor
# Usage (requires IBM Quantum credentials)
# service = QiskitRuntimeService()
# backend = service.least_busy(operational=True, simulator=False, min_num_qubits=4)
# observable = SparsePauliOp("ZZZZ")
#
# with Session(backend=backend) as session:
# executor = make_ibm_executor(session, observable)
# factory = RichardsonFactory(scale_factors=[1.0, 3.0, 5.0])
# mitigated = zne.execute_with_zne(
# circuit=circuit,
# executor=executor,
# factory=factory,
# scale_noise=fold_gates_at_random,
# )
# print(f"Mitiq ZNE on IBM hardware: {mitigated:.4f}")
Limitations of ZNE
ZNE has fundamental limitations beyond the specific failure modes discussed above.
Circuit depth ceiling: For circuits with more than roughly 50-100 two-qubit gates on current hardware, the noise at scale factor 1 is already severe. Folding to 3x or 5x pushes the signal into the noise floor, making extrapolation impossible. ZNE is most effective on moderate-depth circuits where the unmitigated expectation value is still meaningfully different from zero.
No free lunch on variance: ZNE reduces bias at the cost of increased variance. The Richardson coefficients for [1, 3, 5] are (15/8, -10/8, 3/8), which amplify shot noise by the sum of squared coefficients divided by the number of scale factors. For high-precision results, you need substantially more shots than the unmitigated case.
Assumes gate noise dominates: ZNE corrects gate errors only. Readout errors, crosstalk, and leakage errors are not amplified by gate folding, so they appear as a constant offset that ZNE misinterprets as part of the zero-noise value. Always combine ZNE with readout mitigation.
Model dependence: The extrapolated value depends on the choice of extrapolator. If you use a linear model but the true curve is exponential, the result is biased regardless of how many shots you take. There is no fully model-independent way to extrapolate.
Not composable with all techniques: ZNE and PEC (Probabilistic Error Cancellation) address the same problem (gate errors) with different trade-offs. Applying both simultaneously is redundant and can degrade performance. Choose one based on your noise model knowledge and sampling budget.
Summary
ZNE is the most accessible error mitigation technique available today: it works on any hardware, requires no noise model, and integrates into existing workflows with minimal code changes. The choice between Mitiq and Qiskit Runtime comes down to how much control you need.
Start with Qiskit Runtime’s resilience_level = 2 for quick experiments on IBM hardware. Move to Mitiq when you need custom folding strategies, non-IBM backends, or composable mitigation pipelines. In either case, always validate with the diagnostic sweep (plot E vs. scale factor), choose your extrapolator based on the shape of the noise curve, and budget enough shots to handle the variance amplification.
Was this tutorial helpful?