Qiskit vs Mitiq for Zero-Noise Extrapolation: Which to Use

What Is Zero-Noise Extrapolation?

Zero-noise extrapolation (ZNE) is an error mitigation technique that intentionally runs quantum circuits at several controlled noise levels, then extrapolates the results back to the zero-noise limit. Unlike quantum error correction, ZNE requires no ancilla qubits, no encoding overhead, and no detailed noise model. It works entirely at the level of classical post-processing.

The core insight: if you measure the expectation value of an observable at multiple noise strengths, you can fit a curve through those measurements and read off the y-intercept, which corresponds to zero noise.

ZNE addresses two distinct challenges:

Noise scaling: how to amplify noise in a controlled way without changing the logical action of the circuit.
Extrapolation: how to infer the zero-noise value from a small number of noisy data points, each with statistical uncertainty.

Both the Mitiq library (from Unitary Fund) and Qiskit Runtime (from IBM) implement ZNE, but they differ in control granularity, backend integration, and extensibility. This tutorial covers both in detail and helps you choose the right tool for your workflow.

Mathematical Foundation of ZNE

Understanding the math behind ZNE clarifies why the technique works, when it breaks, and how to choose extrapolation methods.

The Noise-Scaled Expectation Value

Let O be the observable you want to measure (for example, the Pauli operator ZZ). On a noiseless quantum computer, the expectation value is:

E_ideal = Tr[rho_ideal * O]

where rho_ideal = |psi><psi| is the pure state produced by the circuit. On real hardware, the state is mixed due to noise. We model the noise strength with a scale factor lambda, where lambda = 1 represents the natural hardware noise. The noisy expectation value is:

E(lambda) = Tr[rho(lambda) * O]

Our goal is to estimate E(0) = E_ideal, the expectation value at zero noise.

Taylor Series Expansion

Assume E(lambda) is analytic (smooth and differentiable). We can expand it as a Taylor series around lambda = 0:

E(lambda) = E_0 + c_1 * lambda + c_2 * lambda^2 + c_3 * lambda^3 + ...

where E_0 = E(0) is the noiseless value we want, and c_1, c_2, c_3, ... are unknown coefficients that depend on the noise channel and observable.

If we measure E(lambda) at several scale factors, we can set up a system of equations and solve for E_0.

Worked Example: Richardson Extrapolation with Three Scale Factors

Suppose we measure at scale factors lambda = 1, 3, 5. Truncating the Taylor series at second order:

E(1) = E_0 + c_1 * 1 + c_2 * 1    ... (equation 1)
E(3) = E_0 + c_1 * 3 + c_2 * 9    ... (equation 2)
E(5) = E_0 + c_1 * 5 + c_2 * 25   ... (equation 3)

This is a system of three equations in three unknowns (E_0, c_1, c_2). Solving by elimination:

Subtract equation 1 from equation 2:

E(3) - E(1) = 2*c_1 + 8*c_2    ... (equation 4)

Subtract equation 2 from equation 3:

E(5) - E(3) = 2*c_1 + 16*c_2   ... (equation 5)

Subtract equation 4 from equation 5:

E(5) - 2*E(3) + E(1) = 8*c_2
=> c_2 = [E(5) - 2*E(3) + E(1)] / 8

From equation 4:

c_1 = [E(3) - E(1) - 8*c_2] / 2
    = [E(3) - E(1) - E(5) + 2*E(3) - E(1)] / 2
    = [3*E(3) - 2*E(1) - E(5)] / 2

Substituting back into equation 1:

E_0 = E(1) - c_1 - c_2
    = E(1) - [3*E(3) - 2*E(1) - E(5)] / 2 - [E(5) - 2*E(3) + E(1)] / 8

Simplifying:

E_0 = [15*E(1) - 10*E(3) + 3*E(5)] / 8

These are the Richardson extrapolation coefficients for scale factors [1, 3, 5]. The coefficients (15/8, -10/8, 3/8) sum to 1, as required for consistency (if noise were zero at all scale factors, E_0 = E).

Numerical Verification

Let’s verify with a concrete example. Suppose the true noise curve is:

E(lambda) = 0.95 - 0.05*lambda - 0.01*lambda^2

so that E_0 = 0.95. Then:

E(1) = 0.95 - 0.05 - 0.01 = 0.89
E(3) = 0.95 - 0.15 - 0.09 = 0.71
E(5) = 0.95 - 0.25 - 0.25 = 0.45

Applying the Richardson formula:

E_0 = (15 * 0.89 - 10 * 0.71 + 3 * 0.45) / 8
    = (13.35 - 7.10 + 1.35) / 8
    = 7.60 / 8
    = 0.95  ✓

The result is exact because the true curve is a second-order polynomial and we used three data points (enough to cancel terms up to order 2). If the true curve has higher-order terms, the Richardson result contains residual bias proportional to those terms.

import numpy as np

# Verify Richardson extrapolation coefficients numerically
scale_factors = np.array([1.0, 3.0, 5.0])

# True noise curve: E(lambda) = 0.95 - 0.05*lambda - 0.01*lambda^2
E_true = lambda lam: 0.95 - 0.05 * lam - 0.01 * lam**2

measurements = np.array([E_true(s) for s in scale_factors])
print(f"E(1) = {measurements[0]:.4f}")
print(f"E(3) = {measurements[1]:.4f}")
print(f"E(5) = {measurements[2]:.4f}")

# Richardson coefficients for [1, 3, 5]
coeffs = np.array([15, -10, 3]) / 8.0
E_0_richardson = np.dot(coeffs, measurements)
print(f"\nRichardson extrapolation: E_0 = {E_0_richardson:.4f}")
print(f"Coefficients sum to: {np.sum(coeffs):.4f}")

# General Richardson coefficients via Vandermonde solve
V = np.vander(scale_factors, increasing=True)
e_0_vec = np.zeros(len(scale_factors))
e_0_vec[0] = 1.0  # We want the constant term
general_coeffs = np.linalg.solve(V.T, e_0_vec)
print(f"General coefficients: {general_coeffs}")
print(f"General E_0 = {np.dot(general_coeffs, measurements):.4f}")

Noise Scaling Methods

Gate Folding

Gate folding replaces each gate G with the sequence G * G† * G. Because G * G† = I (identity), the logical action is unchanged, but the physical noise triples: each of the three gate executions introduces its own error. One fold gives scale factor lambda = 3, two folds give lambda = 5, and so on.

For a circuit with gates G_1, G_2, ..., G_n, global folding at scale factor 3 produces:

G_1 G_1† G_1  G_2 G_2† G_2  ...  G_n G_n† G_n

The total circuit depth triples, but the logical output is identical to the original circuit.

Partial Folding: Global, Left, Right, and Random

Global folding folds every gate in the circuit by the same amount. This is straightforward for odd integer scale factors (1, 3, 5, 7, …), but what about non-integer scale factors like 1.5 or 2.0?

Partial folding solves this by selecting a subset of gates to fold. For scale factor 1.5 on a 10-gate circuit, you fold 5 of the 10 gates once each. The total gate count goes from 10 to 15, giving an effective scale factor of 15/10 = 1.5.

Mitiq provides three partial folding strategies:

fold_global: Folds the entire circuit uniformly. For integer scale factors, every gate is folded the same number of times. For non-integer factors, the circuit is folded globally as many full times as possible, then the remaining gates are folded starting from the end of the circuit.
fold_gates_from_left: Selects gates to fold starting from the beginning of the circuit. Useful when you know the early gates in your circuit are the noisiest (for example, state preparation gates on hardware with warm-up effects).
fold_gates_from_right: Selects gates from the end of the circuit. Since later gates have less time for decoherence to wash out their errors, folding them can be more effective on T1/T2-limited hardware.
fold_gates_at_random: Randomly selects which gates to fold. This avoids systematic bias from always folding the same gates and tends to produce more uniform noise amplification across the circuit. Recommended as a default unless you have reason to prefer a specific ordering.

from mitiq.zne.scaling import (
    fold_global,
    fold_gates_at_random,
    fold_gates_from_left,
    fold_gates_from_right,
)
import cirq

# Build a simple 2-qubit circuit
q0, q1 = cirq.LineQubit.range(2)
circuit = cirq.Circuit([
    cirq.H(q0),
    cirq.CNOT(q0, q1),
    cirq.rz(0.5)(q0),
    cirq.CNOT(q0, q1),
])

print("Original circuit:")
print(circuit)
print(f"Gate count: {len(list(circuit.all_operations()))}")

# Global fold at scale factor 3 (every gate folded once)
folded_global = fold_global(circuit, scale_factor=3.0)
print(f"\nGlobal fold (3x): {len(list(folded_global.all_operations()))} gates")

# Partial fold at scale factor 1.5 (half the gates folded)
folded_random = fold_gates_at_random(circuit, scale_factor=1.5)
print(f"Random fold (1.5x): {len(list(folded_random.all_operations()))} gates")

folded_left = fold_gates_from_left(circuit, scale_factor=1.5)
print(f"Left fold (1.5x): {len(list(folded_left.all_operations()))} gates")

folded_right = fold_gates_from_right(circuit, scale_factor=1.5)
print(f"Right fold (1.5x): {len(list(folded_right.all_operations()))} gates")

When to use each folding strategy:

Strategy	Best for	Drawback
`fold_global`	Uniform noise models	Cannot produce all fractional scale factors cleanly
`fold_gates_at_random`	General-purpose (recommended default)	Non-deterministic, slight variance between runs
`fold_gates_from_left`	Circuits where early gates are noisiest	Biased amplification
`fold_gates_from_right`	Circuits where late gates are noisiest	Biased amplification

Pulse Stretching

On backends that expose pulse-level control (such as IBM backends with the Qiskit Pulse API), you can stretch the duration of microwave pulses that implement gates. A 2x-stretched CNOT pulse takes twice as long, accumulating roughly twice the decoherence error. Pulse stretching has two advantages over gate folding:

It produces continuous scale factors (not limited to odd integers or partial folding approximations).
It amplifies the actual decoherence noise (T1, T2 processes) rather than replicating the full gate error.

The disadvantage is that pulse stretching requires backend-level access and calibration data. Not all hardware platforms support it.

ZNE in Mitiq with Qiskit Circuits

Mitiq supports Qiskit circuits natively through the mitiq.interface.mitiq_qiskit module. This section shows the complete workflow for Qiskit users.

Basic Mitiq-Qiskit Integration

The key requirement is an executor function: a callable that takes a quantum circuit and returns a single float (the expectation value). Mitiq handles the noise scaling and extrapolation; you handle the circuit execution.

import numpy as np
from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator
from qiskit_aer.noise import NoiseModel, depolarizing_error
from qiskit.quantum_info import SparsePauliOp, Statevector

from mitiq import zne
from mitiq.zne.inference import RichardsonFactory, LinearFactory
from mitiq.zne.scaling import fold_gates_at_random


def make_ghz_circuit(n_qubits: int) -> QuantumCircuit:
    """Create an n-qubit GHZ state circuit."""
    qc = QuantumCircuit(n_qubits)
    qc.h(0)
    for i in range(n_qubits - 1):
        qc.cx(i, i + 1)
    return qc


def make_noisy_executor(noise_level: float = 0.01, shots: int = 8192):
    """Return an executor that simulates a Qiskit circuit with depolarizing noise.

    The executor measures the ZZZZ expectation value (all-Z correlator) for a
    4-qubit circuit. Mitiq calls this function at each noise scale factor.
    """
    # Set up noise model: depolarizing error on all 2-qubit gates
    noise_model = NoiseModel()
    error_2q = depolarizing_error(noise_level, 2)
    noise_model.add_all_qubit_quantum_error(error_2q, ["cx"])

    simulator = AerSimulator(noise_model=noise_model)

    def executor(circuit: QuantumCircuit) -> float:
        # Add measurements to the circuit
        circ = circuit.copy()
        circ.measure_all()

        # Transpile and run
        t_circ = transpile(circ, simulator)
        result = simulator.run(t_circ, shots=shots).result()
        counts = result.get_counts()

        # Compute <ZZZZ> expectation value from counts
        n_qubits = circuit.num_qubits
        exp_val = 0.0
        for bitstring, count in counts.items():
            # Parity of the bitstring: +1 if even number of 1s, -1 if odd
            parity = (-1) ** bitstring.count("1")
            exp_val += parity * count
        exp_val /= shots
        return exp_val

    return executor


# Build the circuit
circuit = make_ghz_circuit(4)
print(circuit)

# Create a noisy executor (1% depolarizing noise on CX gates)
executor = make_noisy_executor(noise_level=0.01, shots=8192)

# Run without mitigation
unmitigated = executor(circuit)
print(f"Unmitigated <ZZZZ>: {unmitigated:.4f}")

# Run with ZNE (Richardson extrapolation)
factory = RichardsonFactory(scale_factors=[1.0, 3.0, 5.0])
mitigated = zne.execute_with_zne(
    circuit=circuit,
    executor=executor,
    factory=factory,
    scale_noise=fold_gates_at_random,
)
print(f"ZNE mitigated <ZZZZ>: {mitigated:.4f}")
print(f"Ideal <ZZZZ>: 1.0000")

Important Notes on the Executor

The executor must accept a single QuantumCircuit argument and return a single float. If you need additional parameters (noise level, shots, backend), use a closure or functools.partial.
The circuit Mitiq passes to the executor already has gates folded. Do not add extra folding inside the executor.
The executor should include measurement and classical post-processing. Mitiq does not handle measurements for you.
For Qiskit circuits, Mitiq converts them internally to Cirq for folding, then converts back. This conversion supports standard gates but may fail on custom gates or pulse-level instructions.

ZNE in Qiskit Runtime

Qiskit Runtime integrates ZNE directly into the Estimator primitive. No circuit modification code is needed; Runtime handles noise scaling internally using backend calibration data.

Basic Usage with EstimatorV2

from qiskit_ibm_runtime import QiskitRuntimeService, Session, EstimatorV2 as Estimator
from qiskit_ibm_runtime.options import EstimatorOptions
from qiskit import QuantumCircuit
from qiskit.quantum_info import SparsePauliOp
from qiskit.transpiler.preset_passmanagers import generate_preset_pass_manager

# Build a test circuit
qc = QuantumCircuit(2)
qc.h(0)
qc.cx(0, 1)

observable = SparsePauliOp("ZZ")

service = QiskitRuntimeService()
backend = service.least_busy(operational=True, simulator=False, min_num_qubits=2)

# EstimatorV2 requires ISA circuits: transpile to the backend's
# basis gates and layout, then map the observable to the new layout
pm = generate_preset_pass_manager(backend=backend, optimization_level=1)
isa_qc = pm.run(qc)
isa_observable = observable.apply_layout(isa_qc.layout)

with Session(backend=backend) as session:
    # Resilience level 2 activates ZNE automatically
    options = EstimatorOptions()
    options.resilience_level = 2

    estimator = Estimator(mode=session, options=options)
    job = estimator.run([(isa_qc, isa_observable)])
    result = job.result()
    print(f"Qiskit Runtime ZNE result: {result[0].data.evs:.4f}")

Full ZNE Customization in Qiskit Runtime

The resilience_level = 2 setting activates ZNE with sensible defaults. You can override every ZNE parameter individually for full control.

from qiskit_ibm_runtime import QiskitRuntimeService, Session, EstimatorV2 as Estimator
from qiskit_ibm_runtime.options import EstimatorOptions
from qiskit import QuantumCircuit
from qiskit.quantum_info import SparsePauliOp
from qiskit.transpiler.preset_passmanagers import generate_preset_pass_manager

qc = QuantumCircuit(4)
qc.h(0)
for i in range(3):
    qc.cx(i, i + 1)

observable = SparsePauliOp("ZZZZ")

service = QiskitRuntimeService()
backend = service.least_busy(operational=True, simulator=False, min_num_qubits=4)

# Transpile to an ISA circuit and remap the observable
pm = generate_preset_pass_manager(backend=backend, optimization_level=1)
isa_qc = pm.run(qc)
isa_observable = observable.apply_layout(isa_qc.layout)

with Session(backend=backend) as session:
    options = EstimatorOptions()

    # Resilience level 2 enables ZNE; you can then fine-tune below
    options.resilience_level = 2

    # ZNE-specific settings
    options.resilience.zne_mitigation = True
    options.resilience.zne.noise_factors = [1, 3, 5]

    # Extrapolator options (see ZneOptions in qiskit-ibm-runtime):
    #   "linear"                - first-order polynomial fit
    #   "polynomial_degree_2"   - second-order polynomial fit
    #   "polynomial_degree_3"   - third-order fit (degrees up to 7 supported)
    #   "exponential"           - single exponential decay fit
    #   "double_exponential"    - sum of two exponential decays
    #   "fallback"              - return the lowest-noise-factor data unchanged
    # The default is the sequence ("exponential", "linear")
    options.resilience.zne.extrapolator = "exponential"

    # TREX (Twirled Readout Error eXtinction) is bundled with resilience_level >= 1
    # It handles measurement errors so ZNE only needs to correct gate errors
    options.resilience.measure_mitigation = True

    estimator = Estimator(mode=session, options=options)
    job = estimator.run([(isa_qc, isa_observable)])
    result = job.result()

    print(f"ZNE mitigated <ZZZZ>: {result[0].data.evs:.4f}")
    print(f"Standard error: {result[0].data.stds:.4f}")

Resilience Levels and Their Relationship to ZNE

Level	What it enables	ZNE active?
0	No mitigation	No
1	TREX (readout error mitigation)	No
2	TREX + ZNE	Yes

At level 2, Runtime applies TREX first to correct measurement errors, then applies ZNE to correct gate errors. This layered approach is more effective than either technique alone.

ZNE with SamplerV2: Why It Does Not Apply

SamplerV2 returns probability distributions (raw counts or quasi-probabilities), not expectation values. ZNE is fundamentally an expectation-value technique: it extrapolates a scalar E(lambda) to E(0). There is no well-defined way to extrapolate an entire probability distribution to zero noise because each probability would need independent extrapolation and the results might not sum to 1.

If you need error-mitigated probability distributions, consider:

M3 (Matrix-free Measurement Mitigation): corrects readout errors on the full distribution. Available via mthree or Qiskit Runtime’s TREX.
PEC (Probabilistic Error Cancellation): provides unbiased error mitigation for distributions, but requires a full noise model and has exponential sampling overhead.

For expectation values, use EstimatorV2 with ZNE. For distributions, use SamplerV2 with readout mitigation only.

Extrapolation Methods Compared

Different extrapolators make different assumptions about how noise scales. Choosing the right one matters.

Linear Extrapolation

Assumes E(lambda) = a + b * lambda. Requires only two data points. Has the lowest variance among all extrapolators because it uses the fewest parameters, but it introduces systematic bias if the true noise curve is nonlinear. Best for shallow circuits where the noise is weak and the quadratic correction is negligible.

Richardson Extrapolation

Polynomial interpolation that passes exactly through all data points. With n data points, it cancels noise terms up to order n - 1. Richardson extrapolation is exact when the true E(lambda) is a polynomial of degree n - 1 or less. For higher-order terms, the residual bias shrinks as you add more scale factors, but the variance grows (the coefficients alternate in sign and grow in magnitude).

Polynomial Fit (PolyFactory)

Fits a polynomial of specified degree to possibly more data points than the polynomial order. This is a least-squares fit rather than interpolation, so it smooths out shot noise better than Richardson. Use PolyFactory(scale_factors=[1, 2, 3, 4, 5], order=2) to fit a quadratic through five data points.

Exponential Fit (ExpFactory)

Assumes E(lambda) = A + B * exp(-C * lambda). This functional form is physically motivated: for circuits dominated by T2 decoherence, the expectation value decays exponentially with circuit duration (and therefore with noise scale). Often outperforms polynomial extrapolators on real hardware. Requires an estimate of the asymptote A (the value E(lambda) approaches at infinite noise). For Pauli observables, this is typically 0.

from mitiq.zne.inference import (
    LinearFactory,
    RichardsonFactory,
    PolyFactory,
    ExpFactory,
)

# Linear: 2 points, lowest variance, potential bias
linear = LinearFactory(scale_factors=[1.0, 3.0])

# Richardson: exact polynomial cancellation, n points cancel n-1 orders
richardson = RichardsonFactory(scale_factors=[1.0, 3.0, 5.0])

# Polynomial least-squares: more points than order, smooths shot noise
poly2 = PolyFactory(scale_factors=[1.0, 2.0, 3.0, 4.0, 5.0], order=2)

# Exponential: physically motivated for decoherence-dominated noise
exp_factory = ExpFactory(scale_factors=[1.0, 3.0, 5.0], asymptote=0.0)

Summary Table

Method	Assumption	Min. points	Variance	Bias	Best for
Linear	Linear in lambda	2	Lowest	High if nonlinear	Shallow circuits, quick estimates
Richardson	Polynomial of degree n-1	n	Grows with n	Exact for degree <= n-1	Moderate-depth circuits, when polynomial model holds
PolyFactory	Polynomial least-squares	> order	Medium	Smoothed	Many data points available
ExpFactory	Exponential decay	3	Medium	Low for T2-dominated noise	Real hardware, decoherence-limited circuits

Choosing Scale Factors

The choice of scale factors significantly affects ZNE quality. Here are the key considerations.

Odd Integers for Gate Folding

Each gate fold replaces G with G G† G, tripling the gate count. This means one fold gives scale factor 3, two folds give scale factor 5, and so on. The achievable integer scale factors for gate folding are 1, 3, 5, 7, 9, ... (odd integers). Even integers like 2 or 4 are not achievable with global folding but can be approximated with partial folding.

For simplicity and reproducibility, prefer odd integer scale factors: [1, 3, 5] is a standard choice.

Spanning the Right Range

Scale factors should span a range where the expectation value visibly changes but has not hit the noise floor. If the signal is already zero at scale factor 3, using [1, 3, 5] gives you two noise-floor measurements that carry no useful information.

Diagnostic procedure: Run at [1, 2, 3, 4, 5] (using partial folding for even values) and plot E vs. scale factor. Identify the scale factor where the signal approaches zero or the noise floor. Only use scale factors below that threshold.

import numpy as np
import matplotlib.pyplot as plt
from mitiq import zne
from mitiq.zne.scaling import fold_gates_at_random

# Diagnostic: sweep scale factors to find useful range
test_scale_factors = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
expectations = []

for sf in test_scale_factors:
    # Fold the circuit and measure (using your executor)
    folded = fold_gates_at_random(circuit, scale_factor=sf)
    exp_val = executor(folded)  # your noisy executor
    expectations.append(exp_val)

plt.figure(figsize=(8, 5))
plt.plot(test_scale_factors, expectations, "o-", label="Measured E(lambda)")
plt.axhline(y=0, color="gray", linestyle="--", alpha=0.5, label="Noise floor")
plt.xlabel("Scale factor (lambda)")
plt.ylabel("Expectation value")
plt.title("ZNE Diagnostic: Signal vs. Noise Scale")
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

# Rule of thumb: use scale factors where E(lambda) is still above
# ~20% of the unmitigated value E(1)
threshold = 0.2 * expectations[0]
usable = [sf for sf, e in zip(test_scale_factors, expectations) if abs(e) > threshold]
print(f"Usable scale factors: {usable}")

Pulse Stretching: Continuous Values

For pulse stretching, you are not limited to odd integers. Any positive real number works as a scale factor, including [1.0, 1.5, 2.0, 2.5, 3.0]. Continuous scale factors allow finer sampling of the noise curve and better fits.

Statistical Variance of ZNE

ZNE amplifies both the signal and the statistical noise. Understanding this trade-off is essential for setting shot budgets.

Shot Noise at Each Scale Factor

For a Pauli observable with expectation value E, the standard deviation of the sample mean from N shots is:

sigma = sqrt((1 - E^2) / N)

At higher noise levels, |E| decreases, so 1 - E^2 increases, and the shot noise grows. For example, with N = 1000 shots:

Scale factor	E(lambda)	sigma
1 (unmitigated)	0.90	sqrt(1 - 0.81) / sqrt(1000) = 0.0138
3	0.70	sqrt(1 - 0.49) / sqrt(1000) = 0.0226
5	0.45	sqrt(1 - 0.2025) / sqrt(1000) = 0.0282

The shot noise at scale factor 5 is roughly twice the shot noise at scale factor 1.

Uncertainty Propagation for Linear ZNE

For linear extrapolation with two data points at scale factors lambda_1 = 1 and lambda_2 = 3:

E_0 = (lambda_2 * E(lambda_1) - lambda_1 * E(lambda_2)) / (lambda_2 - lambda_1)
    = (3 * E(1) - 1 * E(3)) / 2

The variance of the extrapolated value is:

Var(E_0) = (lambda_2 / (lambda_2 - lambda_1))^2 * Var(E(lambda_1))
         + (lambda_1 / (lambda_2 - lambda_1))^2 * Var(E(lambda_2))
         = (3/2)^2 * sigma_1^2 + (1/2)^2 * sigma_2^2
         = 2.25 * sigma_1^2 + 0.25 * sigma_2^2

Plugging in the values from the table above:

Var(E_0) = 2.25 * (0.0138)^2 + 0.25 * (0.0226)^2
         = 2.25 * 0.000190 + 0.25 * 0.000511
         = 0.000428 + 0.000128
         = 0.000556

sigma(E_0) = sqrt(0.000556) = 0.0236

Compare this to the unmitigated standard deviation of 0.0138. ZNE reduces the bias (from ~0.10 to ~0.01) but increases the variance by a factor of ~1.7. This is the fundamental bias-variance trade-off of ZNE.

Practical Implication: Shot Budgets

To maintain the same statistical precision after ZNE, you need more shots. A rough rule: if the extrapolation coefficients amplify the variance by factor k, you need k times as many shots per circuit execution. For Richardson extrapolation with [1, 3, 5], the variance amplification is typically 3x to 10x, depending on the coefficient magnitudes.

import numpy as np

def zne_variance_amplification(scale_factors, noise_curve, n_shots):
    """Compute the variance amplification factor for Richardson ZNE.

    Args:
        scale_factors: List of noise scale factors.
        noise_curve: Callable, E(lambda).
        n_shots: Number of shots per circuit.

    Returns:
        Ratio of ZNE variance to unmitigated variance.
    """
    # Richardson coefficients via Vandermonde system
    n = len(scale_factors)
    V = np.vander(scale_factors, increasing=True)
    e_0_vec = np.zeros(n)
    e_0_vec[0] = 1.0
    coeffs = np.linalg.solve(V.T, e_0_vec)

    # Variance at each scale factor
    variances = []
    for sf in scale_factors:
        E = noise_curve(sf)
        var = (1 - E**2) / n_shots
        variances.append(var)

    # Propagated variance
    zne_var = sum(c**2 * v for c, v in zip(coeffs, variances))

    # Unmitigated variance
    E1 = noise_curve(scale_factors[0])
    unmitigated_var = (1 - E1**2) / n_shots

    return zne_var / unmitigated_var

# Example
noise_curve = lambda lam: 0.95 * np.exp(-0.1 * lam)
amplification = zne_variance_amplification([1, 3, 5], noise_curve, 1000)
print(f"Variance amplification factor: {amplification:.1f}x")
print(f"Need ~{amplification:.0f}x more shots to match unmitigated precision")

Benchmark: ZNE on a 4-Qubit GHZ State

This section provides a complete, runnable benchmark comparing unmitigated results, linear ZNE, and Richardson ZNE on a concrete circuit.

Setup

Circuit: 4-qubit GHZ state (H on qubit 0, then CNOT chain)
Observable: ZZZZ (all-Z correlator)
Ideal value: +1.0 (the GHZ state (|0000> + |1111>) / sqrt(2) has <ZZZZ> = 1)
Noise model: 10% depolarizing noise on every CX gate, deliberately large so the mitigation effect is easy to see in a short simulation
Shots: 8192 per circuit execution

import numpy as np
from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator
from qiskit_aer.noise import NoiseModel, depolarizing_error
from mitiq import zne
from mitiq.zne.inference import RichardsonFactory, LinearFactory
from mitiq.zne.scaling import fold_gates_at_random


def make_ghz4() -> QuantumCircuit:
    """4-qubit GHZ state circuit."""
    qc = QuantumCircuit(4)
    qc.h(0)
    qc.cx(0, 1)
    qc.cx(1, 2)
    qc.cx(2, 3)
    return qc


def make_executor(noise_level: float, shots: int):
    """Create an executor that measures <ZZZZ> on a noisy simulator."""
    noise_model = NoiseModel()
    noise_model.add_all_qubit_quantum_error(
        depolarizing_error(noise_level, 2), ["cx"]
    )
    sim = AerSimulator(noise_model=noise_model)

    def executor(circuit: QuantumCircuit) -> float:
        circ = circuit.copy()
        circ.measure_all()
        t_circ = transpile(circ, sim)
        result = sim.run(t_circ, shots=shots).result()
        counts = result.get_counts()
        exp_val = 0.0
        for bitstring, count in counts.items():
            parity = (-1) ** bitstring.count("1")
            exp_val += parity * count
        return exp_val / shots

    return executor


# Configuration
noise_level = 0.1
shots = 8192
circuit = make_ghz4()
executor = make_executor(noise_level, shots)

# 1. Unmitigated
unmitigated = executor(circuit)
print(f"Unmitigated <ZZZZ>:       {unmitigated:.4f}  (error: {abs(1.0 - unmitigated):.4f})")

# 2. ZNE with linear extrapolation
linear_factory = LinearFactory(scale_factors=[1.0, 3.0])
mitigated_linear = zne.execute_with_zne(
    circuit=circuit,
    executor=executor,
    factory=linear_factory,
    scale_noise=fold_gates_at_random,
)
print(f"ZNE (linear) <ZZZZ>:      {mitigated_linear:.4f}  (error: {abs(1.0 - mitigated_linear):.4f})")

# 3. ZNE with Richardson extrapolation
richardson_factory = RichardsonFactory(scale_factors=[1.0, 3.0, 5.0])
mitigated_richardson = zne.execute_with_zne(
    circuit=circuit,
    executor=executor,
    factory=richardson_factory,
    scale_noise=fold_gates_at_random,
)
print(f"ZNE (Richardson) <ZZZZ>:  {mitigated_richardson:.4f}  (error: {abs(1.0 - mitigated_richardson):.4f})")

print(f"\nIdeal <ZZZZ>:             1.0000")

Expected Results

With 10% depolarizing noise on each of the three CX gates, typical results are:

Method	`<ZZZZ>`	Residual error
Unmitigated	~0.72	~0.28
Linear ZNE	~0.94	~0.06
Richardson ZNE	~0.97	~0.03

Richardson outperforms linear because the true noise curve has significant quadratic contribution at 10% per CX. The residual error in Richardson comes from higher-order terms and shot noise.

Clifford Data Regression (CDR): A Complement to ZNE

Mitiq implements Clifford Data Regression (CDR), a technique that complements ZNE by learning a noise model from “training” circuits whose ideal outputs are efficiently computable.

How CDR Works

Generate training circuits: Replace non-Clifford gates (like T and Rz(theta)) in your target circuit with nearby Clifford gates. The resulting “near-Clifford” circuits can be simulated classically in polynomial time.
Collect training data: Run each training circuit on the noisy hardware to get noisy expectation values. Also compute the ideal expectation values classically.
Fit a correction model: Learn a linear (or nonlinear) mapping from noisy to ideal expectation values using the training data.
Apply to target: Run the original circuit on noisy hardware, then apply the learned correction.

CDR in Mitiq

from mitiq import cdr
from qiskit import QuantumCircuit
import numpy as np


def ideal_executor(circuit: QuantumCircuit) -> float:
    """Compute the ideal (noiseless) expectation value via statevector simulation."""
    from qiskit_aer import AerSimulator

    sim = AerSimulator(method="statevector")
    circ = circuit.copy()
    circ.save_statevector()
    result = sim.run(circ).result()
    sv = result.get_statevector()

    # Compute <ZZZZ> from statevector
    n_qubits = circuit.num_qubits
    probs = np.abs(sv.data) ** 2
    exp_val = 0.0
    for i, p in enumerate(probs):
        bitstring = format(i, f"0{n_qubits}b")
        parity = (-1) ** bitstring.count("1")
        exp_val += parity * p
    return exp_val


# noisy_executor is the same as in the ZNE example above
# (accepts a QuantumCircuit, returns a float)

# Run CDR
mitigated_cdr = cdr.execute_with_cdr(
    circuit=circuit,
    executor=executor,  # noisy executor from earlier
    simulator=ideal_executor,  # noiseless simulation for training data
    num_training_circuits=20,
    fraction_non_clifford=0.1,  # Replace 10% of non-Clifford gates
)
print(f"CDR mitigated <ZZZZ>: {mitigated_cdr:.4f}")

When CDR Beats ZNE

Scenario	ZNE	CDR
Noise is predominantly incoherent (depolarizing, T1/T2)	Good	Good
Noise has coherent components (systematic over/under-rotation)	Poor (non-monotonic)	Better (learns the systematic pattern)
Circuit has few non-Clifford gates	Overkill	Ideal (many training circuits available)
Circuit is all Clifford gates	Works but unnecessary	Cannot generate training data (no gates to replace)
No classical simulator available	Works	Cannot work (needs ideal executor)

CDR requires a classical simulator for the training circuits. For circuits with many non-Clifford gates, generating good training circuits becomes harder because the replacement changes the circuit significantly.

When ZNE Fails: Concrete Failure Modes

ZNE is not a universal fix. Knowing when it fails saves you from silently getting wrong answers.

Failure Mode 1: Non-Monotonic Noise

ZNE assumes that increasing the noise scale factor always pushes E(lambda) closer to the noise floor (typically 0 for Pauli observables). This assumption breaks when coherent errors are present.

Coherent errors (systematic gate over-rotations, for example) accumulate deterministically. Folding a gate three times triples the coherent error angle, which can cause the expectation value to oscillate rather than decay monotonically.

Example: Suppose a CX gate has a coherent Z-rotation error of angle theta. After folding to scale factor 3, the accumulated angle is 3*theta. If 3*theta crosses pi, the expectation value flips sign, and the extrapolation curve is non-monotonic.

Diagnosis: Plot E(lambda) vs. lambda for several scale factors. If the curve is not monotonically decreasing (in absolute value), ZNE extrapolation will be unreliable.

import numpy as np

# Simulated non-monotonic noise curve (coherent error)
def coherent_noise_curve(lam):
    """E(lambda) with both coherent and incoherent components."""
    incoherent = 0.95 * np.exp(-0.05 * lam)  # Decays monotonically
    coherent = np.cos(0.3 * lam)               # Oscillates
    return incoherent * coherent

lambdas = np.linspace(1, 7, 50)
values = [coherent_noise_curve(l) for l in lambdas]

# Check for non-monotonicity
diffs = np.diff(values)
sign_changes = np.sum(np.diff(np.sign(diffs)) != 0)
print(f"Sign changes in E(lambda): {sign_changes}")
if sign_changes > 0:
    print("WARNING: Non-monotonic noise curve detected. ZNE may be unreliable.")

Failure Mode 2: Overfolding

At high scale factors (7x, 9x, or above), the circuit noise may be so large that the expectation value is indistinguishable from zero within shot noise. These data points contribute only noise to the extrapolation, inflating the variance without improving the bias correction.

Rule of thumb: If |E(lambda)| < 2 * sigma(lambda) (the signal-to-noise ratio drops below 2), that scale factor is not useful. Drop it.

Failure Mode 3: Wrong Extrapolator

Using linear extrapolation when the true curve is exponential introduces a systematic bias that does not decrease with more shots. The linear fit overshoots or undershoots the zero-noise intercept depending on the curvature.

Diagnosis: Run at four or more scale factors. Fit both linear and exponential models. Compare the reduced chi-squared (chi^2 / dof) of each fit. If the linear fit has chi^2/dof >> 1, the model is wrong.

import numpy as np
from scipy.optimize import curve_fit

# Suppose you have measurements at these scale factors
scale_factors = np.array([1.0, 3.0, 5.0, 7.0])
measurements = np.array([0.85, 0.62, 0.41, 0.25])
uncertainties = np.array([0.015, 0.020, 0.025, 0.030])

# Linear fit
def linear_model(x, a, b):
    return a + b * x

popt_lin, pcov_lin = curve_fit(linear_model, scale_factors, measurements,
                                sigma=uncertainties, absolute_sigma=True)
residuals_lin = measurements - linear_model(scale_factors, *popt_lin)
chi2_lin = np.sum((residuals_lin / uncertainties) ** 2)
dof_lin = len(measurements) - 2
print(f"Linear fit: E(0) = {popt_lin[0]:.4f}, chi2/dof = {chi2_lin/dof_lin:.2f}")

# Exponential fit
def exp_model(x, a, b, c):
    return a + b * np.exp(-c * x)

popt_exp, pcov_exp = curve_fit(exp_model, scale_factors, measurements,
                                sigma=uncertainties, absolute_sigma=True,
                                p0=[0, 1, 0.1])
residuals_exp = measurements - exp_model(scale_factors, *popt_exp)
chi2_exp = np.sum((residuals_exp / uncertainties) ** 2)
dof_exp = len(measurements) - 3
print(f"Exp fit:    E(0) = {popt_exp[0] + popt_exp[1]:.4f}, chi2/dof = {chi2_exp/dof_exp:.2f}")

# If chi2/dof >> 1 for linear but ~1 for exponential, use exponential

Failure Mode 4: Readout-Dominated Errors

ZNE mitigates gate errors by amplifying gate noise through folding. It does not amplify or correct readout (measurement) errors. If your circuit has high readout error but low gate error, ZNE will not help. In fact, the folded circuits have the same readout error as the original, so the extrapolation “corrects” a noise component (readout) that does not scale with lambda, leading to incorrect results.

Solution: Apply readout error mitigation (TREX, M3, or matrix inversion) before or alongside ZNE. Qiskit Runtime does this automatically at resilience level 2.

Common Mistakes

Mistake 1: ZNE on Exact Simulation

Running mitiq.zne.execute_with_zne with a noiseless simulator (shots=0 or statevector simulation) is pointless. ZNE estimates the zero-noise limit from noisy data. If the data is already noiseless, ZNE returns the same value (at best) or introduces numerical artifacts (at worst). Always test ZNE with a noisy simulator or real hardware.

Mistake 2: Expecting Uniform Amplification at Non-Odd Scale Factors

Whole-circuit gate folding produces exact, uniform noise amplification only at odd integer scale factors (1, 3, 5, 7, …). For other values like 2 or 4, Mitiq’s folding functions (including fold_global) fall back to folding only a subset of gates, so the amplification is approximate and less uniform across the circuit. The call succeeds, but the effective noise scaling may deviate from the requested factor.

# Approximate: non-odd scale factors force partial folding
# factory = RichardsonFactory(scale_factors=[1.0, 2.0, 4.0])

# Cleanest option: odd integers with global folding
factory = RichardsonFactory(scale_factors=[1.0, 3.0, 5.0])
result = zne.execute_with_zne(circuit, executor, factory, scale_noise=fold_global)

# Intermediate scale factors: use randomized partial folding
factory = RichardsonFactory(scale_factors=[1.0, 2.0, 3.0])
result = zne.execute_with_zne(circuit, executor, factory, scale_noise=fold_gates_at_random)

Mistake 3: Ignoring Readout Error

Applying ZNE to a circuit where readout error dominates (common on superconducting hardware with 1-5% readout error per qubit) gives misleading results. ZNE cannot correct readout errors because folding does not amplify them.

Fix: Apply TREX or M3 readout mitigation first, then apply ZNE to the readout-corrected expectation values. Qiskit Runtime’s resilience level 2 does this automatically. In Mitiq, you must handle readout correction yourself inside the executor function.

Mistake 4: Not Clipping Extrapolated Values

After extrapolation, the estimated E_0 may fall outside the physical range [-1, +1] for Pauli observables due to shot noise. For example, Richardson extrapolation with noisy data might return E_0 = 1.05. Always clip to physical bounds:

mitigated = zne.execute_with_zne(circuit, executor, factory, scale_noise=fold_gates_at_random)

# Clip to physical range for a Pauli observable
mitigated_clipped = np.clip(mitigated, -1.0, 1.0)

This is especially important when the mitigated value feeds into downstream computations (like VQE energy estimation) that assume physical bounds.

Mitiq vs. Qiskit Runtime ZNE: Comparison Table

Feature	Mitiq	Qiskit Runtime
Gate selection control	Full: fold specific gates, specific qubits, or random subsets	Limited: Runtime chooses folding strategy
Supported backends	Any (Cirq, Qiskit, Braket, pyQuil via converters)	IBM Quantum hardware only
Scale factor flexibility	Arbitrary real numbers via partial folding	Odd integers (gate folding internally)
Extrapolation methods	Linear, Richardson, Polynomial, Exponential, custom	Linear, quadratic, cubic, quartic, exponential, double exponential
Custom extrapolators	Yes (subclass `Factory`)	No
Integration with other mitigation	PEC, CDR, DDD composable in pipelines	TREX bundled automatically at resilience >= 1
Readout mitigation	Manual (implement in executor)	Automatic (TREX at resilience >= 1)
Access to intermediate results	Yes (factory stores all scale factor results)	No (only final mitigated value returned)
Ease of use	Moderate (write executor, choose factory and folding)	Low (set resilience_level = 2)
Cost model	Pay for each circuit execution (3 scale factors = 3x cost)	Same (3 scale factors = 3x cost), but Runtime may optimize batching
Classical overhead	Minimal	Minimal
Open source	Yes (Apache 2.0)	Runtime client is open source; server is proprietary
Documentation	Extensive tutorials, API docs, and examples	Good API docs, fewer tutorials on ZNE internals

When to Choose Mitiq

You need fine-grained control over which gates are folded.
You work with non-IBM hardware (IonQ, Rigetti, neutral atoms).
You want to compose ZNE with PEC or CDR in a single pipeline.
You need access to the intermediate expectation values at each scale factor for diagnostics.
You are prototyping mitigation strategies in simulation before deploying to hardware.

When to Choose Qiskit Runtime

You are already using IBM Quantum hardware through Qiskit Runtime.
You want the simplest possible integration: one option flag activates ZNE with sensible defaults.
You benefit from Runtime’s backend characterization data for more accurate noise scaling.
You want TREX readout mitigation bundled automatically with ZNE.

Using Both Together

It is possible to use Mitiq’s ZNE on circuits that you eventually run on IBM hardware. You write the executor to submit jobs through Qiskit Runtime (with resilience_level = 0 to avoid double-mitigation), and let Mitiq handle the folding and extrapolation. This gives you Mitiq’s flexibility with IBM’s hardware access:

from qiskit_ibm_runtime import QiskitRuntimeService, Session, EstimatorV2
from qiskit_ibm_runtime.options import EstimatorOptions
from qiskit.quantum_info import SparsePauliOp
from mitiq import zne
from mitiq.zne.inference import RichardsonFactory
from mitiq.zne.scaling import fold_gates_at_random


def make_ibm_executor(session, observable):
    """Create an executor that runs on IBM hardware via Qiskit Runtime."""
    options = EstimatorOptions()
    options.resilience_level = 0  # No Runtime mitigation; Mitiq handles ZNE

    estimator = EstimatorV2(mode=session, options=options)

    def executor(circuit):
        job = estimator.run([(circuit, observable)])
        result = job.result()
        return float(result[0].data.evs)

    return executor


# Usage (requires IBM Quantum credentials)
# service = QiskitRuntimeService()
# backend = service.least_busy(operational=True, simulator=False, min_num_qubits=4)
# observable = SparsePauliOp("ZZZZ")
#
# with Session(backend=backend) as session:
#     executor = make_ibm_executor(session, observable)
#     factory = RichardsonFactory(scale_factors=[1.0, 3.0, 5.0])
#     mitigated = zne.execute_with_zne(
#         circuit=circuit,
#         executor=executor,
#         factory=factory,
#         scale_noise=fold_gates_at_random,
#     )
#     print(f"Mitiq ZNE on IBM hardware: {mitigated:.4f}")

Limitations of ZNE

ZNE has fundamental limitations beyond the specific failure modes discussed above.

Circuit depth ceiling: For circuits with more than roughly 50-100 two-qubit gates on current hardware, the noise at scale factor 1 is already severe. Folding to 3x or 5x pushes the signal into the noise floor, making extrapolation impossible. ZNE is most effective on moderate-depth circuits where the unmitigated expectation value is still meaningfully different from zero.

No free lunch on variance: ZNE reduces bias at the cost of increased variance. The Richardson coefficients for [1, 3, 5] are (15/8, -10/8, 3/8), which amplify shot noise by the sum of squared coefficients divided by the number of scale factors. For high-precision results, you need substantially more shots than the unmitigated case.

Assumes gate noise dominates: ZNE corrects gate errors only. Readout errors, crosstalk, and leakage errors are not amplified by gate folding, so they appear as a constant offset that ZNE misinterprets as part of the zero-noise value. Always combine ZNE with readout mitigation.

Model dependence: The extrapolated value depends on the choice of extrapolator. If you use a linear model but the true curve is exponential, the result is biased regardless of how many shots you take. There is no fully model-independent way to extrapolate.

Not composable with all techniques: ZNE and PEC (Probabilistic Error Cancellation) address the same problem (gate errors) with different trade-offs. Applying both simultaneously is redundant and can degrade performance. Choose one based on your noise model knowledge and sampling budget.

Summary

ZNE is the most accessible error mitigation technique available today: it works on any hardware, requires no noise model, and integrates into existing workflows with minimal code changes. The choice between Mitiq and Qiskit Runtime comes down to how much control you need.

Start with Qiskit Runtime’s resilience_level = 2 for quick experiments on IBM hardware. Move to Mitiq when you need custom folding strategies, non-IBM backends, or composable mitigation pipelines. In either case, always validate with the diagnostic sweep (plot E vs. scale factor), choose your extrapolator based on the shape of the noise curve, and budget enough shots to handle the variance amplification.