Error Mitigation on Amazon Braket
Apply zero-noise extrapolation and measurement error mitigation techniques on Amazon Braket using local and density matrix simulators.
The Physics of Noise in NISQ Devices
Every operation on a real quantum processor introduces error. Understanding the specific sources and magnitudes of that error is the first step toward mitigating it.
Gate errors are the dominant source for most circuits. Single-qubit gates on modern superconducting processors achieve fidelities of 99.8-99.95%, meaning a 0.05-0.2% chance of error per gate. Two-qubit gates (CNOT, CZ) are significantly worse, with typical fidelities of 99.0-99.5%. That 0.5-1.0% error per two-qubit gate sounds small, but it compounds multiplicatively across a circuit.
Consider a 4-qubit GHZ state preparation with 3 CNOT gates at 99.5% fidelity each: the circuit fidelity is roughly 0.995^3 = 98.5%. Manageable. Now scale to a 20-qubit circuit with 50 two-qubit gates: 0.995^50 = 77.8%. At 99.0% fidelity per gate, that drops to 0.99^50 = 60.5%. The signal is already severely degraded.
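The compounding arithmetic above is easy to sanity-check numerically; a minimal sketch (the helper name is illustrative, not a library function):

```python
# Sanity-check the compounding-fidelity figures quoted above.
def circuit_fidelity(gate_fidelity, n_gates):
    """Approximate circuit fidelity as the product of per-gate fidelities."""
    return gate_fidelity ** n_gates

print(f"{circuit_fidelity(0.995, 3):.3f}")   # 3 CNOTs at 99.5%   -> ~0.985
print(f"{circuit_fidelity(0.995, 50):.3f}")  # 50 gates at 99.5%  -> ~0.778
print(f"{circuit_fidelity(0.99, 50):.3f}")   # 50 gates at 99.0%  -> ~0.605
```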
Decoherence limits how long qubits hold their quantum state. Two timescales matter:
- T1 (energy relaxation): the time for an excited qubit to decay to the ground state. Typical values range from 50 to 300 microseconds on superconducting hardware, and milliseconds to seconds on trapped-ion devices.
- T2 (dephasing): the time for a superposition to lose its phase coherence. T2 is always bounded by 2*T1, and in practice is often shorter due to low-frequency noise. Typical values: 50-200 microseconds on superconducting qubits.
A circuit that takes 10 microseconds to execute on hardware with T2 = 100 microseconds loses roughly 10% of its coherence to dephasing alone, on top of gate errors.
Readout errors occur during measurement. A qubit in state |0> might be reported as “1” with probability 1-5%, and vice versa. These errors are typically asymmetric: the probability of misreading |1> as “0” (due to T1 decay during readout) is usually higher than misreading |0> as “1”. On current IBM and Rigetti hardware, readout error rates of 1-3% per qubit are common.
The compounding problem: for a circuit with g two-qubit gates at fidelity f, n qubits with readout error r, and circuit duration t relative to T2:
- Gate fidelity contribution: f^g
- Coherence contribution: exp(-t/T2) (approximate)
- Readout contribution: (1-r)^n for a single bitstring
These multiply together. A circuit with 20 two-qubit gates at 99.5%, 4 qubits with 2% readout error, and t/T2 = 0.1 gives roughly 0.995^20 * exp(-0.1) * 0.98^4 = 0.905 * 0.905 * 0.922 = 75.5% overall fidelity. Error mitigation techniques recover useful information from these noisy results.
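The three contributions above combine into a single estimate; a small sketch reproducing the worked example (function name is illustrative):

```python
import math

# Combine the gate, coherence, and readout contributions multiplicatively.
def overall_fidelity(f, g, r, n, t_over_t2):
    """Gate fidelity f over g two-qubit gates, readout error r over n qubits,
    decoherence over a circuit lasting t_over_t2 * T2."""
    return (f ** g) * math.exp(-t_over_t2) * ((1 - r) ** n)

print(f"{overall_fidelity(f=0.995, g=20, r=0.02, n=4, t_over_t2=0.1):.3f}")  # ~0.755
```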
Zero-Noise Extrapolation (ZNE)
ZNE estimates the noiseless expectation value by deliberately amplifying noise and then extrapolating back to the zero-noise limit.
Why ZNE Works: The Mathematical Foundation
The core idea rests on a simple observation: the expectation value of an observable is a smooth function of the noise level. If we write the noise scale as lambda (where lambda=1 is the physical noise level), the measured expectation value can be expanded as:
E(lambda) = E_ideal + c1*lambda + c2*lambda^2 + c3*lambda^3 + ...
At lambda=0, the noise vanishes and we recover the ideal expectation value E_ideal. The problem is that we cannot run at lambda=0 on real hardware. What we can do is run at lambda=1 (base noise), lambda=3, lambda=5, and so on, by artificially amplifying the noise via circuit folding. With measurements at multiple noise levels, we fit a model to E(lambda) and read off the intercept at lambda=0.
The key assumption is that the noise scales predictably with the folding factor. Gate folding (described below) triples, quintuples, etc. the number of noisy gates, so the noise contribution scales roughly linearly with the scale factor when the per-gate error rate is small. This assumption breaks down when gate errors are large (above ~5%) or when the circuit is so deep that decoherence dominates over gate errors.
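To see the extrapolation mechanics in isolation, here is a toy model (synthetic numbers, not a Braket run) where E(lambda) is exactly linear; the fit recovers the zero-noise intercept:

```python
import numpy as np

# Toy noise model: E(lambda) = E_ideal + c1 * lambda, measured without shot noise.
E_ideal, c1 = 1.0, -0.05
lambdas = np.array([1.0, 3.0, 5.0])
measured = E_ideal + c1 * lambdas  # [0.95, 0.85, 0.75]

# Fit a line through the measured points and read off the intercept at lambda = 0
slope, intercept = np.polyfit(lambdas, measured, deg=1)
print(f"Extrapolated E(0) = {intercept:.4f}")  # recovers E_ideal = 1.0
```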
The Three Steps of ZNE
- Run the circuit at the base noise level (scale factor 1) to get E(1).
- Amplify the noise by folding the circuit at scale factors 3, 5, 7, etc. and measure E(3), E(5), E(7).
- Fit a model to the (scale_factor, expectation_value) data and extrapolate to scale factor 0.
Gate Folding vs. Pulse Stretching
Two main approaches exist for amplifying noise:
Gate folding replaces each gate G with the sequence G * G_dag * G. Since G_dag * G = I (the identity), the ideal action of the circuit is unchanged, but the physical noise triples because three gates execute instead of one. You can fold the entire circuit (global folding) or fold individual gates selectively (local folding). Global folding at scale factor s appends (s-1)/2 copies of (circuit_adjoint + circuit) to the original circuit.
Pulse stretching slows the physical control pulses that implement each gate, increasing the time qubits spend interacting with the environment. This amplifies decoherence-type noise more naturally than gate folding. However, pulse stretching requires pulse-level access to the hardware (like IBM’s Qiskit Pulse or OQC’s pulse interface), which Amazon Braket does not currently expose for most devices.
For Braket, gate folding is the standard approach. Within gate folding, partial folding targets only the noisiest gates (typically two-qubit gates). This is more efficient because it amplifies the dominant noise source without unnecessarily inflating circuit depth from already-high-fidelity single-qubit gates.
Setting Up a Noisy Simulation
We use Braket’s density matrix simulator to model depolarizing noise.
from braket.circuits import Circuit, Observable
from braket.devices import LocalSimulator
from braket.circuits.noise import Noise
import numpy as np
# Base circuit: 4-qubit GHZ state with ZZ expectation measurement
def base_circuit():
    circ = Circuit()
    circ.h(0)
    circ.cnot(0, 1)
    circ.cnot(1, 2)
    circ.cnot(2, 3)
    return circ

# Add depolarizing noise to every gate
def add_noise(circuit, p):
    noisy = circuit.copy()
    noisy.apply_gate_noise(Noise.Depolarizing(probability=p))
    return noisy
Implementing Circuit Folding
def fold_circuit(circuit, scale_factor):
    """Fold the entire circuit to amplify noise by scale_factor.

    scale_factor must be an odd integer (1, 3, 5, ...).
    Folding appends (circuit_adjoint + circuit) pairs.
    """
    if scale_factor == 1:
        return circuit
    folded = circuit.copy()
    n_folds = (scale_factor - 1) // 2
    for _ in range(n_folds):
        # Append circuit_adjoint then circuit again
        folded.add_circuit(circuit.adjoint())
        folded.add_circuit(circuit)
    return folded
Verbatim Compilation on Braket
When you submit a circuit to a Braket QPU, the service may transpile it: reordering gates, canceling redundant operations, and remapping qubits. This is helpful for general workloads, but it undermines ZNE. If the compiler cancels the G_dag*G pairs you inserted for folding, the noise amplification disappears.
To prevent this, Braket supports verbatim compilation on certain devices (IonQ Aria, Rigetti Ankaa-class processors, and others). Verbatim mode tells the compiler to execute your circuit exactly as specified.
from braket.aws import AwsDevice
from braket.circuits import Circuit
# Select a device that supports verbatim compilation
device = AwsDevice("arn:aws:braket:us-east-1::device/qpu/rigetti/Ankaa-3")
# Build a folded circuit and wrap it in a verbatim box so the compiler
# executes it exactly as written
folded = fold_circuit(base_circuit(), scale_factor=3)
verbatim = Circuit().add_verbatim_box(folded)
# disable_qubit_rewiring prevents the compiler from remapping logical to physical qubits
# The circuit must use native gates for the target device when using verbatim mode
task = device.run(
    verbatim,
    shots=1000,
    disable_qubit_rewiring=True
)
result = task.result()
When using verbatim mode, your circuit must be expressed in the device’s native gate set. For Rigetti devices, that typically means decomposing into RZ, RX(pi/2), and CZ gates. For IonQ devices, the native gates are GPI, GPI2, and MS. You can check the supported gates with:
# Inspect native gate set
props = device.properties
print(props.paradigm.nativeGateSet)
# Check if verbatim mode is supported
print(props.action.get("braket.ir.openqasm.program"))
For local simulation (which we use throughout this tutorial), verbatim mode is not needed because the simulator does not perform transpilation.
Running ZNE with Linear Extrapolation
device = LocalSimulator("density_matrix")
noise_prob = 0.01 # 1% depolarizing per gate
scale_factors = [1, 3, 5]
expectation_values = []
for sf in scale_factors:
    folded = fold_circuit(base_circuit(), sf)
    noisy = add_noise(folded, noise_prob)
    noisy.expectation(Observable.Z() @ Observable.Z(), target=[0, 3])
    task = device.run(noisy, shots=0)  # shots=0 for exact density matrix
    result = task.result()
    ev = result.values[0]
    expectation_values.append(ev)
    print(f"Scale factor {sf}: <Z0 Z3> = {ev:.6f}")
# Linear extrapolation to zero noise
coeffs = np.polyfit(scale_factors, expectation_values, deg=1)
zne_estimate = coeffs[1] # y-intercept
print(f"\nLinear ZNE estimate: {zne_estimate:.6f}")
print(f"Ideal value (GHZ): 1.000000")
Richardson Extrapolation
Richardson extrapolation fits a polynomial of degree n-1 through n data points, which can capture nonlinear noise scaling more accurately than linear extrapolation. The idea: if you have measurements at three scale factors, you fit a quadratic and evaluate at lambda=0, capturing both the linear and quadratic terms in the noise expansion.
def richardson_extrapolation(scale_factors, values):
    """Richardson extrapolation to the zero-noise limit."""
    n = len(scale_factors)
    # Solve the Vandermonde system V @ coeffs = values
    V = np.vander(scale_factors, increasing=True)[:, :n]
    coeffs = np.linalg.solve(V, values)
    return coeffs[0]  # Zero-noise estimate is the constant term
richardson_estimate = richardson_extrapolation(scale_factors, expectation_values)
print(f"Richardson ZNE estimate: {richardson_estimate:.6f}")
Richardson extrapolation is more accurate than linear extrapolation when the noise-expectation relationship has significant curvature. However, it is also more sensitive to statistical noise in the individual measurements, because it effectively amplifies the differences between closely spaced data points. With noisy shot-based measurements (rather than exact density matrix simulation), Richardson extrapolation requires more shots per scale factor to be reliable.
Exponential Extrapolation
Some noise channels (particularly amplitude damping and dephasing) cause the expectation value to decay exponentially rather than linearly with circuit depth. In those cases, a better model is:
E(lambda) = E_ideal * exp(-a * lambda)
Taking the logarithm converts this to a linear problem: log(E(lambda)) = log(E_ideal) - a*lambda. The intercept gives log(E_ideal).
from scipy.optimize import curve_fit

def exponential_extrapolation(scale_factors, values):
    """Fit E(lambda) = A * exp(-a * lambda) and return A as the zero-noise estimate."""
    # Only works when all values have the same sign
    if any(v <= 0 for v in values):
        raise ValueError(
            "Exponential extrapolation requires all positive expectation values. "
            "Use linear or Richardson extrapolation instead."
        )
    # Log-linear fit: log(E) = log(A) - a * lambda
    log_values = np.log(values)
    coeffs = np.polyfit(scale_factors, log_values, deg=1)
    log_A = coeffs[1]  # y-intercept in log space
    return np.exp(log_A)

# Alternative: nonlinear least squares for better handling of noise
def exponential_extrapolation_nlls(scale_factors, values):
    """Nonlinear least squares fit for E(lambda) = A * exp(-a * lambda)."""
    def model(x, A, a):
        return A * np.exp(-a * x)
    # Initial guess: A ~ first value, a ~ decay rate from the first two points
    a_guess = -np.log(values[1] / values[0]) / (scale_factors[1] - scale_factors[0])
    p0 = [values[0] * np.exp(a_guess * scale_factors[0]), a_guess]
    popt, pcov = curve_fit(model, scale_factors, values, p0=p0)
    return popt[0]  # A is the zero-noise estimate
When to use each extrapolation method:
| Method | Best for | Weakness |
|---|---|---|
| Linear | Low noise, few scale factors | Ignores curvature, biased for strong noise |
| Richardson | Moderate noise, polynomial scaling | Sensitive to shot noise, can overshoot |
| Exponential | Amplitude damping, dephasing-dominated circuits | Fails when expectation values cross zero |
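The table's tradeoffs show up even on clean synthetic data. For an exactly exponential decay, the log-linear fit recovers the ideal value while the linear fit is biased by the curvature (a toy example, not hardware data):

```python
import numpy as np

lambdas = np.array([1.0, 3.0, 5.0])
E_ideal, a = 1.0, 0.1
values = E_ideal * np.exp(-a * lambdas)  # exponentially decaying expectation values

# Linear fit: biased, because it ignores the curvature of the decay
linear_est = np.polyfit(lambdas, values, deg=1)[1]

# Exponential (log-linear) fit: exact for this model
exp_est = np.exp(np.polyfit(lambdas, np.log(values), deg=1)[1])

print(f"linear: {linear_est:.4f}, exponential: {exp_est:.4f}, ideal: {E_ideal}")
```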
Full ZNE Example with Error Analysis
Real hardware measurements have statistical fluctuations from finite shot counts. A production ZNE workflow needs variance estimates and confidence intervals.
from braket.circuits import Circuit, Observable
from braket.devices import LocalSimulator
from braket.circuits.noise import Noise
import numpy as np
def base_circuit():
    circ = Circuit()
    circ.h(0)
    circ.cnot(0, 1)
    circ.cnot(1, 2)
    circ.cnot(2, 3)
    return circ

def fold_circuit(circuit, scale_factor):
    if scale_factor == 1:
        return circuit
    folded = circuit.copy()
    n_folds = (scale_factor - 1) // 2
    for _ in range(n_folds):
        folded.add_circuit(circuit.adjoint())
        folded.add_circuit(circuit)
    return folded

def add_noise(circuit, p):
    noisy = circuit.copy()
    noisy.apply_gate_noise(Noise.Depolarizing(probability=p))
    return noisy
device = LocalSimulator("density_matrix")
noise_prob = 0.01
scale_factors = [1, 3, 5]
n_repetitions = 20 # Repeat each scale factor for variance estimation
shots_per_run = 1000
# Collect results across repetitions
all_results = {sf: [] for sf in scale_factors}
for sf in scale_factors:
    folded = fold_circuit(base_circuit(), sf)
    noisy = add_noise(folded, noise_prob)
    noisy.expectation(Observable.Z() @ Observable.Z(), target=[0, 3])
    for rep in range(n_repetitions):
        task = device.run(noisy, shots=shots_per_run)
        result = task.result()
        ev = result.values[0]
        all_results[sf].append(ev)
# Compute means and standard errors
means = []
std_errors = []
for sf in scale_factors:
    vals = np.array(all_results[sf])
    mean = np.mean(vals)
    se = np.std(vals, ddof=1) / np.sqrt(n_repetitions)
    means.append(mean)
    std_errors.append(se)
    print(f"Scale factor {sf}: <Z0 Z3> = {mean:.6f} +/- {se:.6f}")
# Linear extrapolation with uncertainty propagation. Weight the fit by the
# known standard errors and use cov="unscaled": with only 3 data points,
# np.polyfit cannot also estimate the residual variance (cov=True would raise)
coeffs, cov = np.polyfit(scale_factors, means, deg=1,
                         w=1.0 / np.array(std_errors), cov="unscaled")
zne_linear = coeffs[1]
zne_linear_err = np.sqrt(cov[1][1])
# Richardson extrapolation
V = np.vander(scale_factors, increasing=True)[:, :len(scale_factors)]
V_inv = np.linalg.inv(V)
richardson_coeffs = V_inv @ np.array(means)
zne_richardson = richardson_coeffs[0]
# Propagate uncertainty through the Richardson inversion
se_array = np.array(std_errors)
zne_richardson_err = np.sqrt(np.sum((V_inv[0, :] * se_array) ** 2))
print(f"\nLinear ZNE: {zne_linear:.6f} +/- {zne_linear_err:.6f}")
print(f"Richardson ZNE: {zne_richardson:.6f} +/- {zne_richardson_err:.6f}")
print(f"Ideal value: 1.000000")
# Visualization of the extrapolation curve
print("\nExtrapolation plot (scale factor vs expectation value):")
print(" lambda | E(lambda)")
print(" -------|-----------")
for sf, m, se in zip(scale_factors, means, std_errors):
    bar = "=" * int(max(0, m * 40))
    print(f" {sf:5d} | {m:.4f} +/- {se:.4f} {bar}")
print(f" {0:5d} | {zne_linear:.4f} (linear extrapolation)")
print(f" {0:5d} | {zne_richardson:.4f} (Richardson extrapolation)")
Resource Estimation: Budgeting Shots for ZNE
Before running ZNE on paid QPU time, calculate the total shot budget. With k scale factors and s shots per scale factor, the total cost is k * s shots for a single ZNE estimate. If you also run r repetitions for error bars, the total is k * s * r.
Example budget for the setup above:
| Component | Count |
|---|---|
| Scale factors | 3 (lambda = 1, 3, 5) |
| Shots per run | 1,000 |
| Repetitions | 20 |
| Total shots | 60,000 |
At Braket’s per-shot pricing (the $0.00145 per shot assumed here for Rigetti hardware), 60,000 shots cost roughly 60,000 * $0.00145 = $87, and the 60 tasks add 60 * $0.30 = $18 in per-task fees, for about $105 total. That is for a single observable on a 4-qubit circuit.
Practical guidelines for shot allocation:
- Minimum reliable shots per scale factor: 1,000-4,000 shots for expectation values with precision to ~0.01-0.02. The standard error of a Z-basis expectation value from s shots is roughly 1/sqrt(s), so 1,000 shots gives ~0.03 precision.
- Number of scale factors: 3 is the minimum (supports quadratic fit). 5 scale factors (1, 3, 5, 7, 9) give a more robust fit but cost 67% more. Using more than 5 rarely helps because high scale factors produce nearly random outputs.
- Repetitions for error bars: 10-30 repetitions are sufficient for estimating the standard error of the mean. On simulators with shot noise, 20 repetitions is a reasonable default.
- Diminishing returns: doubling the shots per scale factor only reduces the standard error by a factor of sqrt(2) = 1.41. Investing in more scale factors or better extrapolation models is often more cost-effective than brute-force shot increases.
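The budgeting rules above reduce to two small formulas; a sketch (helper names are illustrative):

```python
import math

def zne_shot_budget(n_scale_factors, shots_per_run, repetitions=1):
    """Total shots for one ZNE estimate (with optional repetitions for error bars)."""
    return n_scale_factors * shots_per_run * repetitions

def z_expectation_std_error(shots):
    """Worst-case standard error of a Z-basis expectation value from `shots` samples."""
    return 1.0 / math.sqrt(shots)

print(zne_shot_budget(3, 1000, repetitions=20))  # 60000, matching the table above
print(f"{z_expectation_std_error(1000):.3f}")    # ~0.032
```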
Measurement Error Mitigation
Readout errors flip measurement outcomes independently of the quantum state. We can characterize them by preparing known computational basis states and measuring the confusion rates.
Full Calibration Matrix
The calibration matrix M has entries M[i][j] = P(measure i | prepared j). For n qubits, this is a 2^n by 2^n matrix. To correct raw measurement probabilities p_raw, we compute p_corrected = M^(-1) * p_raw (using the pseudoinverse to handle numerical stability).
def build_calibration_matrix(device, n_qubits, shots=10000):
    """Measure each computational basis state to build the readout calibration matrix."""
    n_states = 2 ** n_qubits
    cal_matrix = np.zeros((n_states, n_states))
    for state_idx in range(n_states):
        # Prepare |state_idx> by applying X gates. Identity gates ensure every
        # qubit appears in the circuit, so all n bits show up in the measured
        # bitstring (Braket only measures qubits the circuit touches)
        circ = Circuit()
        bits = format(state_idx, f'0{n_qubits}b')
        for q, b in enumerate(bits):
            circ.i(q)
            if b == '1':
                circ.x(q)
        # Add readout noise
        circ.apply_readout_noise(Noise.BitFlip(probability=0.02))
        task = device.run(circ, shots=shots)
        counts = task.result().measurement_counts
        for outcome, count in counts.items():
            outcome_idx = int(outcome, 2)
            cal_matrix[outcome_idx][state_idx] = count / shots
    return cal_matrix
# Build and invert the calibration matrix
sim = LocalSimulator("density_matrix")  # noise instructions require the density matrix simulator
cal_matrix = build_calibration_matrix(sim, n_qubits=4, shots=10000)
cal_inverse = np.linalg.pinv(cal_matrix)
# Apply correction to raw measurement distribution
raw_circuit = base_circuit()
raw_circuit.apply_readout_noise(Noise.BitFlip(probability=0.02))
task = sim.run(raw_circuit, shots=10000)
raw_counts = task.result().measurement_counts
# Convert counts to probability vector
n_states = 16
raw_probs = np.zeros(n_states)
for outcome, count in raw_counts.items():
    raw_probs[int(outcome, 2)] = count / 10000
mitigated_probs = cal_inverse @ raw_probs
# Clip negative values (physical constraint)
mitigated_probs = np.clip(mitigated_probs, 0, None)
mitigated_probs /= mitigated_probs.sum()
print("Raw vs mitigated probabilities for |0000> and |1111>:")
print(f" |0000>: raw={raw_probs[0]:.4f}, mitigated={mitigated_probs[0]:.4f}")
print(f" |1111>: raw={raw_probs[15]:.4f}, mitigated={mitigated_probs[15]:.4f}")
The Scaling Problem: Tensored Mitigation
The full calibration matrix has 2^n rows and 2^n columns. For 10 qubits, that is 1,024 x 1,024, requiring 1,024 calibration circuits. For 20 qubits, it is over a million calibration circuits, which is completely impractical.
The tensored mitigation approach solves this by assuming readout errors are independent across qubits. Instead of one 2^n x 2^n matrix, you build n separate 2x2 matrices (one per qubit) and take the tensor product. This reduces the calibration cost from 2^n circuits to 2n circuits.
Each per-qubit calibration matrix has the form:
M_qubit = [[P(0|0), P(0|1)],
           [P(1|0), P(1|1)]]
where P(measured|prepared) is the probability of measuring a given outcome when the qubit was prepared in a known state.
def build_tensored_calibration(device, n_qubits, shots=10000):
    """Build per-qubit 2x2 calibration matrices.

    Requires only 2*n_qubits calibration circuits instead of 2^n_qubits.
    """
    qubit_matrices = []
    for q in range(n_qubits):
        cal_2x2 = np.zeros((2, 2))
        for prep_state in [0, 1]:
            circ = Circuit()
            # Identity gates keep every qubit in the measured bitstring;
            # all non-target qubits stay in |0>
            for j in range(n_qubits):
                circ.i(j)
            # Prepare the target qubit
            if prep_state == 1:
                circ.x(q)
            # Add readout noise (applies to all measured qubits)
            circ.apply_readout_noise(Noise.BitFlip(probability=0.02))
            task = device.run(circ, shots=shots)
            counts = task.result().measurement_counts
            # Extract the target qubit's measurement statistics
            for outcome, count in counts.items():
                # outcome is a bitstring over all qubits, ordered by qubit index
                measured_bit = int(outcome[q])
                cal_2x2[measured_bit][prep_state] += count / shots
        qubit_matrices.append(cal_2x2)
    return qubit_matrices
def tensored_inverse(qubit_matrices):
    """Compute the inverse of the tensor product of per-qubit calibration matrices.

    The inverse of a tensor product is the tensor product of the inverses.
    """
    inverses = [np.linalg.inv(m) for m in qubit_matrices]
    # Build the full inverse by taking the tensor product
    result = inverses[0]
    for inv_m in inverses[1:]:
        result = np.kron(result, inv_m)
    return result
# Example usage
sim = LocalSimulator("density_matrix")  # density matrix simulator for noise support
qubit_cals = build_tensored_calibration(sim, n_qubits=4, shots=10000)
# Print per-qubit calibration matrices
for q, m in enumerate(qubit_cals):
    print(f"Qubit {q} calibration matrix:")
    print(f"  P(0|0)={m[0,0]:.4f}  P(0|1)={m[0,1]:.4f}")
    print(f"  P(1|0)={m[1,0]:.4f}  P(1|1)={m[1,1]:.4f}")
# Apply tensored correction
tensored_inv = tensored_inverse(qubit_cals)
mitigated_probs_tensored = tensored_inv @ raw_probs
mitigated_probs_tensored = np.clip(mitigated_probs_tensored, 0, None)
mitigated_probs_tensored /= mitigated_probs_tensored.sum()
print("\nTensored mitigation results:")
print(f" |0000>: mitigated={mitigated_probs_tensored[0]:.4f}")
print(f" |1111>: mitigated={mitigated_probs_tensored[15]:.4f}")
The tradeoff: tensored mitigation cannot capture correlated readout errors (where the measurement of qubit 0 affects qubit 1). On most current hardware, correlations are small enough that tensored mitigation performs nearly as well as the full matrix approach, at a fraction of the cost.
Checking for Built-in Readout Mitigation
Some Braket QPU devices offer native readout error mitigation at the hardware or firmware level. You can check whether a device supports this by inspecting its properties:
from braket.aws import AwsDevice
device = AwsDevice("arn:aws:braket:us-east-1::device/qpu/ionq/Aria-1")
# Check device action properties for error mitigation support
action_props = device.properties.action
if "braket.ir.openqasm.program" in action_props:
    program_props = action_props["braket.ir.openqasm.program"]
    print(f"Supported result types: {program_props.supportedResultTypes}")
# IonQ devices support error mitigation through their "debias" option
# Check device-specific capabilities
print(f"Device capabilities: {device.properties.provider}")
When a device provides native readout mitigation, you may still benefit from applying your own, but be careful not to double-correct. If the device has already applied readout correction to the returned counts, applying your calibration matrix on top will introduce bias. Always check the device documentation and provider metadata to understand what corrections are applied automatically.
Combining ZNE and Readout Mitigation
Both techniques address different error sources: ZNE handles gate errors and decoherence, while readout mitigation handles measurement errors. They compose naturally, but the order matters.
Correct order:
- Collect raw measurement counts from the quantum device.
- Apply readout error correction to the raw counts (classical post-processing on the probability distribution).
- Compute the expectation value from the corrected counts.
- Repeat steps 1-3 for each noise scale factor.
- Apply ZNE extrapolation to the corrected expectation values.
Common mistake: computing the expectation value from raw counts first, then trying to correct readout errors on the expectation value. Readout correction operates on the full probability distribution, not on a scalar. Once you collapse the distribution to a single expectation value, the information needed for readout correction is lost.
def expectation_from_counts(counts, observable_qubits, n_qubits, total_shots):
    """Compute <Z...Z> expectation value from measurement counts.

    For Z-basis observables, each bitstring contributes +1 or -1
    based on the parity of the measured bits on the target qubits.
    """
    ev = 0.0
    for bitstring, count in counts.items():
        # Compute parity of the observable qubits
        parity = 0
        for q in observable_qubits:
            parity += int(bitstring[q])
        sign = (-1) ** parity
        ev += sign * count
    return ev / total_shots
def zne_with_readout_correction(
    base_circ, scale_factors, noise_prob, readout_prob,
    cal_inverse, shots=1000
):
    """Full ZNE pipeline with readout error correction at each scale factor."""
    # Noise instructions require the density matrix simulator
    device = LocalSimulator("density_matrix")
    corrected_evs = []
    for sf in scale_factors:
        # Step 1: Fold and add noise
        folded = fold_circuit(base_circ, sf)
        noisy = folded.copy()
        noisy.apply_gate_noise(Noise.Depolarizing(probability=noise_prob))
        noisy.apply_readout_noise(Noise.BitFlip(probability=readout_prob))
        # Step 2: Run and get raw counts
        task = device.run(noisy, shots=shots)
        raw_counts = task.result().measurement_counts
        # Step 3: Convert to probability vector and apply readout correction
        n_qubits = 4
        n_states = 2 ** n_qubits
        raw_probs = np.zeros(n_states)
        for outcome, count in raw_counts.items():
            raw_probs[int(outcome, 2)] = count / shots
        corrected_probs = cal_inverse @ raw_probs
        corrected_probs = np.clip(corrected_probs, 0, None)
        corrected_probs /= corrected_probs.sum()
        # Step 4: Compute expectation value from corrected distribution
        # For <Z0 Z3>: parity of bits 0 and 3
        ev = 0.0
        for state_idx in range(n_states):
            bits = format(state_idx, f'0{n_qubits}b')
            parity = int(bits[0]) + int(bits[3])
            sign = (-1) ** parity
            ev += sign * corrected_probs[state_idx]
        corrected_evs.append(ev)
        print(f"Scale factor {sf}: corrected <Z0 Z3> = {ev:.6f}")
    # Step 5: ZNE extrapolation on corrected values
    coeffs = np.polyfit(scale_factors, corrected_evs, deg=1)
    zne_linear = coeffs[1]
    richardson = richardson_extrapolation(scale_factors, corrected_evs)
    print(f"\nZNE + readout correction (linear): {zne_linear:.6f}")
    print(f"ZNE + readout correction (Richardson): {richardson:.6f}")
    return zne_linear, richardson
# Run the combined pipeline (density matrix simulator for noise support)
sim = LocalSimulator("density_matrix")
cal_matrix = build_calibration_matrix(sim, n_qubits=4, shots=10000)
cal_inv = np.linalg.pinv(cal_matrix)
zne_with_readout_correction(
    base_circuit(),
    scale_factors=[1, 3, 5],
    noise_prob=0.01,
    readout_prob=0.02,
    cal_inverse=cal_inv,
    shots=2000
)
Comparison of Error Mitigation Techniques
| Technique | Shot overhead | Accuracy | Assumptions | Best use case |
|---|---|---|---|---|
| Zero-Noise Extrapolation (ZNE) | k x base shots (k = number of scale factors) | Good for weak-to-moderate noise | Noise scales predictably with folding; per-gate error is small | Expectation value estimation on circuits with moderate depth |
| Measurement Error Mitigation | 2^n or 2n calibration circuits (full vs. tensored) | Excellent for readout errors | Readout errors are stable over time; independent per qubit (for tensored) | Any circuit where readout error is a significant contributor |
| Probabilistic Error Cancellation (PEC) | Exponential in circuit depth: roughly exp(2*epsilon*d) shots, where epsilon is the per-gate error rate and d the depth | Unbiased estimator | Full knowledge of noise model required; noise must be invertible | High-accuracy estimation when noise model is well characterized |
| Symmetry Verification | 2-3x base shots | Moderate | Circuit has known symmetries (e.g., particle number conservation) | Variational algorithms with conserved quantities |
ZNE and readout mitigation are the most accessible techniques because they require no knowledge of the underlying noise model (ZNE) or only simple calibration (readout). PEC provides unbiased estimates but is exponentially expensive. Symmetry verification is cheap but only applicable when the target state has exploitable symmetries.
Common Mistakes in Error Mitigation
1. Folding Too Aggressively
Using large scale factors (7, 9, 11, …) seems like it would give more data for extrapolation, but the signal degrades exponentially. At scale factor 9 with 1% depolarizing noise per gate, a 4-gate circuit effectively has 36 noisy gates, giving a fidelity of roughly 0.99^36 = 69.5%. The expectation value at that noise level is so far from the ideal that it contributes mostly noise, not signal, to the extrapolation fit.
Rule of thumb: stop increasing the scale factor when the expectation value drops below ~30% of its ideal magnitude. Beyond that point, the measurement is dominated by statistical noise rather than the physical signal you are trying to extrapolate.
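The rule of thumb can be automated: drop scale factors once the measured magnitude falls below a fraction of the base-level magnitude (a sketch; the 30% threshold is the heuristic from the text, and the base-level value stands in for the ideal magnitude):

```python
def usable_scale_factors(scale_factors, values, threshold=0.3):
    """Keep only scale factors whose expectation magnitude stays above
    `threshold` times the magnitude at the smallest scale factor."""
    reference = abs(values[0])  # value at the base noise level
    return [
        (sf, v) for sf, v in zip(scale_factors, values)
        if abs(v) >= threshold * reference
    ]

data = usable_scale_factors([1, 3, 5, 7, 9], [0.90, 0.65, 0.40, 0.22, 0.10])
print(data)  # scale factors 7 and 9 fall below 30% of 0.90 and are dropped
```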
2. Assuming Noise Scales Linearly Without Verification
ZNE extrapolation assumes that the noise-expectation curve is smooth and predictable. Before trusting the extrapolated value, plot E(lambda) against lambda. If the curve has kinks, plateaus, or non-monotonic behavior, the extrapolation will be unreliable.
A simple diagnostic: run at 5 or more scale factors and check whether the data is well-fit by your chosen model (linear, quadratic, exponential). Compute the residuals. If the residuals show systematic patterns rather than random scatter, the model is wrong.
# Diagnostic: check residuals of the linear fit
coeffs = np.polyfit(scale_factors, means, deg=1)
fitted = np.polyval(coeffs, scale_factors)
residuals = np.array(means) - fitted
print("Residuals of linear fit:")
for sf, r in zip(scale_factors, residuals):
    print(f"  lambda={sf}: residual={r:.6f}")

# If residuals are large relative to std_errors, the linear model is insufficient
for sf, r, se in zip(scale_factors, residuals, std_errors):
    if abs(r) > 2 * se:
        print(f"  WARNING: residual at lambda={sf} exceeds 2*stderr, "
              f"linear model may be inadequate")
3. Insufficient Calibration Shots
The calibration matrix itself is estimated from finite shots, introducing statistical error. If you use 1,000 shots per calibration state, each entry in the matrix has a standard error of roughly sqrt(p*(1-p)/1000), where p is the true confusion probability. For p=0.02, that is sqrt(0.02*0.98/1000) = 0.0044, meaning the 2% readout error rate is known only to within +/- 0.4%.
This uncertainty propagates through the matrix inversion and into the corrected probabilities. For the full 2^n calibration matrix, matrix inversion amplifies small errors, especially when the matrix is nearly singular (which happens when readout errors are large).
Guideline: use at least 5,000 shots per calibration state, and 10,000+ if high precision is needed. Verify by running the calibration twice and checking that the resulting matrices agree within statistical uncertainty.
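The standard-error arithmetic above, in code (function name is illustrative):

```python
import math

def calibration_entry_std_error(p, shots):
    """Standard error of one calibration-matrix entry: sqrt(p * (1 - p) / shots)."""
    return math.sqrt(p * (1 - p) / shots)

print(f"{calibration_entry_std_error(0.02, 1000):.4f}")   # ~0.0044 at 1,000 shots
print(f"{calibration_entry_std_error(0.02, 10000):.4f}")  # ~0.0014 at 10,000 shots
```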
4. Exceeding Device Coherence Times
Circuit folding at scale factor s multiplies the circuit depth by roughly s. If the original circuit takes 5 microseconds to execute, a scale factor of 7 produces a circuit that takes 35 microseconds. On hardware with T2 = 50 microseconds, the folded circuit loses 50% of its coherence to dephasing alone, violating the assumption that gate errors dominate.
Before running ZNE on hardware, estimate the folded circuit duration and compare it to the device’s T1 and T2 times. If the folded circuit duration exceeds T2/3 at your largest scale factor, the decoherence contribution will distort the extrapolation. In that case, use fewer or smaller scale factors, or switch to partial folding (folding only two-qubit gates) to reduce the depth increase.
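The coherence check described above, as a small helper (illustrative numbers; the T2/3 threshold is the rule stated in the text):

```python
import math

def coherence_check(base_duration_us, scale_factor, t2_us):
    """Return (folded duration, fraction of coherence remaining, passes T2/3 rule)."""
    folded_us = base_duration_us * scale_factor  # duration scales roughly with s
    remaining = math.exp(-folded_us / t2_us)
    return folded_us, remaining, folded_us <= t2_us / 3

print(coherence_check(5, 7, t2_us=50))   # 35 us, ~0.50 coherence left: fails the rule
print(coherence_check(5, 3, t2_us=200))  # 15 us, ~0.93 coherence left: passes
```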
5. Double-Applying Readout Correction
When combining readout mitigation with a device that has native error mitigation enabled, the corrections can conflict. Native mitigation already adjusts the returned counts; applying your calibration matrix on top introduces a systematic bias that pushes probabilities away from the true values.
Always verify what corrections the device applies by default. On Braket, check the device documentation and the device.properties.provider metadata. If native mitigation is enabled and you cannot disable it, skip the calibration matrix step and apply only ZNE.
Key Takeaways
- NISQ errors come from three sources: gate errors (0.5-1% per two-qubit gate), decoherence (T1/T2 timescales), and readout errors (1-5% per qubit). They compound multiplicatively across a circuit.
- ZNE amplifies noise via circuit folding and extrapolates to the noiseless limit, requiring no knowledge of the noise model. The mathematical basis is a Taylor expansion of E(lambda) around lambda=0.
- Gate folding (G to G*G_dag*G) is the standard noise amplification approach on Braket. Use verbatim compilation to prevent the transpiler from canceling your inserted gates.
- Linear, Richardson, and exponential extrapolation each suit different noise regimes. Always verify the fit quality before trusting the extrapolated value.
- Measurement error mitigation corrects readout biases using a calibration matrix. For circuits with more than ~10 qubits, use tensored mitigation (2n calibration circuits instead of 2^n).
- When combining ZNE and readout mitigation, apply readout correction first (on the probability distribution), then compute expectation values, then extrapolate.
- Budget your shots carefully: ZNE with 3 scale factors at 1,000 shots each costs 3,000 shots minimum, and meaningful error bars require 10-30 repetitions per scale factor.
- Avoid common pitfalls: excessive folding that drowns the signal, unverified linearity assumptions, insufficient calibration shots, and circuits that exceed hardware coherence times.