Probabilistic Error Cancellation with Mitiq
Learn how probabilistic error cancellation (PEC) works as a powerful generalization of zero-noise extrapolation, and how to implement it with Mitiq to extract near-ideal expectation values from noisy quantum hardware.
Circuit diagrams
Probabilistic error cancellation (PEC) is one of the most theoretically rigorous error mitigation techniques available today. While zero-noise extrapolation (ZNE) fits a curve through results obtained at different noise levels and extrapolates back to zero noise, PEC takes a fundamentally different approach: it directly inverts the noise channel by representing ideal gates as quasi-probability distributions over noisy operations. The result is an unbiased estimator of the ideal expectation value, meaning the estimate converges to the true answer given enough samples.
This tutorial covers the theory, implementation, and practical considerations for PEC using the Mitiq library. By the end, you will understand why negative probabilities appear, how to construct gate decompositions for realistic noise, and how to combine PEC with other mitigation strategies for deeper circuits.
Quasi-Probability Distributions: Building Intuition
Before diving into quantum circuits, let us build intuition for what a “quasi-probability distribution” actually means.
A classical probability distribution assigns non-negative weights to outcomes, and those weights sum to 1. For example, a fair coin has P(H) = 0.5 and P(T) = 0.5. You can sample from this distribution, and the sample average of any function f converges to the expected value.
A quasi-probability distribution relaxes the non-negativity constraint. The weights still sum to 1, but individual weights can be negative. Consider a quasi-probability distribution over two outcomes A and B:
You cannot directly “sample a negative probability,” but you can construct an unbiased estimator as follows:
- Normalize the absolute values to form a valid probability distribution: , .
- Draw a sample from this distribution.
- Multiply the measured function value by the sign of the corresponding quasi-probability weight and by the one-norm .
If you draw outcome A, the estimator returns . If you draw outcome B, the estimator returns .
The expected value of this estimator is:
This is exactly the quasi-probability weighted sum we wanted. The cost is increased variance: the factor of inflates the variance of each sample, so you need more samples to achieve the same precision.
Applying this to quantum gates: In PEC, the ideal (noise-free) CNOT gate is expressed as a quasi-probability combination of noisy operations: the noisy CNOT itself, plus the noisy CNOT followed by various Pauli corrections. Some of these coefficients are negative. The sampling procedure above is exactly how PEC recovers the ideal expectation value from noisy circuit executions.
import numpy as np
# Demonstrate quasi-probability sampling for a toy example
rng = np.random.default_rng(42)
quasi_probs = {"A": 1.5, "B": -0.5}
gamma = sum(abs(v) for v in quasi_probs.values()) # one-norm = 2.0
# Normalized sampling distribution
sampling_probs = {k: abs(v) / gamma for k, v in quasi_probs.items()}
# Function values for each outcome
f_values = {"A": 3.0, "B": 7.0}
# True quasi-probability weighted value
true_value = sum(quasi_probs[k] * f_values[k] for k in quasi_probs)
print(f"True quasi-probability weighted value: {true_value}") # 1.5*3 - 0.5*7 = 1.0
# Monte Carlo estimation via quasi-probability sampling
num_samples = 10_000
estimates = []
for _ in range(num_samples):
outcome = rng.choice(list(quasi_probs.keys()), p=list(sampling_probs.values()))
sign = np.sign(quasi_probs[outcome])
estimates.append(sign * gamma * f_values[outcome])
mc_estimate = np.mean(estimates)
mc_std = np.std(estimates) / np.sqrt(num_samples)
print(f"Monte Carlo estimate: {mc_estimate:.4f} +/- {mc_std:.4f}")
# Output: approximately 1.0 +/- small error
Mathematical Derivation of the Quasi-Probability Decomposition
Now let us derive the decomposition rigorously for a concrete noise model.
Setup: depolarizing noise after a gate
Suppose the hardware applies an ideal gate followed by a depolarizing noise channel . The actual operation implemented is:
For a single qubit, the depolarizing channel with parameter acts as:
For two qubits, the local depolarizing channel applies independent single-qubit depolarizing noise to each qubit. With noise parameter on each qubit, this gives:
where the coefficients depend on . For the identity term , for terms with one non-identity Pauli , and for terms with two non-identity Paulis .
Inverting the noise
To recover the ideal gate , we need to find coefficients such that:
where ranges over all Pauli operators (including the identity). Since , this becomes:
We need (the identity superoperator). For the single-qubit depolarizing channel, the inverse is:
where . For small , the coefficients are approximately:
The one-norm for a single-qubit gate is:
For a two-qubit gate with independent single-qubit depolarizing noise of strength on each qubit, the one-norm is:
Let us verify this numerically:
import numpy as np
def compute_one_norm_single_qubit(p):
"""Compute the exact one-norm for inverting single-qubit depolarizing noise."""
# The depolarizing channel in the Pauli transfer matrix is diagonal:
# Lambda_p maps I -> I, X -> (1-4p/3)X, Y -> (1-4p/3)Y, Z -> (1-4p/3)Z
# The inverse maps P -> P / (1-4p/3) for P in {X, Y, Z}
# In terms of Pauli corrections:
# alpha_I = (1 - p) / (1 - 4p/3)
# alpha_P = -(p/3) / (1 - 4p/3) for P in {X, Y, Z}
denom = 1 - 4 * p / 3
alpha_I = (1 - p) / denom
alpha_P = -(p / 3) / denom
gamma = abs(alpha_I) + 3 * abs(alpha_P)
return gamma, alpha_I, alpha_P
# Compute for several noise levels
for p in [0.001, 0.005, 0.01, 0.02, 0.05]:
gamma, a_I, a_P = compute_one_norm_single_qubit(p)
print(f"p={p:.3f}: gamma={gamma:.6f}, alpha_I={a_I:.6f}, alpha_P={a_P:.6f}")
print(f" Approximation 1+2p = {1 + 2*p:.6f}")
NoisyOperation and OperationRepresentation: Building Custom Decompositions
Mitiq provides the OperationRepresentation class to specify how each ideal gate decomposes into a quasi-probability mixture. The built-in helper represent_operation_with_local_depolarizing_noise handles the common case, but for realistic or asymmetric noise models you need to construct representations manually.
Using the built-in helper
import cirq
from mitiq.pec.representations import represent_operation_with_local_depolarizing_noise
q0, q1 = cirq.LineQubit.range(2)
# Represent a CNOT under depolarizing noise with p=0.01
representation = represent_operation_with_local_depolarizing_noise(
ideal_operation=cirq.Circuit([cirq.CNOT(q0, q1)]),
noise_level=0.01,
)
# Inspect the decomposition
print(f"One-norm (gamma): {representation.norm:.6f}")
print(f"Number of noisy operations: {len(representation.basis_expansion)}")
# Print the first few terms
for i, (noisy_op, coeff) in enumerate(representation.basis_expansion.items()):
if i < 5:
print(f" Coefficient: {coeff:+.6f}")
print(f" Circuit: {noisy_op.circuit}")
Building a custom OperationRepresentation from scratch
For hardware with asymmetric noise (e.g., different error rates on the control vs. target qubit of a CNOT), you need to construct the decomposition manually.
import cirq
import numpy as np
from mitiq.pec.types import NoisyOperation, OperationRepresentation
q0, q1 = cirq.LineQubit.range(2)
# Asymmetric noise: control qubit has p_ctrl=0.005, target has p_tgt=0.015
p_ctrl = 0.005
p_tgt = 0.015
# Compute coefficients for each qubit independently
def inverse_coefficients(p):
"""Return (alpha_I, alpha_X, alpha_Y, alpha_Z) for inverting depolarizing noise."""
denom = 1 - 4 * p / 3
alpha_I = (1 - p) / denom
alpha_P = -(p / 3) / denom
return alpha_I, alpha_P, alpha_P, alpha_P
a_ctrl = inverse_coefficients(p_ctrl) # (alpha_I, alpha_X, alpha_Y, alpha_Z) for control
a_tgt = inverse_coefficients(p_tgt) # same for target
# Two-qubit Pauli operators
single_paulis = {
"I": cirq.I,
"X": cirq.X,
"Y": cirq.Y,
"Z": cirq.Z,
}
pauli_labels = ["I", "X", "Y", "Z"]
# Build all 16 tensor product terms
basis_expansion = {}
for i, p0_label in enumerate(pauli_labels):
for j, p1_label in enumerate(pauli_labels):
coeff = a_ctrl[i] * a_tgt[j]
# Build the noisy circuit: CNOT followed by Pauli correction
circuit = cirq.Circuit([cirq.CNOT(q0, q1)])
if p0_label != "I":
circuit.append(single_paulis[p0_label](q0))
if p1_label != "I":
circuit.append(single_paulis[p1_label](q1))
noisy_op = NoisyOperation(circuit)
basis_expansion[noisy_op] = coeff
# Construct the representation
ideal_circuit = cirq.Circuit([cirq.CNOT(q0, q1)])
rep = OperationRepresentation(
ideal=ideal_circuit,
basis_expansion=basis_expansion,
)
print(f"Asymmetric noise representation:")
print(f" One-norm: {rep.norm:.6f}")
print(f" Expected (product): {(1+2*p_ctrl)*(1+2*p_tgt):.6f}")
print(f" Number of terms: {len(rep.basis_expansion)}")
Full Executor Function for IBM Hardware via Qiskit
In a real PEC workflow, the executor function bridges Mitiq (which works in Cirq) and your hardware backend. Here is a complete, realistic executor for IBM Quantum hardware using QiskitRuntimeService and SamplerV2.
import cirq
import numpy as np
from qiskit import QuantumCircuit, transpile
from qiskit_ibm_runtime import QiskitRuntimeService, SamplerV2
from qiskit.result import marginal_counts
def make_ibm_executor(
backend_name: str = "ibm_brisbane",
shots: int = 4096,
optimization_level: int = 1,
):
"""
Create an executor function that runs Cirq circuits on IBM hardware.
The executor converts a Cirq circuit to a Qiskit circuit, transpiles it
for the target backend, runs it via SamplerV2, and returns the <Z_0>
expectation value (parity of the first qubit).
Args:
backend_name: Name of the IBM backend.
shots: Number of measurement shots per circuit execution.
optimization_level: Qiskit transpilation optimization level (0-3).
Returns:
A callable that takes a cirq.Circuit and returns a float.
"""
service = QiskitRuntimeService()
backend = service.backend(backend_name)
def executor(circuit: cirq.Circuit) -> float:
# Step 1: Convert Cirq circuit to Qiskit circuit
# Extract qubit count from the Cirq circuit
qubits = sorted(circuit.all_qubits())
n_qubits = len(qubits)
qubit_map = {q: i for i, q in enumerate(qubits)}
qc = QuantumCircuit(n_qubits, n_qubits)
for moment in circuit:
for op in moment.operations:
gate = op.gate
indices = [qubit_map[q] for q in op.qubits]
if isinstance(gate, cirq.HPowGate) and gate.exponent == 1.0:
qc.h(indices[0])
elif isinstance(gate, cirq.XPowGate) and gate.exponent == 1.0:
qc.x(indices[0])
elif isinstance(gate, cirq.YPowGate) and gate.exponent == 1.0:
qc.y(indices[0])
elif isinstance(gate, cirq.ZPowGate) and gate.exponent == 1.0:
qc.z(indices[0])
elif isinstance(gate, cirq.CNotPowGate) and gate.exponent == 1.0:
qc.cx(indices[0], indices[1])
elif isinstance(gate, cirq.MeasurementGate):
for idx in indices:
qc.measure(idx, idx)
# Step 2: Transpile for the target backend
transpiled = transpile(
qc, backend=backend, optimization_level=optimization_level
)
# Step 3: Execute via SamplerV2
sampler = SamplerV2(backend)
job = sampler.run([transpiled], shots=shots)
result = job.result()
# Step 4: Compute <Z_0> from measurement results
# Z eigenvalues: |0> -> +1, |1> -> -1
counts = result[0].data.meas.get_counts()
expectation = 0.0
total_shots = sum(counts.values())
for bitstring, count in counts.items():
# Qiskit bitstrings are big-endian; qubit 0 is the rightmost bit
bit_0 = int(bitstring[-1])
z_eigenvalue = 1 - 2 * bit_0 # 0 -> +1, 1 -> -1
expectation += z_eigenvalue * count / total_shots
return expectation
return executor
# Usage (requires valid IBM Quantum credentials):
# executor = make_ibm_executor(backend_name="ibm_brisbane", shots=8192)
# pec_value = execute_with_pec(circuit, executor, representations, num_samples=200)
Variance Analysis and Sample Budget
The statistical precision of PEC is governed by a precise relationship between the one-norm, circuit depth, and number of samples.
Variance formula
For a circuit with mitigated gates, each with one-norm , the total one-norm is:
The variance of the PEC estimator is:
For observables bounded in , the noisy variance is at most 1, so the number of samples needed to achieve standard deviation is:
Numerical example
Consider a circuit with 10 CNOT gates, each with (corresponding to about depolarizing noise per qubit). The total one-norm is . For target precision :
Let us compute this for a range of circuit depths:
import numpy as np
import matplotlib
matplotlib.use("Agg") # Non-interactive backend
import matplotlib.pyplot as plt
def compute_sample_budget(gamma_per_gate, num_gates, target_sigma):
"""
Compute the number of PEC samples needed for a target precision.
Args:
gamma_per_gate: One-norm per mitigated gate.
num_gates: Number of mitigated gates in the circuit.
target_sigma: Target standard deviation of the PEC estimate.
Returns:
Total one-norm, required number of samples.
"""
total_gamma = gamma_per_gate ** num_gates
num_samples = (total_gamma ** 2) / (target_sigma ** 2)
return total_gamma, int(np.ceil(num_samples))
# Parameters
gamma = 1.05 # Typical for CNOT with p ~ 0.012
sigma = 0.01 # Target 1% precision
print(f"{'Gates':>6} | {'Total gamma':>12} | {'Samples needed':>15} | {'Overhead factor':>16}")
print("-" * 60)
gate_counts = [5, 10, 15, 20, 30, 50, 75, 100]
total_gammas = []
sample_counts = []
for n in gate_counts:
total_g, n_samples = compute_sample_budget(gamma, n, sigma)
overhead = total_g ** 2
total_gammas.append(total_g)
sample_counts.append(n_samples)
print(f"{n:>6} | {total_g:>12.3f} | {n_samples:>15,} | {overhead:>16.1f}x")
# Visualize the exponential scaling
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
ax1.semilogy(gate_counts, sample_counts, "bo-", linewidth=2)
ax1.set_xlabel("Number of mitigated gates")
ax1.set_ylabel("Required samples (log scale)")
ax1.set_title(f"PEC sample budget (gamma={gamma}, sigma={sigma})")
ax1.grid(True, alpha=0.3)
# Compare different noise levels
for p_label, g in [("p=0.005", 1.01), ("p=0.01", 1.02), ("p=0.02", 1.04), ("p=0.05", 1.10)]:
depths = np.arange(5, 80)
samples = [(g ** d) ** 2 / sigma**2 for d in depths]
ax2.semilogy(depths, samples, label=p_label, linewidth=2)
ax2.axhline(y=1e6, color="red", linestyle="--", alpha=0.5, label="1M sample budget")
ax2.set_xlabel("Number of mitigated gates")
ax2.set_ylabel("Required samples (log scale)")
ax2.set_title("Sample budget vs. noise level")
ax2.legend()
ax2.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig("pec_sample_budget.png", dpi=150)
print("\nPlot saved to pec_sample_budget.png")
Selective PEC: Mitigating Only CNOT Gates
In practice, single-qubit gates have much lower error rates than two-qubit gates. Mitigating every gate in the circuit increases the total one-norm unnecessarily. A selective approach, where you mitigate only the noisiest gates, can dramatically reduce overhead while capturing most of the benefit.
import cirq
import numpy as np
from mitiq.pec.representations import represent_operation_with_local_depolarizing_noise
q0, q1, q2, q3 = cirq.LineQubit.range(4)
# Build a circuit with both single-qubit and two-qubit gates
circuit = cirq.Circuit([
cirq.H(q0), cirq.H(q1), cirq.H(q2), cirq.H(q3), # 4 single-qubit
cirq.CNOT(q0, q1), cirq.CNOT(q2, q3), # 2 two-qubit
cirq.H(q0), cirq.H(q1), cirq.H(q2), cirq.H(q3), # 4 single-qubit
cirq.CNOT(q1, q2), # 1 two-qubit
cirq.T(q0), cirq.T(q3), # 2 single-qubit
cirq.CNOT(q0, q3), cirq.CNOT(q1, q2), # 2 two-qubit
])
# Noise parameters (realistic for superconducting hardware)
p_1q = 0.001 # Single-qubit gate error
p_2q = 0.01 # Two-qubit gate error
# Strategy A: Mitigate ALL gates
all_representations = []
gamma_all = 1.0
for moment in circuit:
for op in moment.operations:
if isinstance(op.gate, cirq.MeasurementGate):
continue
n_qubits = len(op.qubits)
noise_level = p_1q if n_qubits == 1 else p_2q
rep = represent_operation_with_local_depolarizing_noise(
ideal_operation=cirq.Circuit([op]),
noise_level=noise_level,
)
all_representations.append(rep)
gamma_all *= rep.norm
# Strategy B: Mitigate only 2-qubit gates
selective_representations = []
gamma_selective = 1.0
for moment in circuit:
for op in moment.operations:
if isinstance(op.gate, cirq.MeasurementGate):
continue
n_qubits = len(op.qubits)
if n_qubits >= 2: # Only mitigate multi-qubit gates
rep = represent_operation_with_local_depolarizing_noise(
ideal_operation=cirq.Circuit([op]),
noise_level=p_2q,
)
selective_representations.append(rep)
gamma_selective *= rep.norm
# Compare overhead
sigma = 0.01
n_samples_all = gamma_all**2 / sigma**2
n_samples_selective = gamma_selective**2 / sigma**2
print(f"Circuit: {len(list(circuit.all_operations()))} operations")
print(f"\nStrategy A (all gates):")
print(f" Total gamma: {gamma_all:.6f}")
print(f" Sample overhead (gamma^2): {gamma_all**2:.2f}x")
print(f" Samples for sigma={sigma}: {int(np.ceil(n_samples_all)):,}")
print(f"\nStrategy B (2-qubit gates only):")
print(f" Total gamma: {gamma_selective:.6f}")
print(f" Sample overhead (gamma^2): {gamma_selective**2:.2f}x")
print(f" Samples for sigma={sigma}: {int(np.ceil(n_samples_selective)):,}")
print(f"\nReduction factor: {n_samples_all / n_samples_selective:.2f}x fewer samples")
The selective approach is particularly valuable when single-qubit error rates are 10x lower than two-qubit error rates, which is typical for superconducting qubit hardware. In such cases, the overhead reduction from skipping single-qubit mitigation can be a factor of 2-5x, depending on the circuit composition.
Noise Tomography for Realistic Representations
The accuracy of PEC depends critically on having the correct noise model. Assuming depolarizing noise when the hardware actually has asymmetric, correlated, or coherent errors will lead to biased results. In practice, you should characterize the actual noise through process tomography or randomized benchmarking and use those measured noise parameters to construct your representations.
Pipeline: from tomography to PEC representation
The general workflow is:
- Run quantum process tomography (QPT) on the gate of interest to obtain the process matrix or the Pauli transfer matrix (PTM).
- Extract the noise channel by comparing the measured process to the ideal process.
- Invert the noise channel in the Pauli basis to get the quasi-probability coefficients.
- Construct the
OperationRepresentationfrom those coefficients.
import cirq
import numpy as np
from mitiq.pec.types import NoisyOperation, OperationRepresentation
def ptm_from_kraus(kraus_ops, n_qubits=1):
"""
Compute the Pauli transfer matrix from a set of Kraus operators.
The PTM element R_{ij} = Tr(P_i * sum_k K_k P_j K_k^dagger) / d
where P_i, P_j are normalized Pauli operators and d = 2^n_qubits.
"""
d = 2 ** n_qubits
# Generate Pauli basis (normalized)
paulis_1q = [
np.eye(2),
np.array([[0, 1], [1, 0]]),
np.array([[0, -1j], [1j, 0]]),
np.array([[1, 0], [0, -1]]),
]
if n_qubits == 1:
pauli_basis = paulis_1q
elif n_qubits == 2:
pauli_basis = [np.kron(p1, p2) for p1 in paulis_1q for p2 in paulis_1q]
else:
raise ValueError("Only 1 and 2 qubits supported in this example")
dim = len(pauli_basis)
ptm = np.zeros((dim, dim), dtype=complex)
for i, Pi in enumerate(pauli_basis):
for j, Pj in enumerate(pauli_basis):
val = 0.0
for K in kraus_ops:
val += np.trace(Pi @ K @ Pj @ K.conj().T)
ptm[i, j] = val / d
return np.real(ptm)
def representation_from_ptm(noise_ptm, ideal_op_circuit, qubits, n_qubits=1):
"""
Construct an OperationRepresentation by inverting the noise PTM.
For a diagonal noise PTM (e.g., Pauli channel), the inverse is simply
the element-wise reciprocal on the diagonal. The quasi-probability
coefficients in the Pauli basis are then the columns of the inverse PTM.
"""
d = 2 ** n_qubits
dim = noise_ptm.shape[0]
# For Pauli channels the PTM is diagonal; invert it
if not np.allclose(noise_ptm, np.diag(np.diag(noise_ptm)), atol=1e-10):
print("Warning: noise PTM is not diagonal. Using diagonal approximation.")
inv_diag = 1.0 / np.diag(noise_ptm)
# The Pauli decomposition coefficients
# alpha_k = inv_diag[k] * diag[k] for reconstruction
# For Pauli channel: the coefficient of Pauli P_k correction is related
# to the inverse PTM entries.
# For a Pauli channel with probabilities p_k (k=0,...,d^2-1):
# Channel: rho -> sum_k p_k P_k rho P_k
# PTM diagonal: R_jj = sum_k p_k * Tr(P_j P_k P_j P_k) / d
# For the inverse: we need coefficients q_k such that
# sum_k q_k * p_channel_after_P_k = identity
# Simplified: for diagonal PTM, quasi-prob coefficients are:
pauli_labels_1q = ["I", "X", "Y", "Z"]
if n_qubits == 2:
pauli_labels = [f"{a}{b}" for a in pauli_labels_1q for b in pauli_labels_1q]
else:
pauli_labels = pauli_labels_1q
# Compute quasi-probability coefficients from PTM inverse
coefficients = inv_diag / dim # Normalized by dimension
# The identity coefficient gets the sum constraint
coefficients[0] = 1.0 - sum(coefficients[1:])
gate_map = {"I": cirq.I, "X": cirq.X, "Y": cirq.Y, "Z": cirq.Z}
basis_expansion = {}
for k, label in enumerate(pauli_labels):
# Build circuit: ideal gate + Pauli correction
correction_circuit = ideal_op_circuit.copy()
if n_qubits == 1:
if label != "I":
correction_circuit.append(gate_map[label](qubits[0]))
elif n_qubits == 2:
p0, p1 = label[0], label[1]
if p0 != "I":
correction_circuit.append(gate_map[p0](qubits[0]))
if p1 != "I":
correction_circuit.append(gate_map[p1](qubits[1]))
noisy_op = NoisyOperation(correction_circuit)
basis_expansion[noisy_op] = float(coefficients[k])
return OperationRepresentation(
ideal=ideal_op_circuit,
basis_expansion=basis_expansion,
)
# Example: Kraus operators for a measured amplitude damping channel
# (more realistic than depolarizing, common in superconducting qubits)
gamma_decay = 0.02 # T1 decay parameter
K0 = np.array([[1, 0], [0, np.sqrt(1 - gamma_decay)]])
K1 = np.array([[0, np.sqrt(gamma_decay)], [0, 0]])
noise_ptm = ptm_from_kraus([K0, K1], n_qubits=1)
print("Noise PTM (amplitude damping):")
print(np.round(noise_ptm, 4))
q = cirq.LineQubit(0)
ideal_circuit = cirq.Circuit([cirq.H(q)])
rep = representation_from_ptm(noise_ptm, ideal_circuit, [q], n_qubits=1)
print(f"\nOne-norm for H gate under amplitude damping: {rep.norm:.6f}")
In production settings, you would replace the hand-constructed Kraus operators with those extracted from Qiskit’s process_tomography experiment or from cycle benchmarking data. The key principle is the same: measure the actual noise, invert it in the Pauli basis, and feed the resulting coefficients into Mitiq’s OperationRepresentation.
Complete End-to-End Example with Cirq Simulation
Let us build a complete PEC workflow from scratch: construct a circuit, define a noisy simulator, build representations, run PEC, and compare the results.
import cirq
import numpy as np
from mitiq.pec import execute_with_pec
from mitiq.pec.representations import represent_operation_with_local_depolarizing_noise
# Step 1: Create a 4-qubit circuit
q0, q1, q2, q3 = cirq.LineQubit.range(4)
circuit = cirq.Circuit([
cirq.H(q0),
cirq.CNOT(q0, q1),
cirq.CNOT(q1, q2),
cirq.H(q3),
cirq.CNOT(q2, q3),
])
print("Circuit:")
print(circuit)
# Step 2: Define ideal (noiseless) executor
def ideal_executor(circ: cirq.Circuit) -> float:
"""Execute a circuit without noise and return <Z0 Z1 Z2 Z3>."""
sim = cirq.DensityMatrixSimulator()
result = sim.simulate(circ)
rho = result.final_density_matrix
# Compute <Z0 Z1 Z2 Z3>
Z = np.array([[1, 0], [0, -1]])
I2 = np.eye(2)
observable = np.kron(np.kron(np.kron(Z, Z), Z), Z)
return float(np.real(np.trace(observable @ rho)))
# Step 3: Define noisy executor with depolarizing noise
noise_level = 0.02
def noisy_executor(circ: cirq.Circuit) -> float:
"""Execute a circuit with depolarizing noise and return <Z0 Z1 Z2 Z3>."""
noisy_circuit = cirq.Circuit()
for moment in circ:
noisy_circuit.append(moment)
for op in moment.operations:
if isinstance(op.gate, cirq.MeasurementGate):
continue
n_qubits = len(op.qubits)
for q in op.qubits:
noisy_circuit.append(cirq.depolarize(p=noise_level)(q))
sim = cirq.DensityMatrixSimulator()
result = sim.simulate(noisy_circuit)
rho = result.final_density_matrix
Z = np.array([[1, 0], [0, -1]])
observable = np.kron(np.kron(np.kron(Z, Z), Z), Z)
return float(np.real(np.trace(observable @ rho)))
# Step 4: Build PEC representations for every gate
representations = []
for moment in circuit:
for op in moment.operations:
if isinstance(op.gate, cirq.MeasurementGate):
continue
rep = represent_operation_with_local_depolarizing_noise(
ideal_operation=cirq.Circuit([op]),
noise_level=noise_level,
)
representations.append(rep)
# Print the total one-norm
total_gamma = np.prod([rep.norm for rep in representations])
print(f"\nTotal one-norm (gamma): {total_gamma:.4f}")
print(f"Expected sample overhead (gamma^2): {total_gamma**2:.2f}x")
# Step 5: Compute results
ideal_value = ideal_executor(circuit)
noisy_value = noisy_executor(circuit)
pec_value = execute_with_pec(
circuit=circuit,
executor=noisy_executor,
representations=representations,
num_samples=500,
random_state=42,
)
print(f"\nIdeal value: {ideal_value:.4f}")
print(f"Noisy value: {noisy_value:.4f}")
print(f"PEC-mitigated value: {pec_value:.4f}")
print(f"Noisy error: {abs(noisy_value - ideal_value):.4f}")
print(f"PEC error: {abs(pec_value - ideal_value):.4f}")
# The PEC result should be significantly closer to the ideal value
# than the raw noisy result. Typical output:
# Ideal value: 0.0000
# Noisy value: 0.1234 (biased by noise)
# PEC-mitigated value: 0.0087 (close to ideal)
Combining PEC with ZNE for Deeper Circuits
For circuits that are too deep for PEC alone (where the sample overhead becomes impractical), a hybrid strategy combines ZNE and PEC. The idea: use ZNE to reduce the effective noise level first, then apply PEC to the ZNE-mitigated executor. This reduces the one-norm that PEC must handle.
How the hybrid works
ZNE amplifies noise by a factor and extrapolates to . If ZNE reduces the effective noise by a factor (i.e., the effective noise after ZNE is ), then the PEC one-norm per gate drops from to . For depolarizing noise with :
The sample overhead drops from to , which can be a dramatic reduction for deep circuits.
import cirq
import numpy as np
from mitiq import execute_with_pec
from mitiq.zne import execute_with_zne
from mitiq.zne.inference import RichardsonFactory
from mitiq.zne.scaling import fold_global
from mitiq.pec.representations import represent_operation_with_local_depolarizing_noise
q0, q1 = cirq.LineQubit.range(2)
# A deeper circuit where pure PEC would be expensive
circuit = cirq.Circuit([
cirq.H(q0),
cirq.CNOT(q0, q1),
cirq.T(q0),
cirq.CNOT(q0, q1),
cirq.H(q0),
cirq.CNOT(q0, q1),
])
noise_level = 0.02
def noisy_executor(circ: cirq.Circuit) -> float:
"""Noisy executor with depolarizing noise."""
noisy_circuit = cirq.Circuit()
for moment in circ:
noisy_circuit.append(moment)
for op in moment.operations:
if isinstance(op.gate, cirq.MeasurementGate):
continue
for q in op.qubits:
noisy_circuit.append(cirq.depolarize(p=noise_level)(q))
sim = cirq.DensityMatrixSimulator()
result = sim.simulate(noisy_circuit)
rho = result.final_density_matrix
Z = np.array([[1, 0], [0, -1]])
obs = np.kron(Z, np.eye(2))
return float(np.real(np.trace(obs @ rho)))
# Strategy 1: PEC only
representations = []
for moment in circuit:
for op in moment.operations:
if isinstance(op.gate, cirq.MeasurementGate):
continue
rep = represent_operation_with_local_depolarizing_noise(
ideal_operation=cirq.Circuit([op]),
noise_level=noise_level,
)
representations.append(rep)
pec_only = execute_with_pec(
circuit=circuit,
executor=noisy_executor,
representations=representations,
num_samples=300,
random_state=42,
)
# Strategy 2: ZNE-wrapped executor inside PEC
# The ZNE executor reduces effective noise before PEC processes it
def zne_executor(circ: cirq.Circuit) -> float:
"""Apply ZNE to reduce effective noise, then return the mitigated value."""
return execute_with_zne(
circuit=circ,
executor=noisy_executor,
factory=RichardsonFactory(scale_factors=[1.0, 2.0, 3.0]),
scale_noise=fold_global,
)
# Use lower noise_level for PEC representations since ZNE reduces effective noise
# Estimated effective noise after ZNE: roughly noise_level / 2
effective_noise = noise_level / 2
zne_representations = []
for moment in circuit:
for op in moment.operations:
if isinstance(op.gate, cirq.MeasurementGate):
continue
rep = represent_operation_with_local_depolarizing_noise(
ideal_operation=cirq.Circuit([op]),
noise_level=effective_noise,
)
zne_representations.append(rep)
hybrid_value = execute_with_pec(
circuit=circuit,
executor=zne_executor,
representations=zne_representations,
num_samples=300,
random_state=42,
)
ideal_sim = cirq.DensityMatrixSimulator()
ideal_rho = ideal_sim.simulate(circuit).final_density_matrix
Z = np.array([[1, 0], [0, -1]])
obs = np.kron(Z, np.eye(2))
ideal_value = float(np.real(np.trace(obs @ ideal_rho)))
noisy_value = noisy_executor(circuit)
print(f"Ideal value: {ideal_value:.4f}")
print(f"Noisy value: {noisy_value:.4f}")
print(f"PEC only: {pec_only:.4f}")
print(f"ZNE + PEC hybrid: {hybrid_value:.4f}")
gamma_pec = np.prod([r.norm for r in representations])
gamma_hybrid = np.prod([r.norm for r in zne_representations])
print(f"\nPEC-only gamma: {gamma_pec:.4f} (overhead: {gamma_pec**2:.2f}x)")
print(f"Hybrid gamma: {gamma_hybrid:.4f} (overhead: {gamma_hybrid**2:.2f}x)")
The hybrid strategy is especially useful for circuits with 20-50 two-qubit gates, where pure PEC overhead is prohibitive but ZNE alone does not achieve sufficient accuracy.
PEC for Non-Markovian Noise
Standard PEC assumes Markovian noise: each gate’s error channel is independent of what happened before or after it. Real quantum hardware violates this assumption in several ways.
Sources of non-Markovian noise
Crosstalk. When two CNOT gates execute simultaneously on adjacent qubits, the error on one gate depends on whether the neighboring gate is also active. This means the noise channel for a gate depends on its context within the circuit layer, not just on the gate itself.
Leakage. Transmon qubits have higher energy levels beyond the computational and states. Gates can leak population into , and this leaked population persists across subsequent gates, creating time-correlated errors.
1/f noise and drift. Low-frequency fluctuations in qubit frequencies and control parameters cause the noise to vary on timescales comparable to circuit execution, violating the assumption that every instance of a given gate has the same error.
Strategies for handling non-Markovian effects
Layer-based PEC. Instead of decomposing individual gates, decompose entire circuit layers (all gates that execute in one time step). This captures crosstalk effects within a layer because the noise channel now applies to the full layer, including inter-qubit correlations. The cost is a larger Pauli basis: for a layer acting on qubits, you need up to terms in the decomposition.
import cirq
from mitiq.pec.representations import represent_operation_with_local_depolarizing_noise
q0, q1, q2, q3 = cirq.LineQubit.range(4)
# Instead of representing individual CNOT gates, represent the entire layer
layer = cirq.Circuit([
cirq.Moment([cirq.CNOT(q0, q1), cirq.CNOT(q2, q3)])
])
# This captures the joint noise on all qubits in the layer
layer_representation = represent_operation_with_local_depolarizing_noise(
ideal_operation=layer,
noise_level=0.01,
)
print(f"Layer one-norm: {layer_representation.norm:.4f}")
print(f"Number of terms: {len(layer_representation.basis_expansion)}")
Cycle benchmarking. Rather than gate set tomography (which characterizes individual gates in isolation), cycle benchmarking measures the effective noise of a gate within the context of a random circuit layer. This automatically captures crosstalk and context-dependent errors, producing more accurate noise parameters for PEC.
Leakage mitigation as preprocessing. Apply leakage reduction units (LRUs) after each two-qubit gate to project leaked population back into the computational subspace. This converts leakage errors into Pauli errors that PEC can handle. Many hardware providers now include LRUs in their default gate calibrations.
Comparison Table: PEC vs ZNE vs CDR vs Symmetry Verification
Mitiq supports several error mitigation techniques. Choosing the right one depends on your circuit, noise model, and computational budget.
| Property | PEC | ZNE | CDR | Symmetry Verification |
|---|---|---|---|---|
| Mitiq function | execute_with_pec | execute_with_zne | execute_with_cdr | execute_with_sv |
| Noise model required? | Yes, detailed gate-level model | No (model-free) | No (but needs near-Clifford circuits) | No |
| Bias | Unbiased (converges to true value) | Biased (depends on extrapolation model) | Biased (depends on training circuits) | Unbiased for symmetric subspace |
| Circuit overhead | None (same depth) | 1x, 3x, 5x depth (via noise scaling) | Requires training circuits (near-Clifford) | None (post-processing only) |
| Sample overhead | Exponential: | Polynomial: proportional to scale factors | Moderate: training set size | Minimal: only discards non-symmetric outcomes |
| Best use case | Short circuits on well-characterized hardware | General-purpose, first technique to try | VQE and other variational circuits near Clifford | Circuits with known symmetries (particle number, parity) |
| Max practical depth | 20-50 two-qubit gates | 100+ gates | 50-100 gates | Unlimited (but benefit decreases with depth) |
| Noise assumptions | Markovian, gate-local | Smooth noise scaling | Pauli noise (for training accuracy) | None, but needs symmetry in the observable |
| Hardware calibration | Requires process tomography or benchmarking | None | Requires running near-Clifford circuits | None |
When to use each:
- Start with ZNE as a baseline: it requires no noise characterization and works on any circuit.
- Use PEC when you need an unbiased estimate and can afford the sample overhead (short circuits on well-calibrated hardware).
- Use CDR for variational circuits that are close to Clifford (few T gates or small rotation angles).
- Use symmetry verification when your problem has conserved quantities (e.g., particle number in chemistry simulations) as a nearly free bonus that composes with other techniques.
Common Mistakes and Pitfalls
1. Using representations with the wrong noise level
If your PEC representations assume depolarizing noise at but the actual hardware noise is , the quasi-probability decomposition will not correctly invert the noise. The estimator becomes biased: PEC will under-correct, and the result sits between the noisy and ideal values. Worse, if the model overestimates the noise (), PEC over-corrects and can produce estimates on the wrong side of the ideal value.
Fix: Regularly re-calibrate your noise model. Run randomized benchmarking or process tomography before each PEC experiment, not once per week.
2. Not enough samples
PEC needs samples to achieve standard deviation . With per gate and 50 mitigated gates, the total overhead is . For precision, you need roughly 138 million samples. Running PEC with 1,000 samples in this regime produces an estimate dominated by statistical noise, giving a false impression that PEC does not work.
Fix: Always compute the expected sample budget before running PEC. If it exceeds your budget, either reduce the number of mitigated gates (selective PEC) or use a different technique.
import numpy as np
def check_pec_feasibility(gamma_per_gate, num_gates, target_sigma, max_budget):
"""Check whether PEC is feasible within a given sample budget."""
total_gamma = gamma_per_gate ** num_gates
required_samples = total_gamma**2 / target_sigma**2
print(f"Gates: {num_gates}, gamma/gate: {gamma_per_gate:.3f}")
print(f"Total gamma: {total_gamma:.2f}")
print(f"Required samples for sigma={target_sigma}: {required_samples:,.0f}")
print(f"Budget: {max_budget:,}")
if required_samples > max_budget:
# Find max gates feasible within budget
max_gamma = np.sqrt(max_budget * target_sigma**2)
max_gates = int(np.log(max_gamma) / np.log(gamma_per_gate))
print(f"INFEASIBLE. Max gates within budget: {max_gates}")
else:
print("FEASIBLE.")
print()
check_pec_feasibility(1.05, 20, 0.01, 1_000_000)
check_pec_feasibility(1.10, 50, 0.01, 1_000_000)
check_pec_feasibility(1.10, 50, 0.01, 1_000_000_000)
3. Forgetting that PEC does not correct measurement errors
PEC mitigates gate errors by inverting the noise channel at the gate level. Measurement (readout) errors are a separate noise source that PEC does not address. If your hardware has 2% readout error, the PEC-mitigated result will still be biased by that 2%.
Fix: Apply measurement error mitigation separately, either before PEC (by correcting the raw counts) or after (by applying a confusion matrix correction to the PEC estimate). Mitiq does not currently handle readout mitigation, but Qiskit’s mthree library and similar tools provide this functionality.
4. Applying PEC to circuits with coherent errors
PEC assumes that gate errors are stochastic Pauli channels: each gate application randomly picks a Pauli error with some probability. If the dominant error is a coherent over-rotation (e.g., your CNOT consistently applies a rotation of 0.01 radians), this is not a Pauli channel, and the depolarizing PEC representation will not cancel it correctly.
Fix: Use Pauli twirling as a preprocessing step. By inserting random Pauli gates before and after each target gate (and adjusting the circuit to compensate), coherent errors are converted into stochastic Pauli errors that PEC can handle. Mitiq does not yet have built-in Pauli twirling, but the transformation is straightforward to implement manually.
import cirq
import numpy as np
def pauli_twirl_cnot(q0, q1, rng=None):
"""
Apply a random Pauli twirl to a CNOT gate.
Pauli twirling conjugates the CNOT with random Pauli operators,
converting coherent errors into stochastic Pauli noise.
"""
if rng is None:
rng = np.random.default_rng()
paulis = [cirq.I, cirq.X, cirq.Y, cirq.Z]
# For CNOT, the valid twirling group consists of Pauli pairs (P1, P2)
# such that CNOT (P1 x P2) CNOT^dag is also a Pauli operator.
# The full twirling group for CNOT has 16 elements.
# Here we use a simplified version with the full Pauli group.
p0_before = rng.choice(paulis)
p1_before = rng.choice(paulis)
# Compute the correction Paulis needed after CNOT
# CNOT * (P0 x P1) = (P0' x P1') * CNOT
# This depends on the specific Pauli commutation relations with CNOT.
# For simplicity, we show the structure:
circuit = cirq.Circuit([
p0_before(q0),
p1_before(q1),
cirq.CNOT(q0, q1),
# Correction Paulis would go here (computed from commutation relations)
])
return circuit
5. Confusing the one-norm gamma with the sample overhead
The one-norm measures the “cost” of inverting a single gate’s noise. The sample overhead is per gate, not . For a full circuit with mitigated gates, the total sample overhead is . This distinction matters: a one-norm of 1.1 per gate sounds small, but per gate, and over 50 gates the overhead becomes .
Fix: Always compute and report , not , when budgeting samples. The one-norm is useful for comparing individual gate decompositions, but the operational cost is the square.
Was this tutorial helpful?