Measuring Quantum Volume: A Step-by-Step Guide with Qiskit

What Is Quantum Volume?

Quantum Volume (QV) is a single-number benchmark developed by IBM that captures the overall capability of a quantum device. Rather than measuring just qubit count or gate fidelity in isolation, QV combines qubit number, gate fidelity, connectivity, and compiler quality into one figure of merit.

A device achieves Quantum Volume 2^n if it can successfully run random circuits of depth n on n qubits. The higher the QV, the more capable the device in a holistic sense.

This tutorial implements QV measurement from scratch using Qiskit Aer, showing exactly what happens at each stage of the protocol.

The Heavy Output Generation Problem

The core of QV is the Heavy Output Generation (HOG) problem. Here is the idea:

Generate a random model circuit of depth n on n qubits.
Compute the ideal output distribution (using a statevector simulator).
Identify the “heavy” outputs: bitstrings whose ideal probability exceeds the median probability.
Run the same circuit on a noisy device and measure the probability that the noisy device produces a heavy output.
If the heavy output probability exceeds 2/3 across many circuits, the device passes at that n.

A quantum device passes QV 2^n if, with high statistical confidence, its heavy output probability is greater than 2/3.

Setup

pip install qiskit qiskit-aer scipy numpy

import numpy as np
from scipy.stats import norm
from qiskit import QuantumCircuit, transpile
from qiskit.quantum_info import random_unitary, Statevector
from qiskit_aer import AerSimulator
from qiskit_aer.noise import NoiseModel, depolarizing_error, thermal_relaxation_error

Step 1: Generate a Random QV Circuit

A QV circuit of width and depth n consists of n layers. Each layer applies random two-qubit SU(4) gates to random pairs of qubits.

def random_qv_circuit(n: int, seed: int = None) -> QuantumCircuit:
    """Build a random quantum volume circuit of width and depth n."""
    rng = np.random.default_rng(seed)
    qc = QuantumCircuit(n)
    for _ in range(n):
        # Random permutation of qubit pairs for this layer
        perm = rng.permutation(n)
        for k in range(n // 2):
            q0, q1 = int(perm[2 * k]), int(perm[2 * k + 1])
            # Random SU(4) unitary decomposed into native gates
            u = random_unitary(4, seed=rng.integers(1 << 31))
            qc.unitary(u, [q0, q1])
    return qc

# Example: 4-qubit QV circuit
n = 4
qc = random_qv_circuit(n, seed=42)
print(f"Circuit: {n} qubits, depth {qc.depth()}")

Step 2: Compute the Ideal Output Distribution

Use the statevector simulator to get the exact probability of every bitstring:

def ideal_probabilities(qc: QuantumCircuit) -> np.ndarray:
    """Return the ideal probability distribution for a QV circuit."""
    sv = Statevector(qc)
    probs = sv.probabilities()  # length 2^n array
    return probs

ideal_probs = ideal_probabilities(qc)
print(f"Number of bitstrings: {len(ideal_probs)}")
print(f"Sum of probabilities: {ideal_probs.sum():.6f}")

Step 3: Identify Heavy Outputs

The heavy outputs are bitstrings with probability above the median of the ideal distribution:

def heavy_outputs(ideal_probs: np.ndarray):
    """Return the set of heavy output indices and the median probability."""
    median_prob = np.median(ideal_probs)
    heavy_set = set(np.where(ideal_probs > median_prob)[0])
    return heavy_set, median_prob

heavy_set, median_prob = heavy_outputs(ideal_probs)
print(f"Median probability:       {median_prob:.6f}")
print(f"Number of heavy outputs:  {len(heavy_set)} / {len(ideal_probs)}")
print(f"Heavy output probability: {ideal_probs[list(heavy_set)].sum():.4f}")

By construction, the ideal heavy output probability is always above 0.5. The threshold for passing QV is 2/3 (approximately 0.6667).

Step 4: Build a Noise Model

To simulate a real device, we add depolarizing noise to two-qubit gates and readout error:

def build_noise_model(two_qubit_error_rate: float, readout_error: float = 0.01) -> NoiseModel:
    noise_model = NoiseModel()
    
    # Two-qubit depolarizing error
    error_2q = depolarizing_error(two_qubit_error_rate, 2)
    noise_model.add_all_qubit_quantum_error(error_2q, ["unitary"])
    
    # Single-qubit readout error
    readout = [[1 - readout_error, readout_error],
               [readout_error, 1 - readout_error]]
    noise_model.add_all_qubit_readout_error(readout)
    
    return noise_model

noise_model_low  = build_noise_model(0.005)   # high-quality device
noise_model_med  = build_noise_model(0.02)    # medium-quality device
noise_model_high = build_noise_model(0.06)    # noisy device

Step 5: Run Noisy Simulation and Compute Heavy Output Probability

def run_noisy_circuit(qc: QuantumCircuit, noise_model: NoiseModel,
                      shots: int = 1000) -> dict:
    """Run circuit with noise and return counts dict."""
    sim = AerSimulator(noise_model=noise_model)
    # Decompose unitary gates to basis gates supported by the simulator
    qc_m = qc.copy()
    qc_m.measure_all()
    transpiled = transpile(qc_m, sim, optimization_level=0)
    job = sim.run(transpiled, shots=shots)
    return job.result().get_counts()

def heavy_output_probability(counts: dict, heavy_set: set, n: int) -> float:
    """Fraction of shots that landed on a heavy output bitstring."""
    total_shots = sum(counts.values())
    heavy_shots = 0
    for bitstring, count in counts.items():
        # Qiskit bitstrings are little-endian; convert to integer index
        idx = int(bitstring.replace(" ", ""), 2)
        if idx in heavy_set:
            heavy_shots += count
    return heavy_shots / total_shots

Step 6: Full QV Protocol Across Multiple Trials

The QV protocol averages the heavy output probability over many random circuits to reduce statistical noise:

def measure_quantum_volume(n: int, noise_model: NoiseModel,
                           num_trials: int = 100, shots: int = 1000):
    hop_list = []  # heavy output probabilities per trial
    
    for trial in range(num_trials):
        qc = random_qv_circuit(n, seed=trial)
        ideal_probs = ideal_probabilities(qc)
        heavy_set, _ = heavy_outputs(ideal_probs)
        counts = run_noisy_circuit(qc, noise_model, shots=shots)
        hop = heavy_output_probability(counts, heavy_set, n)
        hop_list.append(hop)
    
    mean_hop = np.mean(hop_list)
    std_hop  = np.std(hop_list) / np.sqrt(num_trials)
    
    # One-sided confidence interval: pass if lower bound > 2/3
    z = 1.645  # 95% one-sided confidence
    lower_bound = mean_hop - z * std_hop
    passes = lower_bound > 2 / 3
    
    return {
        "n": n,
        "mean_hop": mean_hop,
        "std_hop": std_hop,
        "lower_bound": lower_bound,
        "passes": passes,
        "qv": 2 ** n if passes else None,
    }

# Run for n=4 across three noise levels
for label, nm in [("Low noise", noise_model_low),
                  ("Med noise", noise_model_med),
                  ("High noise", noise_model_high)]:
    res = measure_quantum_volume(n=4, noise_model=nm, num_trials=50, shots=500)
    status = "PASS" if res["passes"] else "FAIL"
    print(f"{label}: mean HOP={res['mean_hop']:.3f}, "
          f"lower={res['lower_bound']:.3f}, {status}, "
          f"QV={res['qv']}")

Step 7: Scan QV Score Across Circuit Widths

To find the actual QV of a simulated device, scan increasing values of n until the protocol fails:

def find_quantum_volume(noise_model: NoiseModel, max_n: int = 6,
                        num_trials: int = 50, shots: int = 500):
    qv_score = 1
    for n in range(2, max_n + 1):
        res = measure_quantum_volume(n, noise_model, num_trials, shots)
        print(f"  n={n}: HOP={res['mean_hop']:.3f}, "
              f"lower={res['lower_bound']:.3f}, "
              f"{'PASS' if res['passes'] else 'FAIL'}")
        if res["passes"]:
            qv_score = 2 ** n
        else:
            break
    return qv_score

print("Low noise device:")
qv = find_quantum_volume(noise_model_low, max_n=5)
print(f"Quantum Volume: {qv}\n")

print("High noise device:")
qv = find_quantum_volume(noise_model_high, max_n=5)
print(f"Quantum Volume: {qv}")

Interpreting the Results

Device	Two-qubit error	Expected QV
Low noise	0.5%	16 or higher
Medium noise	2%	4 to 8
High noise	6%	1 to 2

Higher two-qubit error rates push the mean heavy output probability toward 0.5 (random guessing). Once it drops below 2/3 with statistical confidence, the device fails at that width and the QV score is capped.

Why Quantum Volume Doesn’t Scale Forever

Quantum Volume has a practical ceiling that has nothing to do with the hardware being tested. Step 2 above requires computing the exact ideal output distribution with a statevector simulator, and that simulation cost grows exponentially with qubit count. In practice, this typically caps everyday QV measurement somewhere in the mid-20s to low-30s in qubit count, since classically simulating a wider circuit becomes the bottleneck rather than the device itself.

This is why, on devices with 50 or more qubits, vendors and researchers tend to lean on benchmarks that scale better, such as randomized benchmarking, layered/cross-entropy benchmarking, or application-specific mirror circuits, rather than trying to push Quantum Volume further. Use Quantum Volume when you want one holistic number to compare small or medium devices across qubit count, connectivity, and compiler quality at once. Reach for a scalable benchmark instead when you need to characterize a device with many more qubits than a statevector simulator can handle, or when you care about performance on one specific algorithm rather than a generic random-circuit workload.

Key Takeaways

QV measures holistic device quality, not just qubit count or peak gate fidelity.
The HOG problem provides a statistically rigorous pass/fail criterion anchored to 2/3 probability.
Increasing circuit width n stresses both gate fidelity and compiler quality simultaneously.
Noise model parameters (especially two-qubit error rate) are the dominant factor in determining QV on near-term hardware.
The same protocol runs on real hardware by replacing AerSimulator with an IBM Quantum backend via qiskit-ibm-runtime.