Circuit and Gate Errors

  • 1

    "Circuit is too deep" / results are garbage on real hardware

    • Qiskit
    • Hardware
    Cause

    The circuit takes longer to execute than the qubits stay coherent. Superconducting qubits decohere on the order of 100 microseconds. A circuit with hundreds of gate layers can take about that long to execute, so the qubit state degrades before measurement completes.
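
    As a rough illustration of the timing argument above, the sketch below multiplies a circuit depth by an assumed per-layer duration and compares the total with a coherence time. The 300 ns layer time, the 100 microsecond coherence time, and the function names are illustrative assumptions, not values read from any backend; real gate durations differ per gate and per device.

```python
def estimated_runtime_us(depth, layer_time_ns=300.0):
    """Rough execution time in microseconds for a circuit of the given depth."""
    return depth * layer_time_ns / 1000.0


def coherence_fraction(depth, coherence_us=100.0, layer_time_ns=300.0):
    """Fraction of the (assumed) coherence time consumed by the circuit."""
    return estimated_runtime_us(depth, layer_time_ns) / coherence_us


for depth in (50, 300, 1000):
    frac = coherence_fraction(depth)
    print(f"depth {depth}: ~{estimated_runtime_us(depth):.0f} us, "
          f"{frac:.0%} of coherence time")
```

    A fraction well below 1 is what you want; anything approaching or exceeding 1 means most shots are dominated by decoherence.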

    Diagnose first

    Check the depth of the circuit after transpilation, not before. The raw circuit depth is meaningless; what runs on the chip is the transpiled version, which can be much deeper due to SWAP insertion.

    from qiskit import transpile
    
    # Check depth before and after transpilation
    print("Original depth:", qc.depth())
    
    transpiled = transpile(qc, backend=backend, optimization_level=1)
    print("Transpiled depth (opt 1):", transpiled.depth())
    
    transpiled_opt = transpile(qc, backend=backend, optimization_level=3)
    print("Transpiled depth (opt 3):", transpiled_opt.depth())

    Fixes

    • Use optimization_level=3 in transpile(). This applies Sabre routing and gate cancellation. Depth reductions of 30-60% are common.
    • Rewrite multi-qubit gates using only native gates (cx, u, ecr depending on backend). Each non-native gate decomposes into several native gates, multiplying depth.
    • Replace CCX (Toffoli) gates with the relative-phase variant (qc.rccx in Qiskit) where phase accuracy is not required. It needs 3 CNOTs instead of the 6 used by the standard decomposition.
    • Split long circuits into subcircuits and use mid-circuit measurement or classical feed-forward where the algorithm allows.
  • 2

    Transpilation fails or produces wrong connectivity

    • Qiskit
    • Hardware
    Cause

    Your circuit applies two-qubit gates between qubits that are not directly connected on the hardware's coupling map. The transpiler must insert SWAP gates to route qubits to adjacent positions, and if it fails or makes poor choices, the result is wrong or excessively deep.

    Diagnose

    from qiskit_ibm_runtime import QiskitRuntimeService
    
    service = QiskitRuntimeService()
    backend = service.backend("ibm_sherbrooke")
    
    # Print the coupling map to see which qubit pairs can interact directly
    print(backend.coupling_map)

    Fix: specify an initial layout that respects connectivity

    from qiskit import transpile
    
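    # Place your logical qubits on physically connected hardware qubits.
    # Check the coupling map first: [0, 1, 2] is only a good choice if
    # (0, 1) and (1, 2) both appear in backend.coupling_map.
    transpiled = transpile(
        qc,
        backend=backend,
        initial_layout=[0, 1, 2],   # maps qc qubit 0 -> hw qubit 0, etc.
        optimization_level=3,
    )
    # Routing SWAPs are decomposed into native gates during transpilation,
    # so compare two-qubit gate counts instead of looking for 'swap'
    print("Two-qubit gate count:", transpiled.num_nonlocal_gates())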

    Alternative: let Sabre routing pick the layout

    With optimization_level=3 and no initial_layout, Qiskit's Sabre layout pass tries multiple random seeds and picks the lowest-SWAP result. This is often better than a manual layout unless you have specific knowledge of qubit quality.
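
    Before committing to a manual layout, the adjacency check described under Diagnose can be sketched without any SDK. The function name and the four-qubit line topology below are hypothetical; with a real backend the edge list would come from its coupling map (for example CouplingMap.get_edges() in Qiskit).

```python
def unroutable_pairs(two_qubit_gates, layout, coupling_edges):
    """Return logical pairs whose physical qubits are not directly coupled."""
    # Treat the coupling map as undirected for this check
    edges = {frozenset(edge) for edge in coupling_edges}
    missing = []
    for a, b in two_qubit_gates:
        physical = frozenset((layout[a], layout[b]))
        if physical not in edges:
            missing.append((a, b))
    return missing


# Hypothetical 4-qubit line: 0 - 1 - 2 - 3
coupling_edges = [(0, 1), (1, 2), (2, 3)]
gates = [(0, 1), (1, 2), (0, 2)]   # logical qubit pairs used by two-qubit gates
layout = [0, 1, 2]                 # logical qubit i -> physical qubit layout[i]

print(unroutable_pairs(gates, layout, coupling_edges))   # [(0, 2)]
```

    Every pair in the returned list forces the transpiler to insert at least one SWAP, so an empty list is the target for a hand-picked layout.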

  • 3

    Measurement results are all zeros (or one repeated value)

    • Qiskit
    • Cirq
    • PennyLane
    • Beginner
    Cause

    The most common cause is forgetting to add measurement gates. Qubits initialise to |0⟩, so if no gates act before the measurement (or nothing is measured at all), every shot reads zero. The second common cause is measuring too early: gates appended after the measurement, for example after a call to measure_all(), do not change the recorded outcome.

    Fix: add measurements after all gates

    from qiskit import QuantumCircuit
    from qiskit_aer import AerSimulator
    
    qc = QuantumCircuit(2, 2)
    qc.h(0)
    qc.cx(0, 1)
    
    # Measurements must come AFTER all gates
    qc.measure([0, 1], [0, 1])
    # OR: qc.measure_all()  -- adds measurements to all qubits
    
    sim = AerSimulator()
    job = sim.run(qc, shots=1024)
    print(job.result().get_counts())

    Check barrier placement

    A qc.barrier() only constrains how the transpiler reorders and schedules gates; it does not cause measurements to run early. But if you called measure_all() and then added more gates, those gates act after the outcome has already been recorded and have no effect on the counts. Always draw the circuit to confirm the order: print(qc.draw()).
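
    The ordering check can also be automated. The sketch below scans a simplified list of (gate name, qubits) tuples and reports gates that touch a qubit after it has been measured. The tuple format and the function name are assumptions made for illustration; in Qiskit a similar list could be built by walking qc.data. Treat it as a heuristic, since deliberate mid-circuit measurement followed by reset and reuse would also be flagged.

```python
def gates_after_measurement(ops):
    """Return (index, name, qubits) for gates applied to an already-measured qubit."""
    measured = set()
    late = []
    for index, (name, qubits) in enumerate(ops):
        if name == "measure":
            measured.update(qubits)
        elif name != "barrier" and measured.intersection(qubits):
            late.append((index, name, qubits))
    return late


ops = [
    ("h", (0,)),
    ("measure", (0,)),
    ("measure", (1,)),
    ("cx", (0, 1)),      # appended after the measurements: cannot affect counts
]
print(gates_after_measurement(ops))   # [(3, 'cx', (0, 1))]
```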

  • 4

    "QubitCount exceeds device maximum"

    • Qiskit
    • Hardware
    • Simulation
    Cause

    Your circuit uses more qubits than the target backend supports. IBM's free-tier systems range from 5 to 127 qubits. Attempting to run a 40-qubit circuit on a 27-qubit system will fail at submission.

    Fixes

    • For simulation: AerSimulator handles roughly 30 qubits with the statevector method, memory permitting, and significantly more with the matrix product state method (see Problem 10).
    • For hardware: choose a backend with enough qubits, or reduce your algorithm's qubit count by reusing qubits (reset and reuse mid-circuit with qc.reset()).
    from qiskit_aer import AerSimulator
    
    # Statevector handles ~30 qubits; MPS handles wider circuits
    sim = AerSimulator(method='statevector')  # up to ~30 qubits
    sim_mps = AerSimulator(method='matrix_product_state')  # handles more
    
    # Check a backend's qubit count before submitting
    from qiskit_ibm_runtime import QiskitRuntimeService
    service = QiskitRuntimeService()
    for b in service.backends():
        print(b.name, b.num_qubits)
  • 5

    Gate not supported on target backend

    • Qiskit
    • Hardware
    Cause

    Hardware backends only implement a small native gate set. IBM systems typically support cx (or ecr), id, rz, sx, and x. Using higher-level gates like CCX (Toffoli), SWAP, or custom unitaries requires decomposition.

    Fix: transpile with explicit basis_gates

    from qiskit import transpile, QuantumCircuit
    
    qc = QuantumCircuit(3)
    qc.ccx(0, 1, 2)   # Toffoli: not native on most superconducting hardware
    
    # transpile() decomposes CCX into cx + u gates automatically
    transpiled = transpile(
        qc,
        basis_gates=['cx', 'u'],
        optimization_level=2,
    )
    print(transpiled.draw())
    print("Gate counts:", transpiled.count_ops())

    To see what gates a specific backend supports natively:

    print(backend.basis_gates)
    # e.g. ['cx', 'id', 'rz', 'sx', 'x']

Noisy Results and Accuracy

  • 6

    Results look random / no clear answer from hardware

    • Qiskit
    • Hardware
    • Noise
    Cause

    Either the algorithm is being overwhelmed by gate errors and decoherence, the shot count is too low to see a clear signal, or the algorithm itself is not producing a peaked distribution (a design issue).

    Diagnose: run on a noiseless simulator first

    from qiskit_aer import AerSimulator
    
    # Step 1: confirm the algorithm works without noise
    sim = AerSimulator(method='statevector')
    job = sim.run(qc, shots=4096)
    counts = job.result().get_counts()
    print("Noiseless result:", counts)
    
    # Step 2: add a noise model from real hardware to test noise impact
    from qiskit_aer.noise import NoiseModel
    from qiskit_ibm_runtime import QiskitRuntimeService
    
    service = QiskitRuntimeService()
    backend = service.backend("ibm_nairobi")
    noise_model = NoiseModel.from_backend(backend)
    
    noisy_sim = AerSimulator(noise_model=noise_model)
    noisy_job = noisy_sim.run(qc, shots=4096)
    print("Noisy result:", noisy_job.result().get_counts())

    Fixes

    • If the noiseless simulator is correct and hardware is not: reduce circuit depth, apply error mitigation such as zero-noise extrapolation (ZNE) or M3 readout mitigation, or use a higher-fidelity backend.
    • If both are wrong: the algorithm has a design flaw. Verify with a small test case on the statevector simulator.
    • Increase shots to at least 4096 before concluding that results are random. Low shot counts produce high variance even on correct circuits.
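
    One way to put a number on "looks random" is to compare the most frequent outcome with what a uniform distribution would give at the same shot count. The sketch below is a simple heuristic with hypothetical counts, and it assumes the keys are plain bitstrings of equal length.

```python
import math


def top_outcome_zscore(counts):
    """z-score of the most frequent bitstring against a uniform distribution."""
    shots = sum(counts.values())
    num_bits = len(next(iter(counts)))
    p_uniform = 1.0 / 2 ** num_bits
    expected = shots * p_uniform
    std = math.sqrt(shots * p_uniform * (1.0 - p_uniform))
    return (max(counts.values()) - expected) / std


# Hypothetical 1024-shot results for a 2-qubit circuit
flat = {"00": 262, "01": 249, "10": 259, "11": 254}
peaked = {"00": 498, "01": 21, "10": 17, "11": 488}

print(f"flat:   z = {top_outcome_zscore(flat):.1f}")
print(f"peaked: z = {top_outcome_zscore(peaked):.1f}")
```

    A z-score of a few units is ordinary shot noise; a value in the tens means the distribution is clearly peaked. If it rises steadily as you add shots, the signal is real but was under-sampled.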
  • 7

    VQE / QAOA not converging

    • Qiskit
    • PennyLane
    • Variational
    Cause

    Several independent problems can cause this. The most common are: barren plateaus (gradients vanish in deep circuits), a poor initial parameter choice, an ansatz that cannot express the target state, too few layers, or using a gradient-based optimizer on hardware (where shot noise corrupts gradient estimates).

    Diagnose: plot cost vs iteration

    import numpy as np
    import matplotlib.pyplot as plt
    from qiskit.primitives import StatevectorEstimator
    from qiskit_algorithms import VQE
    from qiskit_algorithms.optimizers import SPSA
    
    costs = []
    
    # ansatz and hamiltonian are assumed to be defined earlier in your script
    def callback(nfev, params, energy, metadata):
        costs.append(energy)
    
    vqe = VQE(
        estimator=StatevectorEstimator(),
        ansatz=ansatz,
        optimizer=SPSA(maxiter=200),
        callback=callback,
        initial_point=np.random.uniform(-np.pi, np.pi, ansatz.num_parameters),
    )
    result = vqe.compute_minimum_eigenvalue(hamiltonian)
    
    plt.plot(costs)
    plt.xlabel("Iteration")
    plt.ylabel("Energy")
    plt.title("VQE convergence")
    plt.show()

    Fixes

    • Barren plateaus: reduce circuit depth, switch to local cost functions, use layerwise training (freeze earlier layers while training new ones).
    • Bad ansatz: for chemistry problems, use UCCSD (Unitary Coupled Cluster Singles and Doubles) rather than a hardware-efficient ansatz. The expressibility needs to match the problem.
    • Hardware optimizers: use SPSA instead of gradient descent on real hardware. SPSA estimates gradients with only two circuit evaluations regardless of parameter count, and tolerates shot noise better than finite-difference methods.
    • Initial parameters: try the INTERP strategy: solve a simpler version of the problem first (e.g. fewer QAOA layers) and use those optimal parameters as the starting point for the harder version.
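
    The two-evaluation property of SPSA mentioned above is easy to see in a plain-Python sketch. This is a bare gradient estimator, not the qiskit_algorithms implementation. The toy cost function stands in for an energy evaluation, and the perturbation size c=0.1 is an arbitrary assumption.

```python
import random


def spsa_gradient(cost, params, c=0.1, rng=None):
    """Estimate the gradient of cost at params from two evaluations."""
    rng = rng or random.Random()
    # One random +/-1 perturbation direction shared by both evaluations
    delta = [rng.choice((-1.0, 1.0)) for _ in params]
    plus = [p + c * d for p, d in zip(params, delta)]
    minus = [p - c * d for p, d in zip(params, delta)]
    diff = cost(plus) - cost(minus)
    return [diff / (2.0 * c * d) for d in delta]


calls = 0


def toy_cost(x):
    """Stand-in for an energy evaluation; counts how often it is called."""
    global calls
    calls += 1
    return sum(v * v for v in x)


grad = spsa_gradient(toy_cost, [0.3] * 20, rng=random.Random(7))
print(f"{len(grad)} gradient entries from {calls} cost evaluations")
# 20 gradient entries from 2 cost evaluations
```

    A finite-difference gradient over the same 20 parameters would need 40 evaluations, each with its own shot noise, which is why SPSA is the usual choice on hardware.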
  • 8

    Expectation values are off by a constant factor

    • Qiskit
    • PennyLane
    • Variational
    Cause

    The Hamiltonian is not correctly normalised, observable coefficients have wrong signs, or the Pauli decomposition was computed incorrectly. This produces results that are consistently scaled or shifted from the true value.

    Fix: verify the Pauli decomposition

    import numpy as np
    from qiskit.quantum_info import SparsePauliOp
    
    # If you built the Hamiltonian manually, verify it matches the matrix form
    H_matrix = np.array([[-1, 0], [0, 1]], dtype=complex)
    
    H_pauli = SparsePauliOp.from_operator(H_matrix)
    print("Pauli decomposition:", H_pauli)
    # Should give: -1.0 * Z  ->  SparsePauliOp(['Z'], coeffs=[-1.+0.j])
    
    # Check that reconstructing the matrix matches the original
    H_reconstructed = H_pauli.to_matrix()
    print("Max error:", np.max(np.abs(H_matrix - H_reconstructed)))

    Also check the sign convention for the Hamiltonian in the algorithm. VQE minimises energy, so it expects H to be defined such that the ground state has the lowest eigenvalue. If your H has the wrong sign, VQE will converge to the wrong state.
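
    For a single-qubit Hamiltonian the decomposition can be cross-checked by hand, since the coefficient of each Pauli P is Tr(P H) / 2. The sketch below does this in plain Python for a hypothetical example matrix. The function name is illustrative and the code handles one qubit only.

```python
PAULIS = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}


def pauli_coefficients(h):
    """Coefficients c_P such that h = sum_P c_P * P for a 2x2 matrix h."""
    coeffs = {}
    for label, p in PAULIS.items():
        # Tr(P H) = sum over i, j of P[i][j] * H[j][i]
        trace = sum(p[i][j] * h[j][i] for i in range(2) for j in range(2))
        coeffs[label] = trace / 2
    return coeffs


h = [[1.0, 0.5], [0.5, -1.0]]   # hypothetical example: Z + 0.5 X
coeffs = pauli_coefficients(h)
print({label: c for label, c in coeffs.items() if abs(c) > 1e-12})
# {'X': 0.5, 'Z': 1.0}
```

    A non-zero I coefficient shifts every eigenvalue by that amount without changing the eigenstates, which is the usual source of a constant offset. A uniform rescaling of all coefficients indicates a normalisation mismatch.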

  • 9

    How many shots do I actually need?

    • Simulation
    • Hardware
    Cause

    Statistical error in measurement outcomes scales as 1/√(shots). Doubling accuracy requires quadrupling shots. Using too few shots is one of the most common reasons results look noisy even on a simulator.

    Target accuracy | Minimum shots | Typical use case
    10%             | 100           | Quick sanity check
    3%              | 1,024         | Prototyping, debugging
    1%              | 10,000        | Algorithm validation
    0.3%            | 100,000       | Publication-quality on simulator
    0.1%            | 1,000,000     | Statevector simulator only (free)

    Practical advice: use 1024 shots for prototyping and 8192 for results you plan to report. When shot noise is not what you are studying, skip sampling altogether: with AerSimulator(method='statevector') you can save the final statevector (qc.save_statevector()) and read exact probabilities from it, which is often cheaper than drawing 1,000,000 shots.
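
    The 1/√(shots) rule can be turned into a small calculator. The sketch below uses that rule of thumb directly. Function names are illustrative, and the rule is conservative because the standard error of an estimated probability p is √(p(1-p)/shots), which is at most 0.5/√(shots).

```python
import math


def standard_error(p, shots):
    """Standard error of an estimated outcome probability p."""
    return math.sqrt(p * (1.0 - p) / shots)


def shots_for_error(target_error):
    """Rule-of-thumb shot count for a target error, using error ~ 1/sqrt(shots)."""
    # round() guards against floating-point noise before taking the ceiling
    return math.ceil(round(1.0 / target_error ** 2, 6))


for target in (0.1, 0.03, 0.01):
    print(f"target {target:.0%}: about {shots_for_error(target):,} shots")
print(f"worst-case standard error at 1024 shots: {standard_error(0.5, 1024):.4f}")
```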

Simulation Performance

  • 10

    Simulation is too slow

    • Qiskit
    • Simulation
    Cause

    Statevector simulation stores 2^n complex amplitudes for n qubits. At 28 qubits that is 268 million numbers; at 30 qubits it is over a billion. Memory and runtime both double with every added qubit, so at some point the simulation simply stalls.

    Fix: use the right simulator method for your circuit

    from qiskit_aer import AerSimulator
    
    # Default statevector: exact, but limited to ~30 qubits
    sim_sv = AerSimulator(method='statevector')
    
    # Matrix product state: efficient for low-entanglement circuits
    # Handles 100+ qubits when entanglement is local
    sim_mps = AerSimulator(method='matrix_product_state')
    
    # Stabilizer: polynomial-time simulation of Clifford-only circuits (H, S, CNOT, measure)
    sim_clifford = AerSimulator(method='stabilizer')
    
    # GPU-accelerated statevector: requires the qiskit-aer-gpu package and a CUDA GPU
    # pip install qiskit-aer-gpu
    sim_gpu = AerSimulator(method='statevector', device='GPU')

    Which method to choose

    • MPS: best for circuits where qubits interact only with nearby qubits (e.g. 1D variational circuits, QAOA on ring graphs). Accuracy degrades for highly entangled states.
    • Stabilizer: only works for Clifford circuits (no T gates, no arbitrary rotations), but is exact and runs in polynomial time.
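    • GPU: best if you have a CUDA GPU and need statevector accuracy on 25-32 qubits. A 30-qubit double-precision statevector is exactly 16 GB, so a GPU with 16 GB VRAM tops out at about 29 qubits in double precision, or 30 in single precision.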
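
    The guidance above can be condensed into a small decision helper. This is only a sketch of the reasoning in this section: the 25-qubit thresholds, the function names, and the Clifford gate list are assumptions chosen for illustration, and the returned strings are labels rather than arguments to pass to AerSimulator verbatim.

```python
CLIFFORD_GATES = {"id", "x", "y", "z", "h", "s", "sdg", "cx", "cy", "cz", "swap"}
NON_UNITARY = {"measure", "barrier", "reset"}


def is_clifford_only(gate_names):
    """True if every operation is a Clifford gate, measurement, reset or barrier."""
    return all(name in CLIFFORD_GATES or name in NON_UNITARY for name in gate_names)


def suggest_method(gate_names, num_qubits, local_entanglement=False, has_gpu=False):
    """Suggest a simulation method following the rules of thumb in this section."""
    if is_clifford_only(gate_names):
        return "stabilizer"
    if local_entanglement and num_qubits > 25:
        return "matrix_product_state"
    if has_gpu and num_qubits >= 25:
        return "statevector on GPU"
    return "statevector"


print(suggest_method(["h", "cx", "measure"], 50))
print(suggest_method(["ry", "cx", "measure"], 60, local_entanglement=True))
print(suggest_method(["ry", "cx", "t"], 28, has_gpu=True))
```

    The gate names could be taken from the keys of count_ops() on a transpiled circuit. Whether entanglement is "local" is a judgement about the algorithm that the helper cannot make for you.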
  • 11

    Memory error during simulation

    • Qiskit
    • Simulation
    Cause

    Statevector simulation of n qubits requires 2^n complex128 values at 16 bytes each. At 30 qubits that is 16 GB of RAM; at 32 qubits it is 64 GB. Most laptops run out of memory somewhere between 28 and 30 qubits.

    Qubits | Statevector RAM | Feasibility
    20     | 16 MB           | Fine on any machine
    25     | 512 MB          | Fine on any machine
    28     | 4 GB            | OK on most laptops
    30     | 16 GB           | Needs a workstation
    32     | 64 GB           | Server only
    34     | 256 GB          | AWS SV1 cloud simulator
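
    The figures in the table follow from 16 bytes per complex128 amplitude times 2^n amplitudes. A short sketch, with illustrative function names, reproduces them and inverts the relation to estimate the largest statevector a given amount of RAM can hold. It ignores simulator overhead, so treat the result as an upper bound.

```python
def statevector_bytes(num_qubits, bytes_per_amplitude=16):
    """Memory needed to hold a full statevector of complex128 amplitudes."""
    return bytes_per_amplitude * 2 ** num_qubits


def max_qubits_for_ram(ram_bytes, bytes_per_amplitude=16):
    """Largest qubit count whose statevector fits in ram_bytes (no overhead)."""
    amplitudes = ram_bytes // bytes_per_amplitude
    # floor(log2(amplitudes)) via the bit length of the integer
    return amplitudes.bit_length() - 1 if amplitudes > 0 else 0


for n in (20, 25, 28, 30, 32, 34):
    gib = statevector_bytes(n) / 2 ** 30
    print(f"{n} qubits: {gib:g} GiB")

print("16 GiB of RAM fits at most", max_qubits_for_ram(16 * 2 ** 30), "qubits")
```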

    Fixes

    • Switch to method='matrix_product_state' if your circuit has limited entanglement (see Problem 10).
    • Use AWS Braket SV1 (supports up to 34 qubits) or the local MPS simulator for wider circuits.
    • Restructure the algorithm to use fewer qubits. Many algorithms can be run in segments, reusing qubits via mid-circuit reset: qc.reset(qubit).
  • 12

    Circuit serialisation / pickling errors

    • Qiskit
    • Simulation
    Cause

    ParameterVector objects and custom gate classes do not always serialise cleanly with Python's pickle. This surfaces when caching circuits to disk, passing them between processes, or submitting to cloud job queues.

    Fix: assign parameters before serialising, or use OpenQASM

    import pickle
    
    from qiskit import QuantumCircuit, qasm3
    from qiskit.circuit import ParameterVector
    
    # Parameterised circuit
    params = ParameterVector('theta', 3)
    qc = QuantumCircuit(3)
    qc.ry(params[0], 0)
    qc.ry(params[1], 1)
    qc.ry(params[2], 2)
    
    # Option 1: assign parameters before pickling
    bound_qc = qc.assign_parameters({params[0]: 1.2, params[1]: 0.5, params[2]: 0.9})
    data = pickle.dumps(bound_qc)   # now safe to serialise
    
    # Option 2: serialise as OpenQASM 3 (a portable text format)
    qasm_str = qasm3.dumps(qc)
    with open('circuit.qasm', 'w') as f:
        f.write(qasm_str)
    
    # Load it back (qasm3.loads needs the optional qiskit-qasm3-import package)
    qc_loaded = qasm3.loads(qasm_str)

Hardware Access and Jobs

  • 13

    IBM job stuck in queue for hours

    • Qiskit
    • Hardware
    Cause

    IBM Quantum's free open tier is heavily used. The most popular systems (Eagle 127-qubit, Heron r2 156-qubit) can have queues of 50-100 jobs. Free-tier jobs also have lower priority than paid plans.

    Fix: find the least-busy backend

    from qiskit_ibm_runtime import QiskitRuntimeService
    
    service = QiskitRuntimeService()
    
    # Get all operational backends and sort by pending jobs
    backends = service.backends(operational=True, simulator=False)
    ranked = sorted(backends, key=lambda b: b.status().pending_jobs)
    
    for b in ranked[:5]:
        status = b.status()
        print(f"{b.name}: {status.pending_jobs} pending jobs, {b.num_qubits} qubits")

    Other strategies

    • Run on the local AerSimulator while your hardware job queues. For most debugging purposes the simulator result is sufficient.
    • Submit jobs during off-peak hours (late night UTC) when queues are shorter.
    • Use a different IBM Quantum region if available on your plan (US, EU, AP regions have separate queues).
  • 14

    Job failed with "too many gates" on hardware

    • Qiskit
    • Hardware
    Cause

    Hardware backends impose per-job gate count limits that vary by provider and plan. IBM's open plan has lower limits than premium plans. Submitting very deep circuits or batch circuits that concatenate many experiments into one job can trigger this error.

    Fixes

    • Reduce circuit depth with optimization_level=3 during transpilation (see Problem 1).
    • Split a batch of circuits into smaller submission chunks.
    • Check the gate count before submitting: transpiled.count_ops() shows gate counts by type.
    from qiskit import transpile
    
    transpiled = transpile(qc, backend=backend, optimization_level=3)
    
    # Inspect total gate count
    ops = transpiled.count_ops()
    total_gates = sum(ops.values())
    print(f"Total gates: {total_gates}")
    # The native two-qubit gate is cx, ecr or cz depending on the backend
    print(f"Two-qubit gates: {ops.get('cx', 0) + ops.get('ecr', 0) + ops.get('cz', 0)}")
    
    # Check circuit depth
    print(f"Circuit depth: {transpiled.depth()}")
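
    Splitting a batch into smaller submissions needs nothing more than a slicing helper. In the sketch below the strings are placeholders for transpiled circuits, and the chunk size of 3 is arbitrary; the appropriate limit depends on your provider and plan.

```python
def chunked(items, chunk_size):
    """Yield consecutive slices of items with at most chunk_size elements each."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


circuits = [f"circuit_{i}" for i in range(7)]   # placeholders for real circuits
for batch in chunked(circuits, 3):
    # Each batch would be submitted as its own job
    print(len(batch), batch)
```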
  • 15

    IonQ / Quantinuum results look wrong

    • Qiskit
    • Hardware
    Cause

    IonQ and Quantinuum use different native gate sets from IBM. IonQ's native gates are single-qubit rotations and the two-qubit XX interaction. Quantinuum uses the ZZMax and PhasedX gates. Applying a transpiler optimised for IBM's cx + u basis to these backends produces incorrect or inefficient circuits.

    Fix: use provider-recommended transpilation

    • Quantinuum: use pytket-quantinuum which applies Quantinuum-specific optimisation passes and compiles to native ZZMax gates.
    • IonQ: use the Amazon Braket SDK or Azure Quantum SDK, which handle IonQ transpilation to the native GPi, GPi2, and MS (Mølmer-Sørensen) gates.
    • Check published two-qubit gate fidelity for the specific system you are targeting before assuming results are wrong. Quantinuum H2 achieves ~99.9% but IonQ's Aria is closer to 97-99% for typical circuits.

PennyLane-specific

  • 16

    Barren plateau: gradients are all zero

    • PennyLane
    • Variational
    Cause

    Deep wide circuits with randomly initialised parameters have exponentially vanishing gradients. The cost function landscape is essentially flat everywhere, so gradient descent makes no progress. This is the barren plateau problem and it is not a bug in your code; it is a fundamental property of highly entangled parameterised circuits.

    Diagnose: check gradient variance across random initialisations

    import pennylane as qml
    import numpy as np
    
    dev = qml.device("default.qubit", wires=6)
    
    @qml.qnode(dev)
    def circuit(params):
        for i in range(6):
            qml.RY(params[i], wires=i)
        for i in range(5):
            qml.CNOT(wires=[i, i + 1])
        for i in range(6):
            qml.RY(params[i + 6], wires=i)
        return qml.expval(qml.PauliZ(0))
    
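    # Estimate the variance of each gradient component over random parameter sets
    grad_fn = qml.grad(circuit)
    grads = []
    for _ in range(100):
        # qml.grad needs a PennyLane tensor marked with requires_grad=True
        params = qml.numpy.array(np.random.uniform(0, 2 * np.pi, 12), requires_grad=True)
        grads.append(grad_fn(params))
    
    variances = np.var(np.array(grads), axis=0)   # one variance per parameter
    print(f"Mean gradient variance: {np.mean(variances):.2e}")
    # Rule of thumb: below 1e-4 suggests a barren plateau. The clearer sign is
    # this number shrinking rapidly as you add qubits or layers.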

    Fixes

    • Layerwise training: train one layer at a time, freezing previously trained layers. This avoids initialising all parameters randomly at the same time.
    • Local cost functions: instead of measuring a global observable (sum over all qubits), use a local observable (single qubit) which has gradients that vanish only polynomially, not exponentially.
    • Reduce depth: barren plateaus worsen with circuit depth. Fewer layers with higher expressibility per layer is better than many shallow layers.
    • Identity initialisation: initialise parameters so the circuit is close to the identity at the start. Gradients are larger near the identity.

    See also: PennyLane noise mitigation tutorial for related techniques.

  • 17

    QNode incompatible with NumPy arrays

    • PennyLane
    • Autograd
    Cause

    PennyLane uses its own differentiable tensor types. Mixing standard numpy arrays with PennyLane's autograd inside the same computation graph breaks gradient tracking and often raises a TypeError or silently returns zero gradients.

    Fix: use qml.numpy or a framework tensor consistently

    import pennylane as qml
    import pennylane.numpy as pnp   # use this, not plain numpy
    import numpy as np
    
    dev = qml.device("default.qubit", wires=2)
    
    @qml.qnode(dev, interface="autograd")
    def circuit(params):
        qml.RY(params[0], wires=0)
        qml.RY(params[1], wires=1)
        qml.CNOT(wires=[0, 1])
        return qml.expval(qml.PauliZ(0))
    
    # WRONG: plain numpy array, gradients will not work
    # params = np.array([0.5, 1.2])
    
    # CORRECT: use pnp (pennylane numpy) and mark as requiring grad
    params = pnp.array([0.5, 1.2], requires_grad=True)
    
    grad_fn = qml.grad(circuit)
    print("Gradient:", grad_fn(params))

    For PyTorch users (the same rule applies to JAX)

    import torch
    
    @qml.qnode(dev, interface="torch")
    def circuit_torch(params):
        qml.RY(params[0], wires=0)
        qml.RY(params[1], wires=1)
        return qml.expval(qml.PauliZ(0))
    
    # Use torch tensors throughout; do NOT mix with numpy in the same graph
    params_torch = torch.tensor([0.5, 1.2], requires_grad=True)
    result = circuit_torch(params_torch)
    result.backward()
    print("Gradient:", params_torch.grad)

Quick Reference

# | Problem | Likely cause | Quick fix | Framework
1 | Results are garbage on hardware | Circuit too deep, decoherence | optimization_level=3 | Qiskit
2 | Transpilation fails / bad connectivity | Qubits not adjacent on coupling map | Set initial_layout | Qiskit
3 | All measurements are zero | Missing measurement gates | qc.measure_all() after gates | Any
4 | Too many qubits for device | Circuit exceeds backend qubit count | Use AerSimulator or MPS method | Qiskit
5 | Gate not supported on backend | Non-native gate (e.g. CCX) | transpile(basis_gates=['cx','u']) | Qiskit
6 | Results look random on hardware | Noise, too few shots | Test on noiseless sim first; use noise model | Qiskit
7 | VQE / QAOA not converging | Barren plateau, bad ansatz, wrong optimizer | SPSA optimizer, UCCSD ansatz, INTERP init | Qiskit, PennyLane
8 | Expectation value off by constant | Hamiltonian sign or normalisation error | Verify Pauli decomposition against matrix | Qiskit, PennyLane
9 | Results too noisy (shots) | Insufficient shots | 8192+ shots; use statevector for exact results | Any
10 | Simulation too slow | Statevector scales as 2^n | MPS method or GPU-accelerated Aer | Qiskit / Aer
11 | Memory error during simulation | 30+ qubits exceeds RAM | MPS simulator or AWS SV1 | Qiskit / Aer
12 | Circuit serialisation / pickle error | ParameterVector doesn't pickle cleanly | assign_parameters() or qasm3.dumps() | Qiskit
13 | IBM job stuck in queue | High demand on free tier | Sort backends by pending_jobs | Qiskit
14 | Job failed: too many gates | Per-job gate limit exceeded | Reduce depth; split circuit into chunks | Qiskit
15 | IonQ / Quantinuum results wrong | Wrong transpiler for native gate set | Use pytket / Braket SDK for native compilation | pytket, Braket
16 | Barren plateau: zero gradients | Deep wide circuit, random init | Layerwise training, local cost function | PennyLane
17 | QNode incompatible with numpy | Mixed tensor types in computation graph | Use qml.numpy or torch tensors consistently | PennyLane