Circuit Transpilation in Qiskit

When you write a quantum circuit in Qiskit, you write it in the abstract: any gate on any qubit, with no concern for hardware connectivity or native gate sets. The transpiler bridges that gap. It takes your ideal circuit and produces one that a specific backend can actually run: mapping virtual qubits to physical qubits, inserting SWAP gates to route around connectivity constraints, decomposing gates into the device’s native set, and optimizing to reduce gate count.

Transpilation is where most circuits get significantly worse, or significantly better, depending on how you use it.

Installation

pip install qiskit qiskit-aer qiskit-ibm-runtime

Why Transpilation Is Necessary

Suppose you write a three-qubit circuit with a Toffoli gate between qubits 0, 1, and 2. On a real IBM device:

Qubits 0, 1, 2 may not all be physically connected to each other.
The device has no native Toffoli gate; it uses a small native set such as {CZ, RZ, SX, X} on Heron processors.
Qubit 0 on the device may have better coherence properties than qubit 2, so the mapping matters.

The transpiler resolves all three issues before the circuit reaches the device.

from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator
from qiskit.circuit.library import QFT

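# Build an abstract 4-qubit QFT circuit
qc = QFT(4, do_swaps=True)
# The library circuit wraps its contents in a single "QFT" instruction,
# so expand one level to see the actual gates.
expanded = qc.decompose()
print("Before transpilation:")
print(f"  Gates: {dict(expanded.count_ops())}")
print(f"  Depth: {expanded.depth()}")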

The Four Transpilation Stages

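Qiskit’s preset transpiler pipelines do their main work in four sequential stages (an init stage before them and an optional scheduling stage after them are not covered here):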

1. Initial Layout

Maps virtual qubits (your circuit’s qubits) to physical qubits (numbered positions on the device). A good layout minimizes the number of SWAP gates needed in the next stage.

Common layout strategies:

TrivialLayout: virtual qubit 0 -> physical qubit 0, etc. Fast, rarely optimal.
VF2Layout: searches for a perfect (zero-SWAP) layout via subgraph isomorphism. Tried first at optimization_level >= 1.
SabreLayout: uses the SABRE heuristic to co-optimize layout and routing simultaneously. The fallback workhorse at optimization_level >= 1, with more trials at higher levels.
DenseLayout: finds the most connected subgraph that fits your circuit. No longer a preset default, but available via layout_method="dense".
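
To make the goal of the layout stage concrete, here is a minimal pure-Python sketch. It is not how VF2Layout or SabreLayout are implemented; it brute-forces every assignment on a made-up 5-qubit device (the edge list and the circuit's interaction pairs are assumptions for illustration) and counts how many two-qubit gates would land on unconnected qubits.

```python
from itertools import permutations

# Hypothetical 5-qubit device with a T-shaped coupling graph (undirected edges).
device_edges = {(0, 1), (1, 2), (1, 3), (3, 4)}
# Pairs of virtual qubits that share a two-qubit gate in the circuit.
circuit_pairs = [(0, 1), (1, 2), (2, 3)]
num_virtual = 4


def connected(a, b):
    """True if physical qubits a and b share an edge."""
    return (a, b) in device_edges or (b, a) in device_edges


def non_adjacent_gates(layout):
    """Count two-qubit gates whose qubits are not neighbours under `layout`.

    layout[v] is the physical qubit assigned to virtual qubit v.
    """
    return sum(not connected(layout[u], layout[v]) for u, v in circuit_pairs)


trivial = tuple(range(num_virtual))
# Exhaustive search; only feasible for toy sizes.
best = min(permutations(range(5), num_virtual), key=non_adjacent_gates)

print("trivial layout:", trivial, "-> non-adjacent gates:", non_adjacent_gates(trivial))
print("best layout:   ", best, "-> non-adjacent gates:", non_adjacent_gates(best))
```

Every gate counted as non-adjacent would need at least one SWAP in the routing stage, which is why a layout that scores zero here is worth searching for.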

2. Routing

Ensures every two-qubit gate acts on physically connected qubits. If two qubits that need to interact are not adjacent, the router inserts SWAP gates to move quantum information to neighboring qubits.

SWAPs are expensive: on IBM Eagle devices, the native two-qubit gate (ECR; the older Falcon devices used CX) takes a few hundred nanoseconds with error rates around 0.5-1% (Heron CZ gates are faster and closer to 0.3%, but still the dominant error source). Each extra SWAP adds 3 two-qubit gates.
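
The cost of a SWAP can be checked with a small pure-Python sketch. The first part confirms that three CX gates implement a SWAP by tracking computational basis states. The second part is a crude estimate that assumes independent gate errors and a hypothetical 1% error per two-qubit gate; real devices will deviate from this model.

```python
def apply_cx(bits, control, target):
    """Apply a CX to a classical bit-string (a tuple of 0/1 values)."""
    out = list(bits)
    out[target] ^= out[control]
    return tuple(out)


def swap_via_three_cx(bits, a, b):
    """SWAP(a, b) written as CX(a,b) CX(b,a) CX(a,b)."""
    for control, target in [(a, b), (b, a), (a, b)]:
        bits = apply_cx(bits, control, target)
    return bits


# CX gates are permutations of the computational basis, so checking all four
# basis states is enough to confirm the three-CX sequence equals SWAP.
basis_states = [(x, y) for x in (0, 1) for y in (0, 1)]
swap_ok = all(swap_via_three_cx(s, 0, 1) == (s[1], s[0]) for s in basis_states)
print("three CX gates implement SWAP:", swap_ok)


def success_estimate(two_qubit_gates, error_per_gate):
    """Crude estimate: probability that no two-qubit gate fails."""
    return (1 - error_per_gate) ** two_qubit_gates


base_gates = 10  # hypothetical two-qubit gate count before routing
for swaps in (0, 2, 5):
    total = base_gates + 3 * swaps
    print(f"{swaps} SWAPs -> {total} two-qubit gates, "
          f"estimate {success_estimate(total, 0.01):.3f}")
```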

3. Translation (Basis Decomposition)

Converts every gate in the circuit into the device’s native gate set. IBM’s Eagle processors use {ECR, RZ, SX, X} (the older Falcon generation used CX in place of ECR); the Heron generation that now makes up most of the fleet uses {CZ, RZ, SX, X}. A Hadamard becomes RZ(pi/2) * SX * RZ(pi/2), up to a global phase. A Toffoli becomes 6 two-qubit gates plus single-qubit rotations.
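
The Hadamard identity can be verified numerically without any quantum library. The sketch below multiplies the 2x2 matrices using the standard definitions RZ(theta) = diag(exp(-i*theta/2), exp(i*theta/2)) and SX = sqrt(X), then checks that the product matches H once a global phase is factored out.

```python
import cmath
import math


def matmul(a, b):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def rz(theta):
    """RZ(theta) = diag(exp(-i*theta/2), exp(+i*theta/2))."""
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]


SX = [[(1 + 1j) / 2, (1 - 1j) / 2],
      [(1 - 1j) / 2, (1 + 1j) / 2]]
H = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]

product = matmul(rz(math.pi / 2), matmul(SX, rz(math.pi / 2)))

# Global phase relating the two matrices (read off the top-left entry).
phase = product[0][0] / H[0][0]
matches = all(
    abs(product[i][j] - phase * H[i][j]) < 1e-12
    for i in range(2) for j in range(2)
)
print("RZ(pi/2) SX RZ(pi/2) equals H up to global phase:", matches)
print("global phase magnitude:", round(abs(phase), 12))
```

A global phase has no observable effect, which is why the transpiler is free to treat the two as interchangeable.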

4. Optimization

Applies peephole optimizations and algebraic rewrites to reduce gate count and depth. Cancels inverse gate pairs, merges adjacent single-qubit rotations into one, and removes identity operations.

Basic Usage

from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator
from qiskit.circuit.library import QFT

backend = AerSimulator()

qc = QFT(5, do_swaps=True)
qc.measure_all()

# Transpile at optimization level 2 (recommended default)
t_qc = transpile(qc, backend, optimization_level=2, seed_transpiler=42)

print("After transpilation:")
print(f"  Gates: {t_qc.count_ops()}")
print(f"  Depth: {t_qc.depth()}")
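# AerSimulator accepts gates such as cp and swap directly, so count all
# two-qubit gates rather than CX alone.
print(f"  Two-qubit gates: {t_qc.num_nonlocal_gates()}")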

Optimization Levels 0-3

The optimization_level parameter is the highest-impact knob you have:

Level	Layout	Routing	Optimization passes	Use when
0	Trivial	Sabre	None	Debugging layout issues
1	VF2 -> Sabre	Sabre	Light (1-2Q merge)	Fast iteration
2	VF2 -> Sabre (more trials)	Sabre	Medium	Default for most work
3	VF2 -> Sabre (most trials)	Sabre	Heavy (Synthesis)	Final runs, deep circuits

from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator

backend = AerSimulator()
qc = QuantumCircuit(4)
qc.h(0)
qc.cx(0, 1); qc.cx(1, 2); qc.cx(2, 3)
qc.ry(0.5, 1); qc.rz(1.2, 2)
qc.measure_all()

for level in range(4):
    t = transpile(qc, backend, optimization_level=level, seed_transpiler=0)
    print(f"Level {level}: depth={t.depth():3d}, gates={sum(t.count_ops().values()):3d}")

Inspecting the Transpiled Circuit

Always check what the transpiler produced:

from qiskit import QuantumCircuit, transpile

qc = QuantumCircuit(3)
qc.ccx(0, 1, 2)   # Toffoli
qc.measure_all()

# AerSimulator supports ccx natively and would leave it as a single gate,
# so target an explicit hardware-style basis to expose the decomposition.
t_qc = transpile(qc, basis_gates=["cx", "rz", "sx", "x"], optimization_level=2)

print("Gate counts:", dict(t_qc.count_ops()))
print("Circuit depth:", t_qc.depth())
print(t_qc.draw(output="text", fold=100))

With this basis, the Toffoli expands to 6 CX gates plus single-qubit gates. Seeing this helps you understand the true cost of high-level gates.

Custom Initial Layout

If you know which physical qubits have the best error rates or connectivity for your circuit, pin the mapping:

from qiskit import QuantumCircuit, transpile
from qiskit_ibm_runtime.fake_provider import FakeBrisbane

backend = FakeBrisbane()   # Fake backend with real connectivity/noise model

qc = QuantumCircuit(3)
qc.h(0); qc.cx(0, 1); qc.cx(1, 2)
qc.measure_all()

# Default: let the transpiler choose
t_default = transpile(qc, backend, optimization_level=2, seed_transpiler=0)

# Custom: pin virtual qubit 0 -> physical 0, virt 1 -> phys 1, virt 2 -> phys 2
t_custom = transpile(
    qc, backend,
    optimization_level=2,
    initial_layout=[0, 1, 2],
    seed_transpiler=0,
)

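# SWAPs are decomposed into basis gates during translation, so 'swap' never
# appears in the output. Count the native two-qubit gate instead (ECR here).
print(f"Default ECR count: {t_default.count_ops().get('ecr', 0)}")
print(f"Custom ECR count:  {t_custom.count_ops().get('ecr', 0)}")

Pinning to physically connected qubits can eliminate all SWAP insertion. Each SWAP the router does insert shows up as three extra two-qubit gates in these counts.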

PassManager: Custom Transpilation Pipelines

For fine-grained control, build your own transpilation pipeline with PassManager:

from qiskit.transpiler import PassManager
from qiskit.transpiler.passes import (
    TrivialLayout,
    BasicSwap,
    Decompose,
    Optimize1qGatesDecomposition,
    CommutativeCancellation,
)
from qiskit.transpiler import CouplingMap
from qiskit import QuantumCircuit

# A simple linear chain: 0-1-2-3
coupling_map = CouplingMap.from_line(4)

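pm = PassManager([
    TrivialLayout(coupling_map),   # virtual qubit i -> physical qubit i
    BasicSwap(coupling_map),       # insert SWAPs until every CX sits on an edge
    Decompose(),                   # expand gates one level (e.g. SWAP -> 3 CX)
    Optimize1qGatesDecomposition(basis=["rz", "sx", "x"]),  # resynthesize 1q runs
    CommutativeCancellation(),     # cancel redundant gates using commutation rules
])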

qc = QuantumCircuit(4)
qc.h(0); qc.cx(0, 3)   # non-adjacent: will need SWAPs
qc.measure_all()

result = pm.run(qc)
print(f"SWAP-inserted circuit depth: {result.depth()}")

PassManager gives you surgical control: you can run just the routing pass, skip optimization, or add custom analysis passes.

Native Gate Sets by Platform

When targeting real hardware, know what the transpiler is translating into:

Platform	Native gate set
IBM (Falcon)	CX, RZ, SX, X
IBM (Eagle)	ECR, RZ, SX, X
IBM (Heron)	CZ, RZ, SX, X
Quantinuum H-series	ZZPhase, Rz, Rx, PhasedX
IonQ Aria	GPi, GPi2, MS (Mølmer-Sørensen)
Google Sycamore	SYC (Sycamore gate), PhasedXZ

Understanding the native set helps you write circuits that transpile cheaply. For IBM, circuits built directly from the target device’s native gates pass through the translation stage unchanged, although they still need layout and routing.

Simulator vs Real Hardware

AerSimulator supports most standard gates natively (including multi-qubit gates such as ccx), so little or no translation is required for ideal simulation. However, to simulate realistic noise, you should still transpile to the backend’s gate set before adding a noise model:

from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator
from qiskit_aer.noise import NoiseModel
from qiskit_ibm_runtime.fake_provider import FakeBrisbane

fake_backend = FakeBrisbane()
noise_model = NoiseModel.from_backend(fake_backend)

# Must transpile to the backend's native gates before applying its noise model
sim = AerSimulator(noise_model=noise_model)

qc = QuantumCircuit(2)
qc.h(0); qc.cx(0, 1)
qc.measure_all()

# Transpile targeting the fake backend's gate set and coupling map
t_qc = transpile(qc, fake_backend, optimization_level=1)
result = sim.run(t_qc, shots=2048).result()
print(result.get_counts())

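Without transpilation, gates such as h and cx do not match the instructions the noise model attaches errors to (ECR, SX, X and so on for this backend), so the gate-specific error rates cannot be applied correctly.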

Routing Heuristics: Which Layout to Use

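TrivialLayout: use when you have already analyzed connectivity and know that your circuit’s qubit order matches a connected region of the device. Placement is deterministic, but it avoids SWAPs only if that assumption holds. Rarely correct for arbitrary circuits.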
DenseLayout: good for circuits where the virtual qubit graph roughly matches the device graph. Faster than SABRE, but no longer used by default at any preset level.
SabreLayout: the best general-purpose choice and the preset default. Runs SABRE multiple times and picks the lowest-SWAP result. Use this for production runs.

from qiskit import QuantumCircuit, transpile
from qiskit_ibm_runtime.fake_provider import FakeSherbrooke

backend = FakeSherbrooke()  # 127-qubit Eagle r3

qc = QuantumCircuit(10)
for i in range(9):
    qc.cx(i, i + 1)
qc.measure_all()

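# Routed SWAPs are translated into the native two-qubit gate (ECR on this
# backend), so compare ECR counts rather than looking for 'swap'.
for layout in ["trivial", "dense", "sabre"]:
    t = transpile(qc, backend, layout_method=layout, routing_method="sabre",
                  optimization_level=1, seed_transpiler=0)
    print(f"{layout:7s}: depth={t.depth():3d}, ecr={t.count_ops().get('ecr', 0):3d}")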

What to Try Next

Use qiskit.transpiler.preset_passmanagers.generate_preset_pass_manager() for full control over the preset pipeline
Profile large circuits with PassManager instrumentation to find which passes consume the most time
Compare transpiled circuit fidelity empirically by running on a real backend and performing state tomography
Read the Qiskit transpiler documentation for the complete pass reference