Custom Compilation Passes in TKET
Build and compose custom compilation pass sequences in pytket. Learn how to rebase to hardware native gates, apply optimization passes, and profile circuit depth improvements.
TKET’s compilation system is built around composable BasePass objects. Every optimization is a pass, and passes can be combined into sequences, applied conditionally, or repeated until convergence. This tutorial goes beyond the built-in FullPeepholeOptimise shortcut and shows how to construct custom pass pipelines, rebase to hardware native gate sets, route circuits onto real device topologies, and measure the effect of each stage.
Installation
pip install pytket
# For IBM backend support:
pip install pytket-qiskit
# For IonQ backend support:
pip install pytket-ionq
Pass Taxonomy
pytket ships with dozens of compilation passes, each targeting a specific transformation. Understanding which passes exist and what category they belong to is the first step toward building effective pipelines.
The following table organizes the most commonly used passes by category.
| Category | Pass | What it does |
|---|---|---|
| Synthesis | SynthesiseTket | Re-synthesizes 2-qubit subcircuits using KAK decomposition into TK1 + CX |
| Synthesis | SynthesisePauliGraph | Synthesizes circuits from a Pauli exponential graph representation |
| Synthesis | PauliSimp | Simplifies sequences of Pauli exponentials before synthesis |
| Optimization | FullPeepholeOptimise | Aggressive peephole optimization assuming TK1/CX basis |
| Optimization | PeepholeOptimise2Q | Optimizes 2-qubit subcircuits via KAK decomposition |
| Optimization | CliffordSimp | Simplifies Clifford subcircuits using tableau algebra |
| Reduction | RemoveRedundancies | Removes gate-inverse pairs, identity rotations, and zero-angle gates |
| Reduction | CommuteThroughMultis | Commutes single-qubit gates through multi-qubit gates to expose cancellations |
| Reduction | RemoveBarriers | Strips barrier instructions from the circuit |
| Routing | CXMappingPass | Routes a circuit onto a connectivity graph using CX-based SWAP insertion |
| Routing | DefaultMappingPass | Convenience pass that picks placement and routing automatically |
| Routing | RoutingPass | Inserts SWAPs to satisfy architecture constraints without placement |
| Rebase | RebaseTket | Converts all gates to the TKET canonical set (TK1 + CX) |
| Rebase | RebaseCustom | Converts all gates to a user-specified single-qubit and two-qubit gate set |
| Rebase | auto_rebase_pass | Automatically selects a rebase pass for a given target gate set |
| Verification | GateSetPredicate | Checks that all gates belong to an allowed set (used with conditional passes) |
| Verification | ConnectivityPredicate | Checks that all two-qubit gates respect a given architecture |
Each pass exposes an apply(circuit) method that mutates the circuit in place and returns a boolean indicating whether any changes were made.
The Pass System
Every compilation pass in pytket inherits from BasePass and exposes an apply method that mutates a Circuit in place. Passes are composed with SequencePass.
from pytket.passes import (
SequencePass,
FullPeepholeOptimise,
RebaseCustom,
CommuteThroughMultis,
RemoveRedundancies,
SynthesiseTket,
)
from pytket.circuit import Circuit
# A simple 4-qubit circuit to optimize
circ = Circuit(4)
circ.H(0).CX(0, 1).CX(1, 2).CX(2, 3)
circ.Rz(0.5, 0).Rz(0.5, 0) # redundant pair that can be merged
circ.CX(0, 1).H(0)
print("Before:", circ.n_gates, "gates, depth", circ.depth())
pass_sequence = SequencePass([
CommuteThroughMultis(),
RemoveRedundancies(),
SynthesiseTket(),
])
pass_sequence.apply(circ)
print("After:", circ.n_gates, "gates, depth", circ.depth())
CommuteThroughMultis commutes single-qubit gates through two-qubit gates when the commutativity rules allow it, exposing cancellation opportunities. RemoveRedundancies removes gate-inverse pairs and identity rotations. SynthesiseTket re-synthesizes small subcircuits using TKET’s internal KAK-based decomposer.
How CommuteThroughMultis Works
CommuteThroughMultis exploits the fact that certain single-qubit gates commute with multi-qubit gates on specific qubits. The pass checks commutativity rules based on the gate’s eigenbasis and the structure of the multi-qubit gate.
The core rules for CX gates are:
- Control qubit (qubit 0 of CX): The control qubit of a CX gate is diagonal in the Z basis. Any gate that is also diagonal in the Z basis commutes with CX on the control. This includes Rz, T, Tdg, S, Sdg, and Z gates. Intuitively, CX applies a conditional X to the target based on whether the control is |1>, and Z-basis rotations do not change the population in |0> vs |1>.
- Target qubit (qubit 1 of CX): The target qubit undergoes a conditional X. Gates that commute with X (that is, X-basis diagonal gates like Rx and X itself) commute through CX on the target.
- Non-commuting example: An Rx gate on the control qubit does not commute through CX, because Rx changes the Z-basis populations and alters the conditional behavior of the control.
When a single-qubit gate commutes through a CX, it can slide past it to potentially cancel with another gate on the other side.
from pytket.circuit import Circuit
from pytket.passes import CommuteThroughMultis
# Before: CX(0,1), then Rz on the control qubit
circ = Circuit(2)
circ.CX(0, 1)
circ.Rz(0.5, 0)
print("Before commutation:")
for cmd in circ.get_commands():
    print(f" {cmd}")
# Apply CommuteThroughMultis
CommuteThroughMultis().apply(circ)
print("\nAfter commutation:")
for cmd in circ.get_commands():
    print(f" {cmd}")
# The Rz(0.5) on qubit 0 commutes with the CX control, so the pass can
# move it towards the front of the circuit (before the CX).
# The unitary is preserved.
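The commutation rules listed above can also be checked numerically without pytket. The following is a minimal sketch using plain nested lists as matrices; the helper names (`matmul`, `kron`, `commutes_with_cx`) are illustrative and not part of any library, and qubit 0 is assumed to be the most significant bit.

```python
import cmath
import math

I2 = [[1, 0], [0, 1]]
# CX with qubit 0 as control and qubit 1 as target (qubit 0 = most significant bit)
CX = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(a, b):
    """Kronecker product of two 2x2 matrices."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def rz(t):
    """Z rotation, angle t in half-turns (the pytket convention)."""
    return [[cmath.exp(-0.5j * math.pi * t), 0], [0, cmath.exp(0.5j * math.pi * t)]]

def rx(t):
    """X rotation, angle t in half-turns."""
    c, s = math.cos(math.pi * t / 2), math.sin(math.pi * t / 2)
    return [[c, -1j * s], [-1j * s, c]]

def commutes_with_cx(gate, qubit):
    """True if `gate` on `qubit` (0 = control, 1 = target) commutes with CX."""
    full = kron(gate, I2) if qubit == 0 else kron(I2, gate)
    ab, ba = matmul(full, CX), matmul(CX, full)
    return all(abs(ab[i][j] - ba[i][j]) < 1e-9 for i in range(4) for j in range(4))

print("Rz on control commutes:", commutes_with_cx(rz(0.3), 0))
print("Rx on target commutes: ", commutes_with_cx(rx(0.3), 1))
print("Rx on control commutes:", commutes_with_cx(rx(0.3), 0))
```

The same check can be pointed at any other single-qubit matrix to see on which side of a CX it may slide.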
In a larger circuit, this commutation can push an Rz past a CX to merge with another Rz on the same qubit, and RemoveRedundancies then combines them into a single rotation. This is why running CommuteThroughMultis before RemoveRedundancies is more effective than running RemoveRedundancies alone.
from pytket.circuit import Circuit
from pytket.passes import CommuteThroughMultis, RemoveRedundancies
# Rz(0.3) -- CX -- Rz(0.7) on the same qubit
# Without commutation, RemoveRedundancies cannot merge the two Rz gates.
circ = Circuit(2)
circ.Rz(0.3, 0)
circ.CX(0, 1)
circ.Rz(0.7, 0)
print("Initial gate count:", circ.n_gates)
# Just RemoveRedundancies alone cannot help here
circ_copy = circ.copy()
RemoveRedundancies().apply(circ_copy)
print("After RemoveRedundancies only:", circ_copy.n_gates)
# But CommuteThroughMultis + RemoveRedundancies can merge the Rz gates
CommuteThroughMultis().apply(circ)
RemoveRedundancies().apply(circ)
print("After commute + remove:", circ.n_gates)
KAK Decomposition in SynthesiseTket
SynthesiseTket decomposes any two-qubit unitary into at most 3 CX gates plus single-qubit rotations. It does this using the KAK decomposition (a Cartan decomposition, also known in this setting as the Khaneja-Glaser decomposition), which is a fundamental result from Lie group theory applied to SU(4).
The KAK theorem states that any element of SU(4) can be written as:
U = (A1 ⊗ A2) · exp(i(c_x XX + c_y YY + c_z ZZ)) · (A3 ⊗ A4)
where A1, A2, A3, A4 are single-qubit unitaries and (c_x, c_y, c_z) are the Cartan coordinates (also called interaction coefficients). These three real numbers completely characterize the entangling power of the two-qubit gate.
Key examples of Cartan coordinates:
- Identity: c_x = c_y = c_z = 0 (no entanglement, 0 CX gates needed)
- CNOT: c_x = pi/4, c_y = c_z = 0 (1 CX gate needed)
- iSWAP: c_x = c_y = pi/4, c_z = 0 (2 CX gates needed)
- SWAP: c_x = c_y = c_z = pi/4 (3 CX gates needed)
- Generic unitary: up to 3 CX gates needed
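The number of CX gates required depends on where the Cartan coordinates lie, not only on how many of them are nonzero. With the coordinates reduced to the canonical range pi/4 >= c_x >= c_y >= |c_z|, no CX is needed when all three vanish, one CX suffices only for the CNOT class (c_x = pi/4, c_y = c_z = 0), two CX gates suffice whenever c_z = 0, and three CX gates are required otherwise. A weakly entangling gate such as exp(i c_x XX) with 0 < c_x < pi/4 therefore needs two CX gates even though only one coordinate is nonzero.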
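As a rough illustration of how Cartan coordinates translate into a CX count, the sketch below encodes the commonly cited counting rule for coordinates that are assumed to be already reduced to the canonical range pi/4 >= c_x >= c_y >= |c_z|. The function name `min_cx_count` is hypothetical and this is not how pytket exposes the result.

```python
import math

def min_cx_count(cx, cy, cz, tol=1e-9):
    """Minimal CX count for canonical Cartan coordinates (assumed pre-reduced)."""
    if abs(cx) < tol and abs(cy) < tol and abs(cz) < tol:
        return 0  # purely local gate, no entangler needed
    if abs(cx - math.pi / 4) < tol and abs(cy) < tol and abs(cz) < tol:
        return 1  # locally equivalent to a single CX
    if abs(cz) < tol:
        return 2  # c_z = 0 is the two-CX family
    return 3      # generic case

examples = {
    "Identity": (0.0, 0.0, 0.0),
    "CNOT": (math.pi / 4, 0.0, 0.0),
    "iSWAP": (math.pi / 4, math.pi / 4, 0.0),
    "SWAP": (math.pi / 4, math.pi / 4, math.pi / 4),
    "weak XX": (0.1, 0.0, 0.0),
}
for name, coords in examples.items():
    print(f"{name:10s} -> {min_cx_count(*coords)} CX")
```

The "weak XX" entry is included to show that a single nonzero coordinate does not by itself imply a single CX.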
from pytket.circuit import Circuit, Unitary2qBox, OpType
from pytket.passes import SynthesiseTket, DecomposeBoxes
import numpy as np
# Create a random 2-qubit unitary
rng = np.random.default_rng(42)
# Generate a random unitary using QR decomposition of a random complex matrix
random_matrix = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
q, r = np.linalg.qr(random_matrix)
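# Rescale each column by the phase of the matching diagonal entry of r,
# which makes the sample Haar-distributed (this does not force det = 1)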
d = np.diag(r)
ph = d / np.abs(d)
q = q @ np.diag(ph)
# Wrap it as a Unitary2qBox
ubox = Unitary2qBox(q)
circ = Circuit(2)
circ.add_unitary2qbox(ubox, 0, 1)
# Decompose the box into gates, then synthesize
DecomposeBoxes().apply(circ)
SynthesiseTket().apply(circ)
# Count CX gates
cx_count = sum(
1 for cmd in circ.get_commands() if cmd.op.type == OpType.CX
)
print(f"CX gate count after KAK decomposition: {cx_count}")
assert cx_count <= 3, "KAK guarantees at most 3 CX gates"
# The circuit now contains only TK1 (single-qubit) and CX (two-qubit) gates
gate_types = set(cmd.op.type for cmd in circ.get_commands())
print(f"Gate types present: {gate_types}")
This is significant for compilation because it means any two-qubit interaction can be expressed with a bounded number of entangling gates. When SynthesiseTket scans a circuit, it identifies two-qubit subcircuits (pairs of qubits that interact), computes their net unitary, and re-synthesizes them using the minimal number of CX gates.
Clifford Simplification with CliffordSimp
Clifford gates form a special subgroup of quantum gates that includes H, S, CX, X, Y, and Z. These gates have a remarkable property: they map Pauli operators to Pauli operators under conjugation. This means that a circuit consisting entirely of Clifford gates can be represented compactly using a tableau (a binary matrix tracking how Paulis transform), and composed or simplified in polynomial time.
CliffordSimp identifies contiguous subcircuits composed entirely of Clifford gates, simplifies them using tableau algebra, and re-inserts the simplified version. This is especially effective for circuits that contain many Clifford gates interspersed with a few non-Clifford rotations (like T gates), which is common in fault-tolerant circuit constructions.
from pytket.circuit import Circuit
from pytket.passes import CliffordSimp
# Build a circuit with many redundant Clifford gates
circ = Circuit(3)
# Layer of Hadamards and CX gates
circ.H(0).H(1).H(2)
circ.CX(0, 1).CX(1, 2)
circ.S(0).S(1).S(2)
circ.CX(0, 1).CX(1, 2)
circ.H(0).H(1).H(2)
# More Clifford operations that partially cancel
circ.X(0).Z(1).Y(2)
circ.CX(0, 1).CX(1, 0).CX(0, 1) # This is a SWAP
circ.H(0).S(0).H(0) # HSH = SX, the square root of X (still Clifford)
circ.CX(1, 2).CX(1, 2) # Two identical CX gates cancel
gates_before = circ.n_gates
two_qb_before = circ.n_2qb_gates()
print(f"Before CliffordSimp: {gates_before} gates, {two_qb_before} 2-qubit gates")
CliffordSimp().apply(circ)
gates_after = circ.n_gates
two_qb_after = circ.n_2qb_gates()
print(f"After CliffordSimp: {gates_after} gates, {two_qb_after} 2-qubit gates")
print(f"Reduction: {gates_before - gates_after} gates removed")
CliffordSimp is particularly useful as a cleanup pass after routing, because SWAP insertion introduces sequences of three CX gates (which are all Clifford) that can sometimes be simplified in context.
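The defining Clifford property described in this section, mapping Paulis to Paulis under conjugation, can be checked directly on 2x2 matrices. The sketch below is independent of pytket; the helper names are illustrative. It conjugates a Pauli by H, S, and T and reports whether the result is again a signed Pauli.

```python
import cmath
import math

X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
R = 1 / math.sqrt(2)
H = [[R, R], [R, -R]]
S = [[1, 0], [0, 1j]]
T = [[1, 0], [0, cmath.exp(0.25j * math.pi)]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def conjugate_by(u, p):
    """Return u p u^dagger."""
    return matmul(matmul(u, p), dagger(u))

def as_signed_pauli(m):
    """Return a label such as '+Z' or '-Y', or None if m is not a signed Pauli."""
    for name, p in (("X", X), ("Y", Y), ("Z", Z)):
        for sign, label in ((1, "+"), (-1, "-")):
            if all(abs(m[i][j] - sign * p[i][j]) < 1e-9
                   for i in range(2) for j in range(2)):
                return label + name
    return None

print("H X H^dag ->", as_signed_pauli(conjugate_by(H, X)))
print("S X S^dag ->", as_signed_pauli(conjugate_by(S, X)))
print("H Y H^dag ->", as_signed_pauli(conjugate_by(H, Y)))
print("T X T^dag ->", as_signed_pauli(conjugate_by(T, X)))  # T is not Clifford
```

The T gate returns None because T X T^dag is a mixture of X and Y, which is why T gates break up the regions that a tableau can represent.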
Rebasing to Hardware Native Gates
Each hardware vendor supports a different native gate set. TKET provides RebaseCustom to rewrite any circuit into an arbitrary set of single-qubit and two-qubit primitives.
IBM Native Gates
For IBM devices the native set is {Rz, SX, X, CX}. The single-qubit decomposition converts the universal TK1(a, b, c) gate into a sequence of Rz and SX gates. In pytket, TK1(a, b, c) represents an arbitrary single-qubit unitary as Rz(a) Rx(b) Rz(c), with all angles in half-turns (multiples of pi). To express the middle Rx rotation in terms of Rz and SX, we use the identity Rx(b) = Rz(0.5) SX Rz(b + 1) SX Rz(0.5), which holds up to a global phase. Absorbing the outer Rz gates gives TK1(a, b, c) = Rz(a + 0.5) SX Rz(b + 1) SX Rz(c + 0.5), a decomposition into at most 3 Rz and 2 SX gates.
from pytket.passes import RebaseCustom
from pytket.circuit import Circuit, OpType
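def tk1_to_rzsx(a, b, c):
    """Decompose TK1(a,b,c) = Rz(a) Rx(b) Rz(c) into Rz and SX gates (IBM native)."""
    circ = Circuit(1)
    # Gates are appended in time order, so the rightmost factor comes first.
    circ.Rz(c + 0.5, 0)
    circ.SX(0)
    circ.Rz(b + 1, 0)
    circ.SX(0)
    circ.Rz(a + 0.5, 0)
    circ.add_phase(0.5)  # restores the global phase of TK1
    return circ
cx_circ = Circuit(2)
cx_circ.CX(0, 1)
ibm_rebase = RebaseCustom(
    {OpType.CX, OpType.Rz, OpType.SX, OpType.X},
    cx_circ,
    tk1_to_rzsx,
)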
circ = Circuit(3)
circ.H(0).CX(0, 1).T(1).CX(1, 2).Tdg(2)
ibm_rebase.apply(circ)
print("IBM native gates:")
for cmd in circ.get_commands():
    print(f" {cmd.op.type.name} on {cmd.qubits}")
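A decomposition function handed to RebaseCustom is easy to get subtly wrong, so it can help to verify the underlying identity numerically first. The sketch below is a stdlib-only check, assuming pytket's convention TK1(a, b, c) = Rz(a) Rx(b) Rz(c) with angles in half-turns; the helper names are illustrative. It compares TK1 against the form Rz(a + 0.5) SX Rz(b + 1) SX Rz(c + 0.5) and reports the global phase relating them.

```python
import cmath
import math

def matmul(*mats):
    """Multiply 2x2 matrices left to right (matrix-product order)."""
    out = [[1, 0], [0, 1]]
    for m in mats:
        out = [[sum(out[i][k] * m[k][j] for k in range(2)) for j in range(2)]
               for i in range(2)]
    return out

def rz(t):
    return [[cmath.exp(-0.5j * math.pi * t), 0], [0, cmath.exp(0.5j * math.pi * t)]]

def rx(t):
    c, s = math.cos(math.pi * t / 2), math.sin(math.pi * t / 2)
    return [[c, -1j * s], [-1j * s, c]]

SX = [[0.5 + 0.5j, 0.5 - 0.5j], [0.5 - 0.5j, 0.5 + 0.5j]]

def tk1(a, b, c):
    return matmul(rz(a), rx(b), rz(c))

def rz_sx_form(a, b, c):
    return matmul(rz(a + 0.5), SX, rz(b + 1), SX, rz(c + 0.5))

def global_phase(u, v):
    """Return the phase p with u == p * v, or None if no such phase exists."""
    i, j = max(((r, s) for r in range(2) for s in range(2)),
               key=lambda rs: abs(v[rs[0]][rs[1]]))
    p = u[i][j] / v[i][j]
    same = all(abs(u[r][s] - p * v[r][s]) < 1e-9 for r in range(2) for s in range(2))
    return p if same and abs(abs(p) - 1) < 1e-9 else None

for angles in [(0.0, 0.0, 0.0), (0.2, 0.7, 1.3), (1.5, 0.25, 0.1)]:
    print(angles, "phase:", global_phase(tk1(*angles), rz_sx_form(*angles)))

# Dropping the angle offsets does not reproduce TK1, even up to phase:
naive = matmul(rz(0.0), SX, rz(0.0), SX, rz(0.0))
print("naive form matches TK1(0,0,0):", global_phase(tk1(0, 0, 0), naive) is not None)
```

A constant phase across all test angles indicates the identity holds as a gate identity and not only for special inputs.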
IonQ Native Gates
IonQ trapped-ion hardware uses a fundamentally different native gate set. The single-qubit primitive is GPi (and its variant GPi2), which performs rotations on the equator of the Bloch sphere. The two-qubit primitive is the Molmer-Sorensen (MS) gate, a globally entangling operation that creates a maximally entangled state.
The GPi(phi) gate applies a pi-rotation around an axis in the XY plane at angle phi:
GPi(phi) = [[0, e^(-i*phi)], [e^(i*phi), 0]]
The GPi2(phi) gate is a pi/2 rotation around the same axis. Together, GPi and GPi2 can produce any single-qubit rotation. The relationship to the standard Euler angles is:
- Rz(theta) = GPi(theta/2) GPi(0) (two pi-rotations about equatorial axes separated by an angle theta/2)
- Any TK1(a,b,c) decomposes into at most 3 GPi2 gates
The MS gate is the symmetric Molmer-Sorensen interaction: MS = exp(-i pi/4 XX), which is equivalent to a CX up to single-qubit corrections.
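The GPi matrix given above makes it straightforward to check how equatorial rotations compose. The sketch below is a stdlib-only check with angles in radians; the GPi2 matrix used is the standard pi/2 rotation about the same axis and is stated here as an assumption, since the text above only gives GPi explicitly. It verifies that two GPi gates compose to a Z rotation and that GPi2 applied twice equals GPi up to a phase.

```python
import cmath
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def gpi(phi):
    """Pi rotation about the equatorial axis at angle phi (radians)."""
    return [[0, cmath.exp(-1j * phi)], [cmath.exp(1j * phi), 0]]

def gpi2(phi):
    """Pi/2 rotation about the same axis (assumed standard form)."""
    r = 1 / math.sqrt(2)
    return [[r, -1j * r * cmath.exp(-1j * phi)], [-1j * r * cmath.exp(1j * phi), r]]

def rz(theta):
    """Z rotation with theta in radians."""
    return [[cmath.exp(-0.5j * theta), 0], [0, cmath.exp(0.5j * theta)]]

def close(a, b):
    return all(abs(a[i][j] - b[i][j]) < 1e-9 for i in range(2) for j in range(2))

theta = 0.8
print("GPi(theta/2) GPi(0) == Rz(theta):", close(matmul(gpi(theta / 2), gpi(0.0)), rz(theta)))

phi = 0.3
squared = matmul(gpi2(phi), gpi2(phi))
minus_i_gpi = [[-1j * x for x in row] for row in gpi(phi)]
print("GPi2(phi)^2 == -i GPi(phi):     ", close(squared, minus_i_gpi))
```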
from pytket.circuit import Circuit, OpType
from pytket.passes import RebaseCustom
def tk1_to_ionq(a, b, c):
    """Decompose TK1(a,b,c) = Rz(a) Rx(b) Rz(c) into Rz and Ry gates for IonQ.
    Uses the identity Rx(b) = Rz(-0.5) Ry(b) Rz(0.5).
    IonQ accepts Rz and Ry as virtual/physical single-qubit gates.
    The actual hardware translates these to GPi/GPi2 pulses.
    """
    circ = Circuit(1)
    circ.Rz(c + 0.5, 0)
    circ.Ry(b, 0)
    circ.Rz(a - 0.5, 0)
    return circ
# MS gate is locally equivalent to CX:
# CX = (I ⊗ Ry(-0.5)) MS (I ⊗ Ry(0.5)) (Rz(-0.5) ⊗ Rz(-0.5)) up to phase
# For RebaseCustom, we provide CX directly since pytket handles the
# MS equivalence at the backend level.
cx_replacement = Circuit(2)
cx_replacement.CX(0, 1)
ionq_rebase = RebaseCustom(
{OpType.Rz, OpType.Ry},
cx_replacement,
tk1_to_ionq,
)
circ = Circuit(2)
circ.H(0).CX(0, 1).T(1)
ionq_rebase.apply(circ)
print("IonQ-compatible gates:")
for cmd in circ.get_commands():
    print(f" {cmd.op.type.name}({cmd.op.params}) on {cmd.qubits}")
When using the pytket-ionq extension, the IonQBackend automatically applies the appropriate rebase. The manual RebaseCustom shown here is useful for profiling or when building custom pipelines that need to simulate the target gate set without connecting to the actual backend.
Quantinuum (H-Series) Native Gates
Quantinuum’s H-series trapped-ion processors use the native gate set {Rz, PhasedX, ZZPhase}. This is a particularly elegant basis because:
- Rz(t) applies a Z rotation by angle t*pi. On trapped ions, Rz is a virtual gate (implemented by frame tracking) with zero error.
- PhasedX(a, b) applies a rotation of angle a*pi around an axis in the XY plane at azimuthal angle b*pi. It generalizes both Rx and Ry: PhasedX(a, 0) = Rx(a*pi) and PhasedX(a, 0.5) = Ry(a*pi).
- ZZPhase(t) applies exp(-i * t * pi/2 * ZZ), a symmetric two-qubit ZZ interaction. This is the native entangling operation on Quantinuum hardware.
The CNOT decomposition into ZZPhase plus single-qubit corrections is, written as a matrix product (the rightmost factor is applied first):
CX(0,1) = Ry(0.5, 1) . Rz(-0.5, 0) . Rz(-0.5, 1) . ZZPhase(0.5, 0, 1) . Ry(-0.5, 1)
(up to a global phase).
from pytket.circuit import Circuit, OpType
from pytket.passes import RebaseCustom
def tk1_to_phasedx(a, b, c):
    """Decompose TK1(a,b,c) into Rz and PhasedX gates (Quantinuum native).
    TK1(a,b,c) = Rz(a) Rx(b) Rz(c)
    Rx(b) = PhasedX(b, 0)
    So TK1(a,b,c) = Rz(a) PhasedX(b, 0) Rz(c)
    """
    circ = Circuit(1)
    circ.Rz(c, 0)
    circ.add_gate(OpType.PhasedX, [b, 0], [0])
    circ.Rz(a, 0)
    return circ
# CX decomposition into ZZPhase + single-qubit corrections (in time order)
cx_replacement = Circuit(2)
cx_replacement.add_gate(OpType.PhasedX, [-0.5, 0.5], [1]) # Ry(-0.5) on target
cx_replacement.add_gate(OpType.ZZPhase, [0.5], [0, 1])
cx_replacement.Rz(-0.5, 0)
cx_replacement.Rz(-0.5, 1)
cx_replacement.add_gate(OpType.PhasedX, [0.5, 0.5], [1]) # Ry(0.5) on target
cx_replacement.add_phase(-0.25) # cancels the leftover global phase
quantinuum_rebase = RebaseCustom(
    {OpType.Rz, OpType.PhasedX, OpType.ZZPhase},
    cx_replacement,
    tk1_to_phasedx,
)
circ = Circuit(2)
circ.H(0).CX(0, 1).Rz(0.3, 1)
quantinuum_rebase.apply(circ)
print("Quantinuum-native gates:")
for cmd in circ.get_commands():
    print(f" {cmd.op.type.name}({cmd.op.params}) on {cmd.qubits}")
Because Rz is a virtual gate on trapped-ion hardware (zero duration, zero error), the Quantinuum rebase produces circuits where only PhasedX and ZZPhase contribute to the actual execution time and error budget. Optimizing a pipeline for Quantinuum means minimizing PhasedX and ZZPhase counts specifically.
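Because a wrong CX replacement silently corrupts every rebased circuit, it is worth confirming a candidate decomposition as a 4x4 matrix identity. The sketch below is a stdlib-only check, assuming pytket's conventions (angles in half-turns, ZZPhase(t) = exp(-i t pi/2 ZZ), qubit 0 as the most significant bit). It evaluates the time-ordered sequence Ry(-0.5) on the target, ZZPhase(0.5), Rz(-0.5) on both qubits, Ry(0.5) on the target, and compares it with CX. Helper names are illustrative.

```python
import cmath
import math

def matmul(*mats):
    """Multiply square matrices left to right (matrix-product order)."""
    out = mats[0]
    for m in mats[1:]:
        n = len(m)
        out = [[sum(out[i][k] * m[k][j] for k in range(n)) for j in range(n)]
               for i in range(n)]
    return out

def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def rz(t):
    return [[cmath.exp(-0.5j * math.pi * t), 0], [0, cmath.exp(0.5j * math.pi * t)]]

def ry(t):
    c, s = math.cos(math.pi * t / 2), math.sin(math.pi * t / 2)
    return [[c, -s], [s, c]]

def zzphase(t):
    m, p = cmath.exp(-0.5j * math.pi * t), cmath.exp(0.5j * math.pi * t)
    return [[m, 0, 0, 0], [0, p, 0, 0], [0, 0, p, 0], [0, 0, 0, m]]

I2 = [[1, 0], [0, 1]]
CX = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

# Matrix-product order is the reverse of time order: the last gate is leftmost.
candidate = matmul(
    kron(I2, ry(0.5)),
    kron(rz(-0.5), rz(-0.5)),
    zzphase(0.5),
    kron(I2, ry(-0.5)),
)
phase = candidate[0][0]  # CX[0][0] == 1, so this entry is the global phase
matches = all(abs(candidate[i][j] - phase * CX[i][j]) < 1e-9
              for i in range(4) for j in range(4))
print("equals CX up to phase:", matches)
print("global phase angle / pi:", cmath.phase(phase) / math.pi)
```

The reported phase is what a call such as add_phase would need to compensate if an exact match, not just equivalence up to phase, is required.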
Conditional Pass Application
RepeatWithMetricPass
RepeatWithMetricPass runs a pass repeatedly until a metric function stops improving. This is useful when a single pass application does not converge in one shot.
from pytket.passes import RepeatWithMetricPass, RemoveRedundancies
from pytket.circuit import Circuit
circ = Circuit(3)
circ.CX(0, 1).CX(1, 0).CX(0, 1)
# Repeat RemoveRedundancies until it stops changing the gate count
def gate_count_metric(c):
    return c.n_gates
repeat_pass = RepeatWithMetricPass(RemoveRedundancies(), gate_count_metric)
repeat_pass.apply(circ)
print("Gates after repeated removal:", circ.n_gates)
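The control flow behind a repeat-until-no-improvement pass can be sketched on a toy gate list. The code below is a simplified model and not pytket's implementation: `repeat_while_improving` and `cancel_first_pair` are hypothetical helpers, gates are plain strings, and every gate is assumed to be self-inverse. The transform removes only one adjacent pair per call, so a single application is not enough and repetition is required.

```python
def cancel_first_pair(gates):
    """Remove the first adjacent pair of identical (self-inverse) gates, if any."""
    for i in range(len(gates) - 1):
        if gates[i] == gates[i + 1]:
            return gates[:i] + gates[i + 2:]
    return gates

def repeat_while_improving(transform, metric, state):
    """Apply `transform` repeatedly for as long as `metric` strictly decreases."""
    while True:
        candidate = transform(state)
        if metric(candidate) >= metric(state):
            return state  # no improvement: keep the last good state
        state = candidate

toy_circuit = ["H", "X", "X", "H"]
once = cancel_first_pair(toy_circuit)
converged = repeat_while_improving(cancel_first_pair, len, toy_circuit)
print("after one application:", once)
print("after repetition:     ", converged)
```

The metric plays the same role as gate_count_metric above: it decides when another round is no longer worth running.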
Predicate-Based Conditional Passes
pytket provides predicates that test whether a circuit satisfies certain conditions. You can use GateSetPredicate to check if a circuit already uses the target gate set, and skip the rebase if it does.
from pytket.circuit import Circuit, OpType
from pytket.predicates import GateSetPredicate
# Define what we consider "already rebased to IBM native"
ibm_predicate = GateSetPredicate({
OpType.CX, OpType.Rz, OpType.SX, OpType.X, OpType.Measure,
})
# A circuit already in the IBM native set
circ_native = Circuit(2)
circ_native.Rz(0.3, 0).SX(0).CX(0, 1)
# A circuit NOT in the IBM native set
circ_foreign = Circuit(2)
circ_foreign.H(0).T(1).CX(0, 1)
print("Native circuit satisfies predicate:", ibm_predicate.verify(circ_native))
print("Foreign circuit satisfies predicate:", ibm_predicate.verify(circ_foreign))
# Only rebase if needed
if not ibm_predicate.verify(circ_foreign):
    ibm_rebase.apply(circ_foreign)
    print("Rebased. Now satisfies predicate:", ibm_predicate.verify(circ_foreign))
This pattern is especially useful in production pipelines where you do not control the input circuit format. Skipping unnecessary passes saves compilation time and avoids introducing redundant gates from an identity rebase.
Placement Strategies
Before routing, TKET must decide which logical qubit maps to which physical qubit. This is the placement step, and the quality of the initial placement directly affects how many SWAP gates routing needs to insert.
TKET provides three main placement strategies:
- LinePlacement finds the longest chain of interacting qubits in the circuit and maps them onto a contiguous line of physical qubits. This works well when the circuit has a mostly linear interaction pattern.
- GraphPlacement uses subgraph isomorphism to find the best mapping of the circuit’s interaction graph onto the hardware’s connectivity graph. This is the most general strategy and works well for arbitrary circuits.
- NoiseAwarePlacement extends GraphPlacement by incorporating device calibration data. It prefers physical qubits with lower gate error rates and longer coherence times. This requires a backend object that provides noise information.
from pytket.architecture import Architecture
from pytket.placement import LinePlacement, GraphPlacement
from pytket.circuit import Circuit
# A T-shaped architecture:
# 0 - 1 - 2
#     |
#     3
#     |
#     4
arch = Architecture([(0, 1), (1, 2), (1, 3), (3, 4)])
# LinePlacement finds a line through the architecture
line_place = LinePlacement(arch)
# GraphPlacement uses subgraph matching
graph_place = GraphPlacement(arch)
# Build a circuit with a specific interaction pattern
circ = Circuit(4)
circ.CX(0, 1).CX(1, 2).CX(2, 3)
circ.CX(0, 2) # This non-local interaction forces at least one SWAP
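# Try both placements. qubit_readout only describes measured qubits, so the
# proposed mapping is read from get_placement_map before it is applied.
circ_line = circ.copy()
circ_graph = circ.copy()
line_map = line_place.get_placement_map(circ_line)
graph_map = graph_place.get_placement_map(circ_graph)
line_place.place(circ_line)
graph_place.place(circ_graph)
print("Line placement mapping:")
print(f" {line_map}")
print("Graph placement mapping:")
print(f" {graph_map}")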
Qubit placement matters because a good initial mapping can place frequently interacting qubits on adjacent physical qubits, reducing the number of SWAPs the router needs to insert. On a device with 100+ qubits, a poor placement can double or triple the two-qubit gate count after routing.
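One rough way to compare two candidate placements before running a router is to sum, over all two-qubit gates, how far apart the mapped physical qubits are. The sketch below is a stdlib-only heuristic and not what pytket's placement classes compute; `placement_cost` is a hypothetical helper, and the cost ignores the fact that SWAPs change the mapping as routing proceeds. It uses the T-shaped architecture and the interaction pattern from the example above.

```python
from collections import deque

def all_distances(edges, n_nodes):
    """Shortest-path distances between all node pairs via breadth-first search."""
    adj = {v: [] for v in range(n_nodes)}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    dist = {}
    for src in range(n_nodes):
        seen = {src: 0}
        queue = deque([src])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in seen:
                    seen[w] = seen[v] + 1
                    queue.append(w)
        dist[src] = seen
    return dist

def placement_cost(edges, n_nodes, placement, interactions):
    """Sum of (distance - 1) over all two-qubit gates under a fixed placement."""
    dist = all_distances(edges, n_nodes)
    return sum(dist[placement[a]][placement[b]] - 1 for a, b in interactions)

t_shape = [(0, 1), (1, 2), (1, 3), (3, 4)]
interactions = [(0, 1), (1, 2), (2, 3), (0, 2)]  # logical CX pairs

identity_map = {0: 0, 1: 1, 2: 2, 3: 3}  # logical qubit -> physical node
hub_map = {0: 0, 1: 2, 2: 1, 3: 3}       # busiest logical qubit on the hub node

print("identity placement cost:", placement_cost(t_shape, 5, identity_map, interactions))
print("hub placement cost:     ", placement_cost(t_shape, 5, hub_map, interactions))
```

Putting the most connected logical qubit on the hub node lowers the cost here, which matches the intuition behind graph-based placement.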
Routing and SWAP Overhead Across Topologies
Once a circuit is rebased to native gates, it still needs to be routed onto the hardware connectivity graph. Different hardware architectures impose different connectivity constraints, and the topology has a dramatic effect on SWAP overhead.
CXMappingPass
CXMappingPass simultaneously places logical qubits and inserts SWAP gates to satisfy connectivity. Each SWAP decomposes into 3 CX gates, so minimizing SWAPs is critical for circuit fidelity.
from pytket.passes import CXMappingPass, SynthesiseTket
from pytket.architecture import Architecture
from pytket.placement import GraphPlacement
from pytket.circuit import Circuit
# Define a linear connectivity: 0-1-2-3-4
arch = Architecture([(0, 1), (1, 2), (2, 3), (3, 4)])
placement = GraphPlacement(arch)
mapping_pass = CXMappingPass(
arch,
placement,
directed_cx=True,
delay_measures=True,
)
def make_circuit():
    c = Circuit(5)
    for i in range(4):
        c.CX(i, i + 1)
    for i in range(5):
        c.Rz(0.3, i).H(i)
    for i in range(4):
        c.CX(i + 1, i)
    for i in range(5):
        c.Rz(0.7, i)
    return c
def profile(label, c):
    print(f"{label:35s} gates={c.n_gates:4d} depth={c.depth():4d} "
          f"2q_gates={c.n_2qb_gates():4d}")
circ_to_route = make_circuit()
SynthesiseTket().apply(circ_to_route)
mapping_pass.apply(circ_to_route)
profile("After routing", circ_to_route)
directed_cx=True tells the pass to treat the architecture's edges as directed, so that every CX in the routed circuit points in a direction the hardware supports. delay_measures=True pushes measurements to the end of the circuit, so that measured qubits are read out as late as possible.
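When a CX points against the hardware's native direction, it can be flipped by conjugating both qubits with Hadamards, which costs four extra H gates per reversed CX. The sketch below verifies that identity on 4x4 matrices with the standard library only; helper names are illustrative and qubit 0 is taken as the most significant bit.

```python
import math

def matmul(*mats):
    """Multiply 4x4 matrices left to right."""
    out = mats[0]
    for m in mats[1:]:
        out = [[sum(out[i][k] * m[k][j] for k in range(4)) for j in range(4)]
               for i in range(4)]
    return out

def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

R = 1 / math.sqrt(2)
H = [[R, R], [R, -R]]
HH = kron(H, H)  # one Hadamard on each qubit

CX_01 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]  # control 0, target 1
CX_10 = [[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]]  # control 1, target 0

flipped = matmul(HH, CX_01, HH)
is_reversed = all(abs(flipped[i][j] - CX_10[i][j]) < 1e-9
                  for i in range(4) for j in range(4))
print("(H x H) CX(0,1) (H x H) == CX(1,0):", is_reversed)
```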
Comparing Topologies
The following example builds a fully-connected circuit on 5 qubits (every qubit interacts with every other) and routes it onto three different topologies to compare SWAP overhead.
from pytket.circuit import Circuit, OpType
from pytket.architecture import Architecture
from pytket.placement import GraphPlacement
from pytket.passes import CXMappingPass, SynthesiseTket, RemoveRedundancies
import copy
def make_fully_connected_circuit(n_qubits=5):
    """Create a circuit where every pair of qubits interacts."""
    circ = Circuit(n_qubits)
    for i in range(n_qubits):
        for j in range(i + 1, n_qubits):
            circ.CX(i, j)
            circ.Rz(0.1 * (i + j), j)
    return circ
# Three topologies for 5 qubits
# Linear chain: 0-1-2-3-4
linear = Architecture([(i, i + 1) for i in range(4)])
# Star: qubit 2 is the hub
star = Architecture([(2, 0), (2, 1), (2, 3), (2, 4)])
# Grid (2x3 with 5 qubits used):
# 0 - 1 - 2
# |   |
# 3 - 4
grid = Architecture([(0, 1), (1, 2), (0, 3), (1, 4), (3, 4)])
topologies = [
("Linear chain", linear),
("Star", star),
("Grid (2x3)", grid),
]
base_circ = make_fully_connected_circuit()
SynthesiseTket().apply(base_circ)
print(f"{'Topology':20s} {'2Q gates':>10s} {'Total gates':>12s} {'Depth':>6s}")
print("-" * 55)
for name, arch in topologies:
    circ = copy.deepcopy(base_circ)
    placement = GraphPlacement(arch)
    routing = CXMappingPass(arch, placement, directed_cx=False)
    routing.apply(circ)
    RemoveRedundancies().apply(circ)
    print(f"{name:20s} {circ.n_2qb_gates():10d} {circ.n_gates:12d} {circ.depth():6d}")
Linear chains produce the most SWAP overhead because distant qubits must communicate through a chain of intermediaries. The grid topology provides shorter paths between most qubit pairs, and the star topology excels when one qubit interacts with many others.
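A quick graph-level proxy for how much routing a topology will need is the typical distance between pairs of physical qubits. The sketch below computes the mean pairwise distance and the diameter for the three 5-qubit topologies used above, using only the standard library. It is a heuristic illustration (the helper `distance_stats` is hypothetical) and does not replace actually routing the circuit, since real SWAP overhead also depends on the circuit and the placement.

```python
from collections import deque
from itertools import combinations

def bfs_distances(edges, n_nodes, src):
    """Distances from `src` to every node in an undirected graph."""
    adj = {v: [] for v in range(n_nodes)}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    seen = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in seen:
                seen[w] = seen[v] + 1
                queue.append(w)
    return seen

def distance_stats(edges, n_nodes):
    """Return (mean pairwise distance, diameter)."""
    dists = [bfs_distances(edges, n_nodes, a)[b]
             for a, b in combinations(range(n_nodes), 2)]
    return sum(dists) / len(dists), max(dists)

topologies = {
    "Linear chain": [(0, 1), (1, 2), (2, 3), (3, 4)],
    "Star": [(2, 0), (2, 1), (2, 3), (2, 4)],
    "Grid (2x3)": [(0, 1), (1, 2), (0, 3), (1, 4), (3, 4)],
}
stats = {name: distance_stats(edges, 5) for name, edges in topologies.items()}
for name, (mean, diameter) in stats.items():
    print(f"{name:14s} mean distance={mean:.2f} diameter={diameter}")
```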
Custom Pass from Scratch Using BasePass
TKET allows you to define custom passes that compose with built-in passes using SequencePass. The simplest way to create a custom pass is with CustomPass, which wraps a function that transforms a circuit.
from pytket.passes import CustomPass, SequencePass, RemoveRedundancies
from pytket.circuit import Circuit, OpType
def cx_counter(circ):
    """A custom pass that logs CX gate statistics."""
    cx_count = sum(
        1 for cmd in circ.get_commands() if cmd.op.type == OpType.CX
    )
    total = circ.n_gates
    print(f" [CX Counter] {cx_count} CX gates out of {total} total "
          f"({100 * cx_count / max(total, 1):.1f}%)")
    return circ
counter_pass = CustomPass(cx_counter)
# Use it in a pipeline alongside built-in passes
circ = Circuit(3)
circ.H(0).CX(0, 1).CX(1, 2).CX(2, 1).CX(1, 0)
circ.Rz(0.5, 0).Rz(-0.5, 0) # cancels to identity
print("Before optimization:")
counter_pass.apply(circ)
pipeline = SequencePass([
RemoveRedundancies(),
counter_pass,
])
print("After optimization:")
pipeline.apply(circ)
For more complex custom passes that need to track state across invocations, you can use a closure or a class:
from pytket.passes import CustomPass, SequencePass, CommuteThroughMultis
from pytket.circuit import Circuit, OpType
class PassProfiler:
    """Records gate counts at each invocation, for later analysis."""
    def __init__(self, label):
        self.label = label
        self.history = []
    def __call__(self, circ):
        record = {
            "label": self.label,
            "n_gates": circ.n_gates,
            "depth": circ.depth(),
            "n_2qb": circ.n_2qb_gates(),
        }
        self.history.append(record)
        return circ
    def as_pass(self):
        return CustomPass(self)
# Create profilers for each stage
prof_before = PassProfiler("before")
prof_after = PassProfiler("after_commute")
pipeline = SequencePass([
prof_before.as_pass(),
CommuteThroughMultis(),
prof_after.as_pass(),
])
circ = Circuit(4)
for i in range(3):
    circ.CX(i, i + 1)
circ.Rz(0.3, 0).Rz(0.7, 1)
for i in range(3):
    circ.CX(i, i + 1)
pipeline.apply(circ)
print("Profile results:")
for record in prof_before.history + prof_after.history:
    print(f" {record['label']:20s} gates={record['n_gates']} "
          f"depth={record['depth']} 2qb={record['n_2qb']}")
Measuring Pass Effectiveness with Metrics
When building a compilation pipeline, you need to know which passes actually contribute to gate count and depth reduction. The following profiler applies the passes one after another to a copy of the circuit, so each row reflects the cumulative effect up to that stage, and records the metrics after every pass.
from pytket.circuit import Circuit, OpType
from pytket.passes import (
CommuteThroughMultis,
RemoveRedundancies,
SynthesiseTket,
FullPeepholeOptimise,
CliffordSimp,
PeepholeOptimise2Q,
)
import copy
def profile(label, c):
    return {
        "label": label,
        "gates": c.n_gates,
        "depth": c.depth(),
        "2qb": c.n_2qb_gates(),
    }
def build_random_circuit(n_qubits=10, seed=42):
    """Build a realistic test circuit with mixed gate types."""
    import random
    random.seed(seed)
    circ = Circuit(n_qubits)
    for _ in range(60):
        gate_type = random.choice(["cx", "h", "rz", "t", "s", "cx"])
        if gate_type == "cx":
            q1, q2 = random.sample(range(n_qubits), 2)
            circ.CX(q1, q2)
        elif gate_type == "h":
            circ.H(random.randint(0, n_qubits - 1))
        elif gate_type == "rz":
            circ.Rz(random.uniform(0, 2), random.randint(0, n_qubits - 1))
        elif gate_type == "t":
            circ.T(random.randint(0, n_qubits - 1))
        elif gate_type == "s":
            circ.S(random.randint(0, n_qubits - 1))
    return circ
# Build the test circuit
base_circ = build_random_circuit(n_qubits=10)
# Define the passes to benchmark
passes = [
("CommuteThroughMultis", CommuteThroughMultis()),
("RemoveRedundancies", RemoveRedundancies()),
("SynthesiseTket", SynthesiseTket()),
("CliffordSimp", CliffordSimp()),
("PeepholeOptimise2Q", PeepholeOptimise2Q()),
]
# Apply passes cumulatively and record metrics at each stage
results = []
circ = copy.deepcopy(base_circ)
results.append(profile("Initial", circ))
for label, p in passes:
    p.apply(circ)
    results.append(profile(f"After {label}", circ))
# Also benchmark FullPeepholeOptimise as a baseline
circ_full = copy.deepcopy(base_circ)
FullPeepholeOptimise().apply(circ_full)
results.append(profile("FullPeepholeOptimise", circ_full))
# Print results table
print(f"{'Stage':35s} {'Gates':>6s} {'Depth':>6s} {'2Q Gates':>8s}")
print("-" * 60)
for r in results:
    print(f"{r['label']:35s} {r['gates']:6d} {r['depth']:6d} {r['2qb']:8d}")
# Compute per-pass contribution
print("\nPer-pass gate reduction:")
for i in range(1, len(results) - 1): # exclude FullPeepholeOptimise row
    prev = results[i - 1]
    curr = results[i]
    delta = prev["gates"] - curr["gates"] # positive means gates were removed
    print(f" {curr['label']:35s} {delta:+4d} gates removed "
          f"({prev['gates']} -> {curr['gates']})")
This profiling approach helps you identify which passes to keep and which add compilation time without meaningful improvement for your specific circuit family. For example, CliffordSimp is very effective on circuits from fault-tolerant synthesis (which are Clifford-heavy) but may do nothing on variational circuits that are dominated by parameterized rotations.
Profiling Each Stage
To understand where optimization wins come from in a specific pipeline, apply passes individually and record metrics at each step.
from pytket.passes import (
CommuteThroughMultis,
RemoveRedundancies,
SynthesiseTket,
FullPeepholeOptimise,
)
from pytket.circuit import Circuit
import copy
def profile(label, c):
    print(f"{label:35s} gates={c.n_gates:4d} depth={c.depth():4d} "
          f"2q_gates={c.n_2qb_gates():4d}")
# Build a moderately complex circuit
def make_circuit():
    c = Circuit(5)
    for i in range(4):
        c.CX(i, i + 1)
    for i in range(5):
        c.Rz(0.3, i).H(i)
    for i in range(4):
        c.CX(i + 1, i)
    for i in range(5):
        c.Rz(0.7, i)
    return c
stages = [
("CommuteThroughMultis", CommuteThroughMultis()),
("RemoveRedundancies", RemoveRedundancies()),
("SynthesiseTket", SynthesiseTket()),
]
circ = make_circuit()
profile("Initial", circ)
for label, p in stages:
    p.apply(circ)
    profile(f"After {label}", circ)
# Compare against the all-in-one shortcut
circ_full = make_circuit()
FullPeepholeOptimise().apply(circ_full)
profile("FullPeepholeOptimise", circ_full)
This kind of profiling reveals which passes contribute the most reduction for your specific circuit structure. For circuits dominated by Clifford gates, SynthesiseTket tends to dominate. For circuits with many commuting single-qubit gates, CommuteThroughMultis followed by RemoveRedundancies gives the most improvement.
Assembling a Full Pipeline
A production pipeline typically follows this order: synthesize, optimize, rebase, route, clean up.
The ordering matters. Synthesis and optimization should happen first because they reduce the gate count and simplify the circuit structure before routing. Rebasing converts to the target gate set so that routing inserts SWAPs in the correct basis. The final cleanup catches cancellations introduced by SWAP decomposition.
from pytket.passes import (
SequencePass, SynthesiseTket, RemoveRedundancies,
CommuteThroughMultis, CliffordSimp,
)
full_pipeline = SequencePass([
# Phase 1: high-level optimization
CommuteThroughMultis(),
RemoveRedundancies(),
SynthesiseTket(),
CliffordSimp(),
# Phase 2: rebase to hardware native gates
ibm_rebase, # from the earlier example
# Phase 3: routing onto hardware connectivity
mapping_pass, # routing pass defined above
# Phase 4: post-routing cleanup
RemoveRedundancies(),
])
circ_final = make_circuit()
full_pipeline.apply(circ_final)
profile("Full pipeline", circ_final)
The second RemoveRedundancies after routing catches cancellations that routing sometimes introduces via adjacent SWAP decompositions. Running it again is cheap and usually reduces 2-qubit gate count by a few percent.
Common Mistakes
Five pitfalls that frequently cause subtle problems in TKET compilation pipelines:
1. Applying routing before rebase. If you route a circuit that still contains high-level gates (like H or T), the router inserts SWAPs composed of those gates. When you then rebase to the target gate set, each SWAP expands further, potentially doubling the gate count. Always rebase before routing, so that SWAPs are inserted in the native gate set and do not need re-expansion.
# Wrong order: route then rebase
wrong_pipeline = SequencePass([
mapping_pass, # inserts SWAPs as CX triples in TK1/CX basis
ibm_rebase, # re-expands every TK1 into Rz/SX, inflating gate count
])
# Correct order: rebase then route
correct_pipeline = SequencePass([
ibm_rebase, # convert to native gates first
mapping_pass, # SWAPs are now in native CX, no re-expansion needed
RemoveRedundancies(),
])
2. Forgetting that passes mutate circuits in place. If you apply a pass to a circuit and then want to compare it against the original, the original is gone. Always use copy.deepcopy before applying passes when you need to preserve the original for comparison or profiling.
import copy
circ = make_circuit()
circ_backup = copy.deepcopy(circ) # preserve original
SynthesiseTket().apply(circ)
# Now you can compare circ (optimized) vs circ_backup (original)
print(f"Before: {circ_backup.n_gates} gates")
print(f"After: {circ.n_gates} gates")
3. Using FullPeepholeOptimise after a custom rebase. FullPeepholeOptimise internally assumes the TK1 + CX gate basis. If you have already rebased to a different gate set (like Rz + SX + CX for IBM), FullPeepholeOptimise will first convert back to TK1 + CX, optimize, and leave the result in TK1 + CX. You then need to rebase again, which can introduce extra gates. If you use FullPeepholeOptimise, apply it before your custom rebase, not after.
4. Ignoring directed CX constraints. Some hardware only supports CX in one direction (for example, CX(0,1) but not CX(1,0)). If you set directed_cx=False in the routing pass, the compiler may insert CX gates in the wrong direction. The hardware backend then reverses them by wrapping the CX in Hadamards on both qubits, adding 4 extra H gates per reversed CX (one on each qubit before and after the gate). Set directed_cx=True so that the routing pass accounts for directionality itself.
# With directed_cx=True, the router respects hardware CX direction
# and avoids the H-CX-H overhead for reversed gates
mapping_pass_directed = CXMappingPass(
arch,
placement,
directed_cx=True, # respect hardware direction
delay_measures=True,
)
5. Not running RemoveRedundancies after routing. SWAP decomposition breaks each SWAP into 3 CX gates. When two SWAPs are adjacent (which happens at topology bottlenecks), the resulting 6 CX gates often contain cancellable pairs. A single RemoveRedundancies pass after routing typically removes 5-15% of the two-qubit gates introduced by routing. Skipping this step leaves free performance on the table.
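The cancellation described in the last pitfall can be illustrated on a toy gate list. The sketch below is a simplified stand-in and not pytket's RemoveRedundancies: `swap_as_cx` and `cancel_adjacent_pairs` are hypothetical helpers, gates are tuples in a flat list, and only identical neighbouring CX gates are cancelled (a real pass also reasons about gates on disjoint qubits and other gate types).

```python
def swap_as_cx(a, b):
    """Decompose SWAP(a, b) into three CX gates."""
    return [("CX", a, b), ("CX", b, a), ("CX", a, b)]

def cancel_adjacent_pairs(gates):
    """Cancel identical neighbouring CX gates, repeating as new pairs meet."""
    stack = []
    for gate in gates:
        if stack and stack[-1] == gate:
            stack.pop()  # CX is self-inverse, so the pair is the identity
        else:
            stack.append(gate)
    return stack

back_to_back = swap_as_cx(0, 1) + swap_as_cx(0, 1)  # same pair swapped twice
chained = swap_as_cx(0, 1) + swap_as_cx(1, 2)       # swaps along a path

print("back-to-back SWAPs:", len(back_to_back), "->", len(cancel_adjacent_pairs(back_to_back)))
print("chained SWAPs:     ", len(chained), "->", len(cancel_adjacent_pairs(chained)))
```

How much a real cleanup pass removes depends on the circuit and topology, so it is best measured with a profiler on your own circuits.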
Custom pass pipelines are how production quantum software stacks achieve the circuit fidelity needed to run meaningful computations on NISQ hardware. The pytket pass system gives you the composability to experiment with pass ordering without rewriting your circuit construction code.