Advanced Circuit Optimization with tket

Why Advanced Compilation Matters

Circuit depth and two-qubit gate count are the enemies of quantum computation on NISQ hardware, because every additional two-qubit gate adds error. A well-compiled circuit for a 20-qubit VQE might have 40% fewer CNOT gates than a naively assembled one, which can be the difference between a meaningful result and noise.
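A rough back-of-the-envelope model shows why the gate count matters so much. The sketch below assumes every two-qubit gate fails independently with the same probability, so a circuit with n such gates succeeds with probability of about (1 - p)^n. The 1% error rate and the gate counts of 200 and 120 are made-up numbers chosen only for illustration, not measurements from any device.

```python
def estimated_fidelity(n_two_qubit_gates: int, error_per_gate: float) -> float:
    """Crude estimate: every two-qubit gate must succeed independently."""
    return (1.0 - error_per_gate) ** n_two_qubit_gates


# Hypothetical numbers, for illustration only.
error_per_gate = 0.01   # assumed 1% two-qubit gate error
naive_cx = 200          # assumed CNOT count of a naively assembled circuit
optimized_cx = 120      # the same circuit with 40% fewer CNOTs

naive_fidelity = estimated_fidelity(naive_cx, error_per_gate)
optimized_fidelity = estimated_fidelity(optimized_cx, error_per_gate)

print(f"naive:     {naive_fidelity:.3f}")
print(f"optimized: {optimized_fidelity:.3f}")
```

Under these assumptions the optimized circuit keeps more than twice the estimated fidelity of the naive one.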

tket (from Quantinuum) is a backend-agnostic compilation framework built around a composable pass system. Unlike hardware-vendor SDKs that optimize only for their own devices, tket exports to IBM, IonQ, Quantinuum, AWS Braket, Rigetti, and others from a single circuit object.

This tutorial assumes you have pytket installed:

pip install pytket
pip install pytket-qiskit pytket-quantinuum pytket-braket  # backend extensions

(IonQ devices are reached through the pytket-braket or pytket-azure extensions; the old standalone pytket-ionq extension is no longer maintained.)

The Pass System

tket’s compiler is a pipeline of BasePass objects. Each pass transforms the circuit in place. Passes compose with SequencePass (run in order) and RepeatUntilSatisfiedPass (run until a predicate holds).

from pytket.passes import (
    SequencePass,
    CliffordSimp,
    PauliSimp,
    FullPeepholeOptimise,
    DecomposeBoxes,
    SynthesiseTK2,
)
from pytket.circuit import Circuit

circ = Circuit(4)
circ.H(0).CX(0, 1).CX(1, 2).CX(2, 3)
circ.Rz(0.25, 0).CX(0, 1).Rz(-0.25, 1).CX(0, 1)

# Build a custom optimization pipeline
my_pass = SequencePass([
    DecomposeBoxes(),
    CliffordSimp(),
    PauliSimp(),
    FullPeepholeOptimise(),
])

my_pass.apply(circ)
print(f"Two-qubit gate count: {circ.n_2qb_gates()}")
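To make the composition idea concrete without depending on any library, here is a toy model of a pass pipeline in plain Python. It is only a sketch of the pattern, not pytket's implementation: a circuit is a list of gates, a pass is a function that mutates the list and reports whether it changed anything, and sequence and repeat_until are hypothetical stand-ins for SequencePass and RepeatUntilSatisfiedPass. The early exit when a pass makes no progress is a safeguard of this sketch.

```python
from typing import Callable, List, Tuple

Gate = Tuple[str, Tuple[int, ...]]      # e.g. ("CX", (0, 1))
Pass = Callable[[List[Gate]], bool]     # mutates the list, returns True if it changed

SELF_INVERSE = {"H", "X", "CX", "CZ"}


def cancel_adjacent_pair(gates: List[Gate]) -> bool:
    """Remove the first pair of identical, adjacent self-inverse gates."""
    for i in range(len(gates) - 1):
        if gates[i] == gates[i + 1] and gates[i][0] in SELF_INVERSE:
            del gates[i:i + 2]
            return True
    return False


def sequence(passes: List[Pass]) -> Pass:
    """Toy analogue of SequencePass: run each pass once, in order."""
    def run(gates: List[Gate]) -> bool:
        changed = False
        for p in passes:
            changed = p(gates) or changed
        return changed
    return run


def repeat_until(p: Pass, predicate: Callable[[List[Gate]], bool],
                 max_iters: int = 100) -> Pass:
    """Toy analogue of RepeatUntilSatisfiedPass: rerun p until predicate holds."""
    def run(gates: List[Gate]) -> bool:
        changed = False
        for _ in range(max_iters):
            if predicate(gates):
                break
            if not p(gates):    # no progress, stop to avoid spinning
                break
            changed = True
        return changed
    return run


gates = [("H", (0,)), ("CX", (0, 1)), ("CX", (0, 1)), ("H", (0,)), ("X", (1,))]
pipeline = repeat_until(sequence([cancel_adjacent_pair]), lambda g: len(g) <= 1)
changed = pipeline(gates)
print(changed, gates)
```

The first iteration cancels the two CX gates, which makes the two H gates adjacent so the second iteration can cancel them too. This is why repeating a simple pass until a condition holds is often worth it.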

CliffordSimp: Exploiting Clifford Structure

Clifford gates (H, S, CX, CZ, SWAP) generate a group with well-understood algebra. CliffordSimp applies a set of rewrite rules that recognize patterns of Clifford gates, such as pairs of CX gates separated by single-qubit Cliffords, and replaces them with shorter equivalent sequences.

from pytket.circuit import Circuit
from pytket.passes import CliffordSimp

circ = Circuit(3)
# Redundant Clifford structure
circ.H(0).CX(0, 1).H(0).H(0).CX(0, 1).H(0)
circ.CX(1, 2).CX(2, 1).CX(1, 2)   # SWAP via 3 CNOTs

before = circ.n_gates
CliffordSimp().apply(circ)
after = circ.n_gates

print(f"Gates before: {before}, after: {after}")

CliffordSimp can reduce circuits dominated by Clifford gates dramatically, sometimes eliminating entire blocks that cancel to identity.
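The two patterns in the example can be checked by hand with small matrices. The sketch below uses plain Python lists rather than pytket, restricts attention to two qubits (so the pair 1, 2 from the SWAP line is relabelled 0, 1), and treats qubit 0 as the most significant bit. It confirms that the first block multiplies out to the identity and that three alternating CNOTs equal a SWAP.

```python
import math
from functools import reduce


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


s = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
H = [[s, s], [s, -s]]
I4 = kron(I2, I2)
H0 = kron(H, I2)                                  # H on qubit 0
CX01 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]  # control 0
CX10 = [[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]]  # control 1
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]


def circuit_unitary(gate_matrices):
    """Multiply gate matrices; the first gate in the list acts first."""
    return reduce(lambda acc, g: matmul(g, acc), gate_matrices, I4)


redundant_block = circuit_unitary([H0, CX01, H0, H0, CX01, H0])
three_cnots = circuit_unitary([CX01, CX10, CX01])

print("block is identity:", close(redundant_block, I4))
print("three CNOTs are a SWAP:", close(three_cnots, SWAP))
```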

PauliSimp: Optimizing Pauli Exponentials

Many quantum chemistry and QAOA circuits are built from Pauli exponential terms: exp(-i * theta/2 * P), where P is a tensor product of Pauli operators. PauliSimp converts the circuit into a sequence of these exponentials, merges and reorders commuting terms, and then re-synthesizes them into a more compact circuit, typically with shorter CNOT ladders.

from pytket.circuit import Circuit
from pytket.passes import PauliSimp

circ = Circuit(4)
# Simulate a chemistry-inspired circuit
circ.H(0)
circ.CX(0, 1).Rz(0.3, 1).CX(0, 1)
circ.CX(1, 2).Rz(0.5, 2).CX(1, 2)
circ.CX(2, 3).Rz(0.7, 3).CX(2, 3)
circ.H(0)

before_2q = circ.n_2qb_gates()
PauliSimp().apply(circ)
after_2q = circ.n_2qb_gates()

print(f"Two-qubit gates before: {before_2q}, after: {after_2q}")

PauliSimp is especially powerful for variational ansatze built from UCCSD (unitary coupled cluster with singles and doubles) excitation operators, where it often cuts the two-qubit gate count by half or more.
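The step of recombining commuting terms can be sketched in a few lines of plain Python. This is a simplified illustration and not tket's actual algorithm: each term is a hypothetical (Pauli string, angle) pair, two strings commute when they clash on an even number of positions, and a term is folded into an earlier term with the same string only if it commutes with everything in between.

```python
from typing import List, Tuple

Term = Tuple[str, float]   # (Pauli string such as "ZZI", rotation angle)


def commute(p: str, q: str) -> bool:
    """Pauli strings commute iff they hold different non-identity
    letters on an even number of positions."""
    clashes = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0


def merge_terms(terms: List[Term]) -> List[Term]:
    """Fold each term into an earlier one with the same Pauli string,
    provided it commutes with every term in between."""
    merged: List[Term] = []
    for pauli, angle in terms:
        for i in range(len(merged) - 1, -1, -1):
            if merged[i][0] == pauli:
                merged[i] = (pauli, merged[i][1] + angle)   # angles add up
                break
            if not commute(merged[i][0], pauli):
                merged.append((pauli, angle))               # blocked, keep separate
                break
        else:
            merged.append((pauli, angle))
    return merged


terms = [("ZZI", 0.3), ("IZZ", 0.5), ("ZZI", -0.1), ("XII", 0.2), ("ZZI", 0.4)]
merged = merge_terms(terms)
for pauli, angle in merged:
    print(pauli, round(angle, 3))
```

The second ZZI term slides past IZZ and merges with the first, while the last ZZI term stays separate because XII anticommutes with it. Fewer exponentials means fewer CNOT ladders to synthesize.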

KAK Decomposition vs SynthesiseTket

The KAK decomposition is the canonical way to decompose an arbitrary two-qubit unitary into at most 3 CNOT gates plus single-qubit rotations. tket’s SynthesiseTK2 pass applies KAK to decompose arbitrary two-qubit unitaries into TK2 gates (tket’s native parameterized two-qubit gate), then a backend-specific pass converts TK2 into hardware-native gates.

SynthesiseTket is the general-purpose synthesis pass that decomposes circuits into the {CX, TK1} basis with peephole optimizations. (Older docs mention a SynthesiseIBM pass; it was removed in pytket 1.0, and SynthesiseTket plus a rebase is the current equivalent.)
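As a small worked example of what a TK2 gate looks like, the sketch below builds its matrix in plain Python. It assumes the convention given in the pytket documentation for OpType.TK2, namely TK2(a, b, c) = XXPhase(a) YYPhase(b) ZZPhase(c) with angles in half-turns, where each factor is exp(-i * pi * angle / 2 * P⊗P). With all three angles set to 0.5 the result is a SWAP up to a global phase, which is the unitary used in the comparison that follows.

```python
import cmath
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]


def pauli_phase(pauli, half_turns):
    """exp(-i*t*(P⊗P)) = cos(t)*I - i*sin(t)*(P⊗P), since (P⊗P)^2 = I."""
    t = math.pi * half_turns / 2
    pp = kron(pauli, pauli)
    return [[math.cos(t) * (1 if i == j else 0) - 1j * math.sin(t) * pp[i][j]
             for j in range(4)] for i in range(4)]


def tk2(a, b, c):
    # The three factors commute, so the multiplication order is irrelevant.
    return matmul(pauli_phase(X, a), matmul(pauli_phase(Y, b), pauli_phase(Z, c)))


phase = cmath.exp(-1j * math.pi / 4)
expected = [[phase * entry for entry in row] for row in SWAP]
swap_like = tk2(0.5, 0.5, 0.5)

print("TK2(0.5, 0.5, 0.5) is SWAP up to phase:", close(swap_like, expected))
```

The three angles are the interaction coefficients that a KAK decomposition extracts from a two-qubit unitary; everything else is absorbed into single-qubit gates.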

import numpy as np
from pytket.circuit import Circuit, Unitary2qBox
from pytket.passes import DecomposeBoxes, SynthesiseTK2, SynthesiseTket

def count_2q(circ):
    return circ.n_2qb_gates()

# SWAP written as an explicit 4x4 unitary
SWAP_MATRIX = np.array(
    [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=complex
)

# Generic KAK via TK2
circ1 = Circuit(2)
circ1.add_unitary2qbox(Unitary2qBox(SWAP_MATRIX), 0, 1)
DecomposeBoxes().apply(circ1)   # turn the box into concrete gates
SynthesiseTK2().apply(circ1)
print(f"TK2 path, 2q gates: {count_2q(circ1)}")

# Same unitary, synthesised into the {CX, TK1} basis
circ2 = Circuit(2)
circ2.add_unitary2qbox(Unitary2qBox(SWAP_MATRIX), 0, 1)
DecomposeBoxes().apply(circ2)
SynthesiseTket().apply(circ2)
print(f"CX path, 2q gates: {count_2q(circ2)}")

For circuits targeting IBM backends, SynthesiseTket followed by FullPeepholeOptimise and a final rebase into the device gate set generally gives the best results. For other backends, start with SynthesiseTK2 and then apply the backend-specific rebase pass.

Routing with Architecture Graphs

Real devices only allow two-qubit gates between physically connected qubit pairs. tket’s routing system inserts SWAP gates to move qubit state to adjacent positions. You describe the topology as an Architecture object.

from pytket.architecture import Architecture
from pytket.passes import RoutingPass
from pytket.placement import GraphPlacement
from pytket.circuit import Circuit

# Define a linear 5-qubit chain topology: 0-1-2-3-4
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
arch = Architecture(edges)

circ = Circuit(5)
circ.CX(0, 2)   # not directly connected -- needs routing
circ.CX(1, 4)   # also needs routing
circ.CX(0, 4)

# GraphPlacement minimizes SWAP overhead by graph matching
GraphPlacement(arch).place(circ)
RoutingPass(arch).apply(circ)

print(f"Circuit after routing: {circ.n_2qb_gates()} two-qubit gates")

GraphPlacement looks for a subgraph match between the circuit's qubit interaction graph and the device's coupling graph to choose an initial mapping of logical to physical qubits before routing begins. A good initial placement can substantially reduce the number of SWAPs inserted.
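The effect of the initial placement is easy to see with a deliberately naive router written in plain Python. This is only a sketch and not tket's routing algorithm: it walks the control qubit along a shortest path until it sits next to the target, and it assumes a connected coupling graph. The function names and the hand-picked placement are hypothetical.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Breadth-first search over an undirected, connected coupling graph."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for nxt in neighbours.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    path, node = [], goal
    while node is not None:
        path.append(node)
        node = previous[node]
    return path[::-1]


def naive_route(edges, gates, placement):
    """Return (routed gate list, SWAP count) for a logical -> physical placement."""
    position = dict(placement)
    routed, swaps = [], 0
    for control, target in gates:
        path = shortest_path(edges, position[control], position[target])
        # Swap the control along the path until it is adjacent to the target.
        for a, b in zip(path[:-2], path[1:-1]):
            routed.append(("SWAP", a, b))
            swaps += 1
            occupant = {p: q for q, p in position.items()}
            qa, qb = occupant.get(a), occupant.get(b)
            if qa is not None:
                position[qa] = b
            if qb is not None:
                position[qb] = a
        routed.append(("CX", position[control], position[target]))
    return routed, swaps


edges = [(0, 1), (1, 2), (2, 3), (3, 4)]      # linear chain 0-1-2-3-4
gates = [(0, 2), (1, 4), (0, 4)]              # logical CX gates from the example

identity = {q: q for q in range(5)}
hand_picked = {3: 0, 2: 1, 0: 2, 4: 3, 1: 4}  # logical chain 2-0-4-1 laid on the line

routed_identity, swaps_identity = naive_route(edges, gates, identity)
routed_picked, swaps_picked = naive_route(edges, gates, hand_picked)
print("SWAPs with identity placement:", swaps_identity)
print("SWAPs with hand-picked placement:", swaps_picked)
```

The interacting pairs (0, 2), (0, 4) and (1, 4) form a chain, so a placement that lays that chain along the device needs no SWAPs at all, while the identity placement forces this naive router to insert several.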

Noise-Aware Placement

NoiseAwarePlacement goes further by using device calibration data (gate error rates and readout errors) to place qubits on the highest-fidelity physical locations.

from pytket.placement import NoiseAwarePlacement
from pytket.backends.backendinfo import BackendInfo
from pytket.architecture import Architecture

# In practice you get BackendInfo from a backend object, e.g.:
# from pytket.extensions.qiskit import IBMQBackend
# backend = IBMQBackend("ibm_torino")   # any currently available IBM device
# backend_info = backend.backend_info

# For illustration, build only the Architecture by hand
arch = Architecture([(0,1),(1,2),(2,3),(3,4)])

# With a real backend (circ is the circuit to place):
# NoiseAwarePlacement(
#     arch,
#     node_errors=backend_info.averaged_node_gate_errors,
#     link_errors=backend_info.averaged_edge_gate_errors,
#     readout_errors=backend_info.averaged_readout_errors,
# ).place(circ)

# Noise-aware placement chooses physical qubits with lowest error rates
# and maps entangling operations to the highest-fidelity links

On a real IBM backend, noise-aware placement can improve circuit fidelity by 10-30% over naive placement for medium-depth circuits, though the gain depends on how unevenly the error rates are spread across the device.
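The idea behind noise-aware placement can be illustrated with a brute-force search in plain Python. This is a sketch of the principle rather than the algorithm tket uses, and the link error rates below are invented calibration values for the five-qubit chain. Each candidate mapping is scored by the probability that at least one two-qubit gate fails, and mappings that would put a gate on unconnected qubits are discarded.

```python
from itertools import permutations


def placement_error(gates, mapping, link_errors):
    """Estimated failure probability of the two-qubit gates under a mapping.
    Returns None if some gate would land on unconnected physical qubits."""
    success = 1.0
    for a, b in gates:
        link = frozenset((mapping[a], mapping[b]))
        if link not in link_errors:
            return None
        success *= 1.0 - link_errors[link]
    return 1.0 - success


# Hypothetical calibration data for the chain 0-1-2-3-4.
link_errors = {
    frozenset((0, 1)): 0.030,
    frozenset((1, 2)): 0.008,
    frozenset((2, 3)): 0.006,
    frozenset((3, 4)): 0.025,
}
gates = [(0, 1), (1, 2), (0, 1)]   # logical two-qubit gates on three qubits

candidates = []
for physical in permutations(range(5), 3):
    mapping = dict(enumerate(physical))        # logical -> physical
    error = placement_error(gates, mapping, link_errors)
    if error is not None:
        candidates.append((error, mapping))

best_error, best_mapping = min(candidates, key=lambda c: c[0])
naive_error = placement_error(gates, {0: 0, 1: 1, 2: 2}, link_errors)

print(f"naive placement error: {naive_error:.4f}")
print(f"best placement error:  {best_error:.4f} using {best_mapping}")
```

The best mapping puts the pair that interacts twice on the lowest-error link. Exhaustive search only works for tiny examples like this one, which is why real placement methods rely on heuristics.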

Custom Pass Pipelines

For production workloads, assemble a full pipeline tailored to your target backend:

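from pytket.passes import (
    SequencePass,
    DecomposeBoxes,
    PauliSimp,
    CliffordSimp,
    FullPeepholeOptimise,
    KAKDecomposition,
    AutoRebase,
    SynthesiseTket,
    DefaultMappingPass,
    PlacementPass,
    RoutingPass,
)
from pytket.circuit import OpType
from pytket.placement import NoiseAwarePlacement

# IBM Heron gate set; Eagle devices use ECR and older Falcon devices use CX instead of CZ
IBM_GATESET = {OpType.CZ, OpType.Rz, OpType.SX, OpType.X}

def compile_for_ibm(circ, arch, backend_info=None):
    if backend_info is None:
        mapping = DefaultMappingPass(arch)  # handles placement + routing
    else:
        # Use calibration data for the initial placement when it is available
        placer = NoiseAwarePlacement(
            arch,
            node_errors=backend_info.averaged_node_gate_errors or {},
            link_errors=backend_info.averaged_edge_gate_errors or {},
            readout_errors=backend_info.averaged_readout_errors or {},
        )
        mapping = SequencePass([PlacementPass(placer), RoutingPass(arch)])

    pipeline = SequencePass([
        DecomposeBoxes(),
        PauliSimp(),
        CliffordSimp(),
        SynthesiseTket(),
        FullPeepholeOptimise(),
        mapping,
        KAKDecomposition(allow_swaps=False),  # clean up after routing
        CliffordSimp(allow_swaps=False),      # without moving gates off connected pairs
        AutoRebase(IBM_GATESET),   # rebase last so the gate set is preserved
    ])
    pipeline.apply(circ)
    return circ

Optimizing twice (FullPeepholeOptimise before routing to reduce gates, then KAKDecomposition and CliffordSimp with allow_swaps=False after routing to clean up the inserted SWAPs) is a common pattern that pays off in lower final gate counts. The post-routing passes are limited to ones that keep every two-qubit gate on a connected pair: FullPeepholeOptimise resynthesizes three-qubit blocks, so it is not guaranteed to respect the device connectivity once the circuit has been routed.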

Backend-Agnostic Portability

The same optimized Circuit object can be dispatched to different backends by applying the appropriate rebase pass at the end. Modern pytket uses a single AutoRebase pass parameterized by the target gate set (the hardware-specific passes like RebaseIBM that appear in older docs were removed in pytket 1.0):

from pytket.passes import AutoRebase
from pytket.circuit import Circuit, OpType

circ = Circuit(3)
circ.H(0).CX(0,1).CX(1,2).Rz(0.5, 2)

# For IBM Heron devices (CZ + RZ, SX, X basis)
circ_ibm = circ.copy()
AutoRebase({OpType.CZ, OpType.Rz, OpType.SX, OpType.X}).apply(circ_ibm)

# For Quantinuum H-series (ZZPhase + PhasedX, Rz basis)
circ_h = circ.copy()
AutoRebase({OpType.ZZPhase, OpType.PhasedX, OpType.Rz}).apply(circ_h)

print("IBM gates:", circ_ibm.n_gates)
print("Quantinuum gates:", circ_h.n_gates)

In practice you rarely build the rebase by hand: every backend extension provides backend.default_compilation_pass(), which ends with the correct rebase for that device (including IonQ devices reached through Braket or Azure).

This backend-agnostic workflow is one of tket’s most valuable properties for teams that run on multiple hardware platforms or benchmark across providers.

Comparing Optimization Passes

A practical comparison for a 6-qubit QAOA circuit:

Pass Combination	2Q Gate Count	Depth
No optimization	48	62
CliffordSimp only	38	51
PauliSimp only	29	40
PauliSimp + CliffordSimp	24	35
Full pipeline (above)	19	28

Numbers are illustrative and circuit-dependent, but the trend is consistent: combining Pauli-level and Clifford-level optimization before routing, then cleaning up after routing, beats any single pass by a significant margin.
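The relative savings implied by the table can be worked out with a few lines of Python. The sketch below only restates the illustrative numbers from the table above as percentages of the unoptimized baseline; it does not measure anything.

```python
# (two-qubit gate count, depth) copied from the table above
results = {
    "No optimization": (48, 62),
    "CliffordSimp only": (38, 51),
    "PauliSimp only": (29, 40),
    "PauliSimp + CliffordSimp": (24, 35),
    "Full pipeline": (19, 28),
}

baseline_2q, baseline_depth = results["No optimization"]

reductions = {}
for name, (two_q, depth) in results.items():
    gate_saving = 100 * (baseline_2q - two_q) / baseline_2q
    depth_saving = 100 * (baseline_depth - depth) / baseline_depth
    reductions[name] = (gate_saving, depth_saving)
    print(f"{name:26s} 2Q gates -{gate_saving:4.1f}%   depth -{depth_saving:4.1f}%")
```

For these numbers the full pipeline removes about 60% of the two-qubit gates, compared with about 40% for PauliSimp alone.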

Next Steps

Use pytket-quantinuum to submit to H-series trapped-ion hardware, which natively supports ZZPhase gates for efficient Hamiltonian simulation.
Explore pytket-braket for AWS Braket devices (including IonQ, IQM, and Rigetti hardware).
For error mitigation on top of compiled circuits, combine tket with Mitiq; tket handles compilation, Mitiq wraps execution with zero-noise extrapolation or probabilistic error cancellation.