Circuit Optimisation with tket

What tket Does

tket (accessed via the pytket Python package) is a hardware-agnostic quantum compiler developed by Quantinuum. Unlike Qiskit’s transpiler, which is tightly integrated with IBM hardware, tket is designed as a standalone compilation layer that can accept circuits from multiple frameworks (Qiskit, Cirq, PyQuil) and target multiple backends (IBM, IonQ, Quantinuum, Rigetti, and simulators).

The core value of tket is its optimization passes. These are composable transformations that reduce circuit depth and gate count without changing the unitary the circuit implements (up to global phase). Because fewer gates generally mean less accumulated noise on real hardware, optimization directly improves result quality.

Key capabilities:

Gate count reduction via peephole optimization and commutation rules
Hardware-aware qubit routing with SWAP insertion
Native gate decomposition for specific hardware gate sets
Circuit rebalancing to reduce critical path depth

Install

pip install pytket pytket-qiskit

pytket is the core package. pytket-qiskit provides converters between tket’s internal Circuit representation and Qiskit’s QuantumCircuit, plus the ability to target IBM backends directly from tket.

Importing a Qiskit Circuit into tket

Start with a Qiskit circuit and convert it to a tket Circuit:

from qiskit import QuantumCircuit
from pytket.extensions.qiskit import qiskit_to_tk, tk_to_qiskit

# Build a Qiskit circuit: a 4-qubit circuit with some redundancy
qc = QuantumCircuit(4)
qc.h(0)
qc.cx(0, 1)
qc.cx(1, 2)
qc.cx(2, 3)
qc.cx(2, 3)   # Immediately repeated: cancels to identity
qc.cx(1, 2)   # Same: cancels
qc.rz(1.5708, 0)   # pi/2 rotation
qc.rz(1.5708, 0)   # Two pi/2 rotations = one pi rotation (reducible)
qc.h(2)
qc.cx(1, 2)
qc.h(2)        # H on the CX target, CX, H on the target = CZ: tket can detect this pattern

print("Qiskit circuit:")
print(qc.draw())
print(f"Gate count (Qiskit): {qc.size()}")
print(f"Depth (Qiskit): {qc.depth()}")

# Convert to tket Circuit
tk_circuit = qiskit_to_tk(qc)
print(f"\ntket gate count (before opt): {tk_circuit.n_gates}")
print(f"tket depth (before opt): {tk_circuit.depth()}")

Applying Optimization Passes

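A tket pass is applied by calling its apply method, for example some_pass.apply(circuit). The call rewrites the circuit in place and returns True if the pass changed anything. Passes can be composed into a SequencePass for multi-stage optimization.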

from pytket.passes import (
    SequencePass,
    FullPeepholeOptimise,
    DecomposeBoxes,
    RemoveRedundancies,
    CommuteThroughMultis,
)

# Stage 1: Decompose any high-level box operations into primitive gates
# Stage 2: Remove redundant gate pairs (CX CX = I, etc.) and merge adjacent rotations
# Stage 3: Commute single-qubit gates past multi-qubit gates to expose more cancellations
# Stage 4: Full peephole optimisation: resynthesise small subcircuits into TK1 + CX gates
# Stage 5: Remove any redundancies that remain after the peephole stage
optimisation_pass = SequencePass([
    DecomposeBoxes(),
    RemoveRedundancies(),
    CommuteThroughMultis(),
    FullPeepholeOptimise(),
    RemoveRedundancies(),    # Run again after peephole may have created new redundancies
])

# Apply the pass in-place
optimisation_pass.apply(tk_circuit)

print(f"\ntket gate count (after opt): {tk_circuit.n_gates}")
print(f"tket depth (after opt): {tk_circuit.depth()}")

# Convert back to Qiskit to compare
qc_optimised = tk_to_qiskit(tk_circuit)
print(f"\nQiskit gate count (after tket opt): {qc_optimised.size()}")
print(f"Qiskit depth (after tket opt): {qc_optimised.depth()}")

Approximate results for the example circuit above (exact numbers can vary with the pytket version):

Metric	Before	After
Gate count	11	5
Circuit depth	9	4
Two-qubit gates	6	2

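RemoveRedundancies cancels the repeated CX pairs: once the two CX(2, 3) gates are removed, the two CX(1, 2) gates become adjacent and cancel as well. The back-to-back Rz rotations are merged into a single Rz. An H-CX-H pattern with the Hadamards on the CX target is equivalent to a CZ. FullPeepholeOptimise does not emit a literal CZ, though: it resynthesises the circuit into TK1 and CX gates (CX is its default two-qubit target), and a later rebase can map the result to CZ on backends where that gate is native.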
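The gate identities that these cancellations rely on can be checked numerically without any quantum library. The following is a minimal sketch using plain nested lists; the helper names (`matmul`, `kron`, `close`, `rz`) are assumptions made for this illustration and are unrelated to pytket's internals. It also shows that H-CX-H equals CZ only when the Hadamards act on the CX target.

```python
import cmath
import math

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(a, b):
    """Kronecker product; the first factor acts on the first (control) qubit."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def close(a, b, tol=1e-12):
    """Entry-wise comparison of two matrices."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def rz(theta):
    """Rz rotation with the angle in radians."""
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]

s = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
H = [[s, s], [s, -s]]
I4 = kron(I2, I2)
CX = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]  # control = first qubit
CZ = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]

cx_pair_cancels = close(matmul(CX, CX), I4)
rz_merges = close(matmul(rz(math.pi / 2), rz(math.pi / 2)), rz(math.pi))

h_target = kron(I2, H)    # Hadamard on the CX target
h_control = kron(H, I2)   # Hadamard on the CX control
target_is_cz = close(matmul(h_target, matmul(CX, h_target)), CZ)
control_is_cz = close(matmul(h_control, matmul(CX, h_control)), CZ)

print(cx_pair_cancels, rz_merges, target_is_cz, control_is_cz)
```

The last value printed is False: conjugating the control with Hadamards gives a different two-qubit operation, so where the Hadamards sit matters when reading a circuit for this pattern.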

Gate Count Breakdown

To understand what tket produces in detail, inspect the gate type histogram:

from pytket.circuit import OpType

def gate_count_by_type(circuit):
    """Return a dict of gate type to count."""
    counts = {}
    for cmd in circuit.get_commands():
        op_str = str(cmd.op.type)
        counts[op_str] = counts.get(op_str, 0) + 1
    return counts

before_counts = gate_count_by_type(qiskit_to_tk(qc))   # Re-import original
after_counts = gate_count_by_type(tk_circuit)

print("Gate type breakdown before optimization:")
for gate, count in sorted(before_counts.items()):
    print(f"  {gate}: {count}")

print("\nGate type breakdown after optimization:")
for gate, count in sorted(after_counts.items()):
    print(f"  {gate}: {count}")

Backend-Specific Compilation for IBM Hardware

For real hardware, you also need to decompose gates into the backend’s native gate set and route qubits to respect the hardware connectivity. tket handles both:

from pytket.extensions.qiskit import IBMQBackend

# Connect to an IBM backend (requires IBM Quantum credentials)
# Device names change as IBM retires hardware (ibm_brisbane was retired in
# November 2025), so check your account's device list for a current name
# backend = IBMQBackend("ibm_pittsburgh")

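# To emulate a specific device locally with its noise model, use the emulator backend
# (it still needs IBM Quantum credentials to fetch the device data)
from pytket.extensions.qiskit import IBMQEmulatorBackend
# backend = IBMQEmulatorBackend("ibm_pittsburgh")

# If you have a backend object:
# compiled_circuit = backend.get_compiled_circuit(tk_circuit, optimisation_level=2)
# optimisation_level 0: only rebase and route to satisfy the device constraints,
# 1: light optimisation, 2: heavier optimisation (the default)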

# Without a live backend, show the manual compilation pass for an IBM-style gate set
# Current IBM Heron QPUs natively use CZ, RZ, SX, X (older Eagle devices used ECR);
# here we target a CX-based set and let AutoRebase convert the circuit to it.
# Note: older docs show a RebaseIBM pass, which has been removed from pytket.
from pytket.circuit import OpType
from pytket.passes import (
    AutoRebase,
    DecomposeBoxes,
    FullPeepholeOptimise,
    PlacementPass,
    RemoveRedundancies,
    RoutingPass,
    SequencePass,
)
from pytket.placement import GraphPlacement
from pytket.architecture import Architecture

# Define a simple linear 4-qubit connectivity (like a subset of IBM devices)
arch = Architecture([(0, 1), (1, 2), (2, 3)])

# Build the IBM-targeted compilation sequence
ibm_pass = SequencePass([
    DecomposeBoxes(),
    FullPeepholeOptimise(),
    PlacementPass(GraphPlacement(arch)),
    RoutingPass(arch),
    AutoRebase({OpType.CX, OpType.Rz, OpType.SX, OpType.X}),
    RemoveRedundancies(),
])

# Apply on a fresh copy of the original circuit
tk_for_ibm = qiskit_to_tk(qc)
ibm_pass.apply(tk_for_ibm)

print(f"Gate count (IBM compiled): {tk_for_ibm.n_gates}")
print(f"Depth (IBM compiled):      {tk_for_ibm.depth()}")
print(f"CX count (IBM compiled):   {tk_for_ibm.n_gates_of_type(OpType.CX)}")

Routing and SWAP Insertion

Real quantum hardware has limited qubit connectivity. For instance, in IBM’s heavy-hex layout each qubit is coupled to at most three neighbours, and two-qubit gates are only available between coupled qubits. If your circuit requires a two-qubit gate between non-adjacent qubits, the compiler must insert SWAP gates to move qubit states along the device graph.

Mapping a circuit onto a device with tket involves three stages:

Placement: choose an initial assignment of logical to physical qubits that aims to minimize the number of SWAPs needed later (PlacementPass).
Routing: insert SWAP gates wherever the circuit still requires a two-qubit gate between non-adjacent qubits (RoutingPass).
Cleanup: apply further optimizations to cancel or simplify gates introduced during routing.

The GraphPlacement strategy places logical qubits onto physical qubits by solving a subgraph isomorphism between the circuit’s interaction graph and the device coupling map. For circuits with regular structure (such as linear chains or grids), this placement is near-optimal.
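To make the placement idea concrete, here is a toy brute-force search over all logical-to-physical assignments. It is a stand-in for illustration only, not GraphPlacement's algorithm, and it is only feasible for a handful of qubits. The function names and the example interaction graphs are assumptions.

```python
from itertools import permutations

def placement_cost(mapping, interactions, edges):
    """Count interacting logical pairs that are not adjacent on the device."""
    undirected = {frozenset(e) for e in edges}
    return sum(
        1
        for a, b in interactions
        if frozenset((mapping[a], mapping[b])) not in undirected
    )

def best_placement(n_qubits, interactions, edges):
    """Exhaustively search maps where mapping[logical] = physical."""
    best = None
    for perm in permutations(range(n_qubits)):
        cost = placement_cost(perm, interactions, edges)
        if best is None or cost < best[0]:
            best = (cost, perm)
    return best

line = [(0, 1), (1, 2), (2, 3)]

# A chain-shaped interaction graph (0-2, 2-1, 1-3) embeds in the line exactly
chain_cost, chain_map = best_placement(4, [(0, 2), (2, 1), (1, 3)], line)

# A star-shaped graph cannot: the centre needs three neighbours
star_cost, star_map = best_placement(4, [(0, 1), (0, 2), (0, 3)], line)

print("chain:", chain_cost, chain_map)
print("star: ", star_cost, star_map)
```

A cost of zero means the interaction graph is a subgraph of the device graph, so no SWAPs are needed. A non-zero cost means routing has to make up the difference.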

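Each SWAP decomposes into three CX gates on a CX-based gate set (or three CZ gates plus single-qubit gates on CZ-native devices), so routing quality has a significant effect on circuit depth and two-qubit gate count.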
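The three-CX decomposition of SWAP can be confirmed on computational basis states, since both SWAP and CX are permutations of that basis. The sketch below also includes a deliberately naive overhead estimate for a linear device. The function names are assumptions, and the estimate is not how tket's router counts cost.

```python
def cx(bits, control, target):
    """Apply a CNOT to a tuple of classical bits (valid for basis states)."""
    out = list(bits)
    out[target] ^= out[control]
    return tuple(out)

def swap_via_three_cx(bits, a, b):
    """SWAP(a, b) written as CX(a, b), CX(b, a), CX(a, b)."""
    return cx(cx(cx(bits, a, b), b, a), a, b)

# Agreement on every basis state is enough to establish the identity,
# because both sides are permutation matrices
swap_identity_holds = all(
    swap_via_three_cx(state, 0, 1) == (state[1], state[0])
    for state in [(0, 0), (0, 1), (1, 0), (1, 1)]
)

def naive_routing_overhead(distance, cx_per_swap=3):
    """Extra CX gates to bring two qubits `distance` edges apart next to each other."""
    swaps = max(distance - 1, 0)
    return swaps * cx_per_swap

print(swap_identity_holds)
print(naive_routing_overhead(3))
```

Under this simple model, a single CX between the two ends of a 4-qubit line costs six extra CX gates, which is why a good initial placement pays off.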

Roundtrip Execution

After tket optimization, convert back to Qiskit for execution on Qiskit-based simulators or IBM hardware:

from qiskit_aer import AerSimulator
from qiskit import transpile

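# Convert tket circuit back to Qiskit
# FullPeepholeOptimise may leave implicit wire swaps; make them explicit so
# each measured qubit keeps its original meaning
qc_from_tket = tk_to_qiskit(tk_circuit, replace_implicit_swaps=True)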

# Add measurements
qc_from_tket.measure_all()

# Run on Aer
sim = AerSimulator()
compiled = transpile(qc_from_tket, sim)
result = sim.run(compiled, shots=1024).result()
counts = result.get_counts()
print("Measurement results:", counts)

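The roundtrip preserves the circuit’s behaviour for standard gate sets, although the gate sequence that comes back is generally not the one that went in. tket stores rotation angles in half-turns (multiples of π) whereas Qiskit uses radians, so the converters rescale every angle. Numeric angles can therefore differ at the level of floating-point rounding, which is negligible compared with hardware noise.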
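The unit difference between the two frameworks is easy to see in isolation: Qiskit takes angles in radians, while pytket takes them in half-turns. The sketch below mimics only that unit convention with two hypothetical helper functions. It is not the conversion code used by pytket-qiskit.

```python
import math

def radians_to_half_turns(theta):
    """Radians (Qiskit convention) to half-turns (pytket convention)."""
    return theta / math.pi

def half_turns_to_radians(turns):
    """Half-turns back to radians."""
    return turns * math.pi

qiskit_angle = 1.5708                  # the rounded pi/2 literal from the example circuit
turns = radians_to_half_turns(qiskit_angle)
merged = 2 * turns                     # two such rotations merged into one
round_trip = half_turns_to_radians(turns)

print("half-turns:", turns)
print("merged half-turns:", merged)
print("round-trip error:", abs(round_trip - qiskit_angle))
```

The literal 1.5708 is close to, but not exactly, π/2, so the merged rotation is close to, but not exactly, one half-turn. Using `math.pi / 2` in the Qiskit circuit avoids that small discrepancy.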

Comparing tket vs Qiskit Transpiler

For general-purpose optimization of circuits that will run on IBM hardware:

Feature	tket	Qiskit transpiler
Framework support	Qiskit, Cirq, PyQuil, more	Qiskit native
Peephole optimization	FullPeepholeOptimise	optimization_level=3
Placement and routing	GraphPlacement, LinePlacement, RoutingPass	SabreLayout, SabreSwap, BasicSwap
Custom pass composition	SequencePass, RepeatPass	PassManager
Native gate targets	IBM, IonQ, Quantinuum, more	IBM (primary)

For circuits that will run on non-IBM hardware, tket is often the better choice because it has first-class support for hardware-native gate sets (like Quantinuum’s or IonQ’s native gates) without needing framework-specific transpilers.

For pure IBM workflows, Qiskit’s transpile with optimization_level=3 and tket’s FullPeepholeOptimise often produce comparable gate counts, so it is worth benchmarking both on your own circuits. Any difference tends to be largest for deep circuits with many two-qubit gates.

Summary

Step	tket API
Import from Qiskit	`qiskit_to_tk(qc)`
Remove redundant gates	`RemoveRedundancies().apply(c)`
Full peephole optimization	`FullPeepholeOptimise().apply(c)`
Place qubits on hardware	`PlacementPass(GraphPlacement(arch)).apply(c)`
Route and insert SWAPs	`RoutingPass(arch).apply(c)`
Rebase to a hardware gate set	`AutoRebase({OpType.CX, OpType.Rz, OpType.SX, OpType.X}).apply(c)`
Export to Qiskit	`tk_to_qiskit(c)`
Count gates by type	`c.n_gates_of_type(OpType.CX)`

tket’s composable pass system makes it straightforward to mix and match optimization strategies and target different hardware backends from the same circuit definition. For teams running experiments across multiple hardware providers, this hardware-agnostic approach can significantly reduce the porting overhead.