Circuit Optimisation with tket
How to use tket (pytket) to optimise quantum circuits, reduce gate counts, and compile for specific hardware backends without rewriting your circuits.
What tket Does
tket (accessed via the pytket Python package) is a hardware-agnostic quantum compiler developed by Quantinuum. Unlike Qiskit’s transpiler, which is tightly integrated with IBM hardware, tket is designed as a standalone compilation layer that can accept circuits from multiple frameworks (Qiskit, Cirq, PyQuil) and target multiple backends (IBM, IonQ, Quantinuum, Rigetti, and simulators).
The core value of tket is its optimization passes. These are composable transformations that reduce circuit depth and gate count without changing the circuit’s logical operation. Because fewer gates mean less accumulated noise on real hardware, optimization directly improves result quality.
Key capabilities:
- Gate count reduction via peephole optimization and commutation rules
- Hardware-aware qubit routing with SWAP insertion
- Native gate decomposition for specific hardware gate sets
- Circuit rebalancing to reduce critical path depth
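To make the first capability concrete, here is a toy sketch of peephole cancellation in plain Python. It is not how tket is implemented. The gate representation (a `(name, qubits)` tuple) and the helper `cancel_adjacent_inverses` are hypothetical names chosen for illustration. The sketch removes a self-inverse gate when the most recent gate touching any of its qubits is an identical copy.

```python
# Toy peephole pass (illustrative only, not tket's algorithm).
SELF_INVERSE = {"H", "X", "CX", "CZ"}

def cancel_adjacent_inverses(gates):
    """Cancel pairs of identical self-inverse gates that are adjacent on their wires."""
    out = []
    for name, qubits in gates:
        # Index of the most recent kept gate that shares a qubit with this one.
        idx = next(
            (i for i in range(len(out) - 1, -1, -1) if set(out[i][1]) & set(qubits)),
            None,
        )
        if idx is not None and name in SELF_INVERSE and out[idx] == (name, qubits):
            del out[idx]  # the pair multiplies to the identity
        else:
            out.append((name, qubits))
    return out

circuit = [
    ("H", (0,)),
    ("CX", (0, 1)),
    ("CX", (1, 2)),
    ("CX", (2, 3)),
    ("CX", (2, 3)),  # cancels with the previous gate
    ("CX", (1, 2)),  # becomes adjacent to the earlier CX(1, 2) and cancels too
]
reduced = cancel_adjacent_inverses(circuit)
print(reduced)
```

Because the search runs over the already-reduced output list, removing an inner pair automatically exposes the outer pair, which is the cascading behaviour a real redundancy-removal pass also relies on.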
Install
pip install pytket pytket-qiskit
pytket is the core package. pytket-qiskit provides converters between tket’s internal Circuit representation and Qiskit’s QuantumCircuit, plus the ability to target IBM backends directly from tket.
Importing a Qiskit Circuit into tket
Start with a Qiskit circuit and convert it to a tket Circuit:
from qiskit import QuantumCircuit
from pytket.extensions.qiskit import qiskit_to_tk, tk_to_qiskit
# Build a Qiskit circuit: a 4-qubit circuit with some redundancy
qc = QuantumCircuit(4)
qc.h(0)
qc.cx(0, 1)
qc.cx(1, 2)
qc.cx(2, 3)
qc.cx(2, 3) # Immediately repeated: cancels to identity
qc.cx(1, 2) # Same: cancels
qc.rz(1.5708, 0) # pi/2 rotation
qc.rz(1.5708, 0) # Two pi/2 rotations = one pi rotation (reducible)
qc.h(2)
qc.cx(1, 2)
qc.h(2) # H on the CX target before and after CX equals CZ: tket can detect this pattern
print("Qiskit circuit:")
print(qc.draw())
print(f"Gate count (Qiskit): {qc.size()}")
print(f"Depth (Qiskit): {qc.depth()}")
# Convert to tket Circuit
tk_circuit = qiskit_to_tk(qc)
print(f"\ntket gate count (before opt): {tk_circuit.n_gates}")
print(f"tket depth (before opt): {tk_circuit.depth()}")
Applying Optimization Passes
tket passes are applied via pass.apply(circuit). Passes can be composed into a SequencePass for multi-stage optimization.
from pytket.passes import (
    SequencePass,
    FullPeepholeOptimise,
    DecomposeBoxes,
    RemoveRedundancies,
    CommuteThroughMultis,
)
# Stage 1: Decompose any high-level box operations into primitive gates
# Stage 2: Remove redundant gate pairs (CX CX = I, etc.)
# Stage 3: Commute single-qubit gates through multi-qubit gates to expose more cancellations
# Stage 4: Full peephole optimization: resynthesise small subcircuits into shorter equivalents
optimisation_pass = SequencePass([
    DecomposeBoxes(),
    RemoveRedundancies(),
    CommuteThroughMultis(),
    FullPeepholeOptimise(),
    RemoveRedundancies(),  # Run again in case the peephole stage created new redundancies
])
# Apply the pass in-place
optimisation_pass.apply(tk_circuit)
print(f"\ntket gate count (after opt): {tk_circuit.n_gates}")
print(f"tket depth (after opt): {tk_circuit.depth()}")
# Convert back to Qiskit to compare
qc_optimised = tk_to_qiskit(tk_circuit)
print(f"\nQiskit gate count (after tket opt): {qc_optimised.size()}")
print(f"Qiskit depth (after tket opt): {qc_optimised.depth()}")
Representative results for the example circuit above (the post-optimization figures can vary with the pytket version):
| Metric | Before | After |
|---|---|---|
| Gate count | 11 | 5 |
| Circuit depth | 9 | 4 |
| Two-qubit gates | 6 | 2 |
RemoveRedundancies cancels the inner CX(2, 3) pair first, which leaves the two surrounding CX(1, 2) gates adjacent so that they cancel as well. The back-to-back Rz rotations are merged into a single Rz. An H-CX-H pattern with the Hadamards on the CX target is equivalent to a CZ, which is either left as-is or compiled to a shorter native sequence depending on the target backend.
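The three identities behind these reductions can be checked numerically. The following is a minimal sketch using only the standard library; the helper names (`matmul`, `kron`, `close`, `rz`) are local to this example and are not part of pytket. It also shows that the H-CX-H identity only yields a CZ when the Hadamards sit on the CX target, not on the control.

```python
import cmath
import math

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def kron(a, b):
    """Kronecker product; the first factor acts on the first qubit."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def close(a, b, tol=1e-9):
    """Element-wise comparison of two matrices."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def rz(theta):
    """Rz rotation by theta radians."""
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]

s = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
I4 = kron(I2, I2)
H = [[s, s], [s, -s]]
# Basis order |control, target>
CX = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
CZ = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]

h_target = kron(I2, H)   # Hadamard on the CX target
h_control = kron(H, I2)  # Hadamard on the CX control

cx_pair_is_identity = close(matmul(CX, CX), I4)
rz_merge_ok = close(matmul(rz(math.pi / 2), rz(math.pi / 2)), rz(math.pi))
target_sandwich_is_cz = close(matmul(h_target, matmul(CX, h_target)), CZ)
control_sandwich_is_cz = close(matmul(h_control, matmul(CX, h_control)), CZ)

print(cx_pair_is_identity, rz_merge_ok, target_sandwich_is_cz, control_sandwich_is_cz)
```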
Gate Count Breakdown
To understand what tket produces in detail, inspect the gate type histogram:
from pytket.circuit import OpType

def gate_count_by_type(circuit):
    """Return a dict mapping each gate type (as a string) to its count."""
    counts = {}
    for cmd in circuit.get_commands():
        op_str = str(cmd.op.type)
        counts[op_str] = counts.get(op_str, 0) + 1
    return counts

before_counts = gate_count_by_type(qiskit_to_tk(qc))  # Re-import original
after_counts = gate_count_by_type(tk_circuit)
print("Gate type breakdown before optimization:")
for gate, count in sorted(before_counts.items()):
    print(f"  {gate}: {count}")
print("\nGate type breakdown after optimization:")
for gate, count in sorted(after_counts.items()):
    print(f"  {gate}: {count}")
Backend-Specific Compilation for IBM Hardware
For real hardware, you also need to decompose gates into the backend’s native gate set and route qubits to respect the hardware connectivity. tket handles both:
from pytket.extensions.qiskit import IBMQBackend
# Connect to an IBM backend (requires saved IBM Quantum credentials)
# backend = IBMQBackend("ibm_brisbane")
# To emulate a device locally with its noise model, use the emulator backend.
# It still needs credentials, because it retrieves the device's properties.
from pytket.extensions.qiskit import IBMQEmulatorBackend
# backend = IBMQEmulatorBackend("ibm_brisbane")
# If you have a backend object:
# compiled_circuit = backend.get_compiled_circuit(tk_circuit, optimisation_level=2)
# optimisation_level 0: only the rebase and routing needed to run, 1: light optimization, 2: heavier optimization
# Without a live backend, show the manual compilation pass for a CX-based IBM gate set.
# Older IBM devices use CX, ID, RZ, SX, X as native gates
# (newer devices such as ibm_brisbane use ECR in place of CX).
from pytket.circuit import OpType
from pytket.passes import (
    AutoRebase,  # RebaseIBM is no longer available in current pytket releases
    DecomposeBoxes,
    FullPeepholeOptimise,
    PlacementPass,
    RemoveRedundancies,
    RoutingPass,
    SequencePass,
)
from pytket.placement import GraphPlacement
from pytket.architecture import Architecture
# Define a simple linear 4-qubit connectivity (like a subset of IBM devices)
arch = Architecture([(0, 1), (1, 2), (2, 3)])
# Build the IBM-targeted compilation sequence
ibm_pass = SequencePass([
    DecomposeBoxes(),
    FullPeepholeOptimise(),
    PlacementPass(GraphPlacement(arch)),
    RoutingPass(arch),
    AutoRebase({OpType.CX, OpType.Rz, OpType.SX, OpType.X}),  # rebase to the target gate set
    RemoveRedundancies(),
])
# Apply on a fresh copy of the original circuit
tk_for_ibm = qiskit_to_tk(qc)
ibm_pass.apply(tk_for_ibm)
print(f"Gate count (IBM compiled): {tk_for_ibm.n_gates}")
print(f"Depth (IBM compiled): {tk_for_ibm.depth()}")
print(f"CX count (IBM compiled): {tk_for_ibm.n_gates_of_type(OpType.CX)}")
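After routing, every two-qubit gate should act on a pair of qubits that the architecture connects. A small sanity check of that property can be sketched in plain Python. The helper `respects_connectivity` is a hypothetical name, and it takes plain qubit-index pairs rather than a tket circuit; the pairs could be collected from the `qubits` of each command returned by `get_commands()`.

```python
def respects_connectivity(two_qubit_gates, edges):
    """Return True if every (q1, q2) pair lies on an undirected edge of the device graph."""
    allowed = {frozenset(edge) for edge in edges}
    return all(frozenset(pair) in allowed for pair in two_qubit_gates)

# The linear 4-qubit architecture used above
line_edges = [(0, 1), (1, 2), (2, 3)]

routed_ok = respects_connectivity([(0, 1), (2, 1), (2, 3)], line_edges)
unrouted_ok = respects_connectivity([(0, 1), (0, 3)], line_edges)
print(routed_ok, unrouted_ok)
```

Using `frozenset` makes the check direction-agnostic, which matches an undirected coupling map. A directed device would need ordered pairs instead.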
Routing and SWAP Insertion
Real quantum hardware has limited qubit connectivity. For instance, in IBM’s heavy-hex layout each qubit is coupled to at most three neighbors. If your circuit requires a two-qubit gate between non-adjacent qubits, the compiler must insert SWAP gates to move qubit states along the device graph.
tket’s routing workflow has three stages:
- Finds an initial qubit placement that minimizes the number of required SWAPs (placement).
- Inserts SWAP gates wherever the circuit requires non-adjacent two-qubit gates (routing).
- Applies further optimizations to cancel SWAPs introduced during routing (cleanup).
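To get a feel for why placement matters, the sketch below estimates the SWAP overhead of a single two-qubit gate as the shortest-path distance between its qubits minus one. This is a deliberately naive model with hypothetical helper names (`shortest_path_length`, `naive_swap_count`); tket's router considers the whole circuit and is considerably more sophisticated.

```python
from collections import deque

def shortest_path_length(edges, src, dst):
    """Breadth-first search distance between two nodes of an undirected graph."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return dist[node]
        for nxt in adjacency.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    raise ValueError("qubits are not connected on this device")

def naive_swap_count(edges, q1, q2):
    """SWAPs needed to bring q1 next to q2 along a shortest path."""
    return max(shortest_path_length(edges, q1, q2) - 1, 0)

line_edges = [(0, 1), (1, 2), (2, 3)]
swaps_adjacent = naive_swap_count(line_edges, 0, 1)
swaps_far = naive_swap_count(line_edges, 0, 3)
print(swaps_adjacent, swaps_far)
```

A good initial placement puts frequently interacting logical qubits on adjacent physical qubits, which keeps this distance, and therefore the number of inserted SWAPs, small.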
The GraphPlacement strategy places logical qubits onto physical qubits by solving a subgraph isomorphism between the circuit’s interaction graph and the device coupling map. For circuits with regular structure (such as linear chains or grids), this placement is near-optimal.
Each SWAP costs 3 CX gates in IBM’s gate set, so routing quality has a significant effect on circuit depth.
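The three-CX cost follows from the standard decomposition SWAP = CX(a, b) · CX(b, a) · CX(a, b). A minimal check with explicit 4×4 permutation matrices, using only plain Python (the variable names are local to this example):

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

# Basis order |q0 q1> = |00>, |01>, |10>, |11>
CX_01 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]  # control q0, target q1
CX_10 = [[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]]  # control q1, target q0
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

three_cx = matmul(CX_01, matmul(CX_10, CX_01))
swap_equals_three_cx = three_cx == SWAP
print(swap_equals_three_cx)
```

On a device whose two-qubit gate is directional, reversing the middle CX may cost additional single-qubit gates, so three CX gates is a lower bound on the cost per SWAP rather than the full price.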
Roundtrip Execution
After tket optimization, convert back to Qiskit for execution on Qiskit-based simulators or IBM hardware:
from qiskit_aer import AerSimulator
from qiskit import transpile
# Convert tket circuit back to Qiskit
qc_from_tket = tk_to_qiskit(tk_circuit)
# Add measurements
qc_from_tket.measure_all()
# Run on Aer
sim = AerSimulator()
compiled = transpile(qc_from_tket, sim)
result = sim.run(compiled, shots=1024).result()
counts = result.get_counts()
print("Measurement results:", counts)
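The roundtrip preserves the circuit’s action for standard gate sets, although gates may come back in a different but equivalent form. tket expresses rotation angles in half-turns (multiples of π) while Qiskit uses radians, so the converters rescale every angle; any difference this introduces is at the level of floating-point rounding.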
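When comparing angles across the two frameworks, keep in mind that pytket writes rotation angles in half-turns (multiples of π) whereas Qiskit uses radians. The sketch below shows the rescaling for the `1.5708` angle used earlier; the two helper functions are hypothetical names for this example, not part of either library.

```python
import math

def radians_to_half_turns(theta):
    """Convert a Qiskit-style angle (radians) to a tket-style angle (half-turns)."""
    return theta / math.pi

def half_turns_to_radians(alpha):
    """Convert a tket-style angle (half-turns) back to radians."""
    return alpha * math.pi

theta = 1.5708  # the literal used in the example circuit, close to pi/2
half_turns = radians_to_half_turns(theta)
round_trip = half_turns_to_radians(half_turns)

print(half_turns)                         # close to, but not exactly, 0.5
print(math.isclose(round_trip, theta))    # agreement up to floating-point rounding
```

Since `1.5708` is only an approximation of π/2, the converted angle is slightly off 0.5 half-turns. Writing `math.pi / 2` in the Qiskit circuit avoids that small discrepancy.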
Comparing tket vs Qiskit Transpiler
For general-purpose optimization of circuits that will run on IBM hardware:
| Feature | tket | Qiskit transpiler |
|---|---|---|
| Framework support | Qiskit, Cirq, PyQuil, more | Qiskit native |
| Peephole optimization | FullPeepholeOptimise | optimization_level=3 |
| Placement and routing | GraphPlacement, LinePlacement, RoutingPass | SABRE, basic |
| Custom pass composition | SequencePass, RepeatPass | PassManager |
| Native gate targets | IBM, IonQ, Quantinuum, more | IBM (primary) |
For circuits that will run on non-IBM hardware, tket is often the better choice because it has first-class support for hardware-native gate sets (such as Quantinuum’s ZZPhase-based gate set or IonQ’s native gates) without needing framework-specific transpilers.
For pure IBM workflows, Qiskit’s transpile with optimization_level=3 and tket’s FullPeepholeOptimise produce comparable results on most benchmarks. The difference tends to be largest for deep circuits with many two-qubit gates.
Summary
| Step | tket API |
|---|---|
| Import from Qiskit | qiskit_to_tk(qc) |
| Remove redundant gates | RemoveRedundancies().apply(c) |
| Full peephole optimization | FullPeepholeOptimise().apply(c) |
| Place qubits on hardware | PlacementPass(GraphPlacement(arch)).apply(c) |
| Route and insert SWAPs | RoutingPass(arch).apply(c) |
| Rebase to IBM gate set | AutoRebase({OpType.CX, OpType.Rz, OpType.SX, OpType.X}).apply(c) |
| Export to Qiskit | tk_to_qiskit(c) |
| Count gates by type | c.n_gates_of_type(OpType.CX) |
tket’s composable pass system makes it straightforward to mix and match optimization strategies and target different hardware backends from the same circuit definition. For teams running experiments across multiple hardware providers, this hardware-agnostic approach can significantly reduce the porting overhead.