Circuit Cutting and Knitting in Qiskit
Use Qiskit's circuit cutting addon (qiskit-addon-cutting) to split large quantum circuits across smaller QPUs using gate and wire cutting techniques.
The Problem: Circuits Too Large for One QPU
NISQ devices have limited qubit counts and constrained connectivity. A circuit requiring 20 qubits cannot run on a 10-qubit device, and even when a device has enough qubits, heavy cross-chip routing can destroy fidelity. Circuit cutting solves this by decomposing a large circuit into smaller subcircuits that run independently on separate devices (or the same device in separate jobs). Classical post-processing then reconstructs the original circuit’s expectation values from the subcircuit results.
The core tradeoff is simple: you reduce quantum resource requirements at the cost of increased classical processing and more total shots. Each cut introduces a multiplicative overhead in the number of circuit executions needed. This makes circuit cutting practical only when the number of cuts is small, typically one to three.
Circuit cutting fits two main use cases:
- Distributing across QPUs. When no single QPU has enough qubits, you split the circuit across two or more devices. Each device runs its subcircuit independently, and you combine results classically.
- Reducing circuit depth. Subcircuits are shallower than the original circuit, which means less accumulated gate error on noisy hardware. Even when a single QPU has enough qubits, cutting can improve fidelity if the depth reduction outweighs the reconstruction overhead.
Wire Cutting vs Gate Cutting: Physical Intuition
Two distinct techniques exist for cutting circuits. Understanding the physical mechanism behind each one clarifies when to use which.
Wire Cutting
Wire cutting inserts a classical communication channel across a quantum wire. Imagine a circuit where qubit 1 in partition A connects to qubit 2 in partition B through a wire. Wire cutting replaces that quantum connection with a measure-communicate-prepare protocol:
- Measure the state on the sender qubit (in partition A) in one of several Pauli bases (X, Y, or Z).
- Classically communicate the measurement outcome.
- Prepare a corresponding state on the receiver qubit (in partition B) based on the measurement result.
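Because a single measurement in one basis cannot capture all the information in a quantum state (quantum states live in a continuous space, but measurements yield discrete outcomes), you must repeat this process across multiple basis choices and combine the results. Specifically, the Move-based wire cut in qiskit-addon-cutting uses eight measurement/preparation configurations per cut: four measurement settings (I, X, Y, Z), each paired with two eigenstate preparations. In the addon, the pairing is done in post-processing, so no measurement result has to reach the other device while the circuits run.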
The key insight is that no quantum communication occurs between partitions. Everything flows through classical channels. The price you pay is that you need many more circuit executions to reconstruct the same statistical precision.
Gate Cutting
Gate cutting takes a different approach. Instead of cutting a wire, you decompose a two-qubit gate (like a CNOT) into a weighted sum of local single-qubit operations, one set acting on partition A and another on partition B. The decomposition looks like this:
CNOT(q1, q2) = sum_i c_i * [U_i(q1)] ⊗ [V_i(q2)]
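Each term in the sum is a pair of local operations (single-qubit gates, possibly combined with a mid-circuit measurement) that can be executed independently on separate partitions. The equality holds at the level of quantum channels rather than unitary matrices. The coefficients c_i can be negative (quasi-probabilities), which is why reconstruction requires a carefully weighted combination of results.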
Gate cutting does not require any physical modification to the circuit’s wiring. It replaces a single two-qubit gate with multiple single-qubit experiments. Like wire cutting, it uses only classical post-processing to combine results.
When to Use Which
Gate cutting is generally preferred when you have a specific two-qubit gate crossing the partition boundary, because its overhead per cut is lower than wire cutting for common gates like CNOT. Wire cutting is useful when the circuit has a natural wire boundary (a qubit whose state flows from one partition to another) or when you need to split the circuit at a point that does not correspond to a single gate.
Quasi-Probability Decomposition: The Math Behind Cutting
The mathematical foundation of circuit cutting is the quasi-probability decomposition (QPD). Understanding QPD explains where the overhead comes from and why it scales the way it does.
The Core Idea
Consider a single wire cut. The identity channel on one qubit (the operation “do nothing, just pass the quantum state through”) can be decomposed as:
I(ρ) = (1/2) * sum_{P in {I, X, Y, Z}} Tr(P ρ) * P
Reading each term from left to right: Tr(P ρ) is obtained by measuring the sender qubit in the eigenbasis of P, and the operator P is rebuilt on the receiver by preparing its eigenstates (|0⟩ and |1⟩ for I and Z, |+⟩ and |-⟩ for X, |+i⟩ and |-i⟩ for Y). Expanding every P into its two eigenstates gives eight measure-and-prepare terms, each with coefficient +1/2 or -1/2. The coefficients are quasi-probabilities: they sum to 1, but some are negative. You cannot interpret them as classical probabilities, but you can use them as weights when combining measurement results.
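The wire-cut identity can be checked numerically. The sketch below is plain Python (no Qiskit) and assumes the standard Pauli-basis form of the decomposition, rho = (1/2) * sum over P in {I, X, Y, Z} of Tr(P rho) * P, which expands into eight measure-and-prepare terms with coefficients of magnitude 1/2. The helper names and the test state are illustrative only.

```python
import cmath
import math

# 2x2 Pauli matrices as nested lists
PAULIS = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}


def pauli_expectation(pauli, rho):
    """Return Tr(P rho) for 2x2 matrices."""
    return sum(pauli[i][k] * rho[k][i] for i in range(2) for k in range(2))


def rebuild_state(rho):
    """Rebuild rho as (1/2) * sum_P Tr(P rho) * P."""
    out = [[0j, 0j], [0j, 0j]]
    for pauli in PAULIS.values():
        expval = pauli_expectation(pauli, rho)  # measured on the sender side
        for i in range(2):
            for j in range(2):
                # operator re-prepared on the receiver side
                out[i][j] += 0.5 * expval * pauli[i][j]
    return out


# Arbitrary pure test state |psi> = a|0> + b|1>
a = math.cos(0.3)
b = cmath.exp(0.7j) * math.sin(0.3)
rho = [[a * a, a * b.conjugate()], [b * a, abs(b) ** 2]]

rebuilt = rebuild_state(rho)
max_error = max(abs(rebuilt[i][j] - rho[i][j]) for i in range(2) for j in range(2))

# Eigenstate expansion: I -> (+1/2, +1/2); X, Y, Z -> (+1/2, -1/2) each
wire_cut_coeffs = [0.5, 0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
gamma = sum(abs(c) for c in wire_cut_coeffs)

print(f"max reconstruction error: {max_error:.2e}")
print(f"sum of coefficients: {sum(wire_cut_coeffs)}, gamma: {gamma}")
```

The rebuilt matrix agrees with the input state up to floating-point rounding. The coefficient list sums to 1 while its absolute values sum to 4, which is where the negative weights show up as extra sampling cost.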
The Overhead Factor: Gamma
The total variation of the quasi-probability distribution defines the overhead factor gamma:
gamma = sum_i |c_i|
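For a single wire cut implemented with the addon's Move instruction, gamma = 4 (eight terms with coefficients of magnitude 1/2). The number of shots required to achieve the same statistical precision as the uncut circuit scales as gamma squared:
shots_needed = gamma^2 * target_shots = 16 * target_shots
This means a single wire cut requires 16 times more shots. (Wire-cutting schemes that use real-time classical communication between the devices can reach gamma = 3, but that is not what the Move instruction implements.) Intuitively, the negative coefficients introduce sign cancellations during reconstruction, which increases the variance of the estimator. You need more samples to beat down that extra variance.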
For k independent cuts, the overhead multiplies:
total_gamma^2 = (gamma_1^2) * (gamma_2^2) * ... * (gamma_k^2)
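The product rule is easy to turn into a helper. This is a minimal sketch in plain Python; the function name is hypothetical, and the example values (gamma = 3 for a CNOT gate cut, gamma = 4 for a Move-based wire cut) are taken from the addon's sampling-overhead reference and treated as inputs here.

```python
import math


def total_sampling_overhead(gammas):
    """Return the product of gamma_i**2 over all cuts (1 if there are no cuts)."""
    return math.prod(g ** 2 for g in gammas)


# One CNOT gate cut, two CNOT gate cuts, and one gate cut plus one wire cut
for label, gammas in [("1 CNOT cut", [3]), ("2 CNOT cuts", [3, 3]), ("CNOT + wire", [3, 4])]:
    print(f"{label}: {total_sampling_overhead(gammas)}x shots")
```

Because the factors multiply, mixing cut types does not help: the most expensive cut sets the pace, and every additional cut scales the whole budget again.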
Gate Cutting Overhead vs Wire Cutting Overhead
Different gates have different QPD decompositions with different gamma values. Here is a comparison:
| Cut Type | Terms | gamma | gamma^2 (shot overhead) |
|---|---|---|---|
| Wire cut (Move) | 8 | 4 | 16x |
| CNOT gate cut | 6 | 3 | 9x |
| CZ gate cut | 6 | 3 | 9x |
| SWAP gate cut | 18 | 7 | 49x |
| RZZ(theta) gate cut | varies | 1 + 2*abs(sin(theta)) | (1 + 2*abs(sin(theta)))^2 |
The practical implication: when a CNOT crosses the partition boundary, gate cutting (9x overhead) is cheaper than wire cutting (16x overhead). Prefer gate cutting when the boundary corresponds to a specific two-qubit gate.
For multiple cuts, the overhead compounds. Two CNOT gate cuts cost 9^2 = 81x shots. Two wire cuts cost 16^2 = 256x shots. Three wire cuts cost 16^3 = 4,096x shots. This exponential scaling is the fundamental limitation of circuit cutting.
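The angle dependence of an RZZ cut can be made concrete. The sketch below assumes the formula gamma = 1 + 2|sin(theta)| listed for RZZ in the addon's sampling-overhead reference; the function names are hypothetical.

```python
import math


def rzz_gamma(theta):
    """Overhead factor gamma for cutting RZZ(theta), assuming gamma = 1 + 2|sin(theta)|."""
    return 1 + 2 * abs(math.sin(theta))


def rzz_shot_overhead(theta):
    """Shot multiplier gamma**2 for a single RZZ(theta) cut."""
    return rzz_gamma(theta) ** 2


for theta in (0.0, 0.1, math.pi / 6, math.pi / 2):
    print(f"theta={theta:.3f}  gamma={rzz_gamma(theta):.3f}  overhead={rzz_shot_overhead(theta):.2f}x")
```

Small rotation angles are cheap to cut, while theta = pi/2 (the angle at which RZZ is locally equivalent to a CNOT) gives the full factor of 9. Weakly entangling gates are therefore the best candidates for cut locations.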
Setup
pip install qiskit-addon-cutting qiskit qiskit-aer
All code in this tutorial requires the Qiskit Circuit Cutting addon. Install it separately from qiskit and qiskit-aer. The addon provides the partition_problem, generate_cutting_experiments, and reconstruct_expectation_values functions that form the core cutting workflow.
Understanding Partition Labels
Before diving into code, let’s clarify how partition labels work. The partition label string maps each qubit index to a named partition. For a 4-qubit circuit with labels "AABB":
- Qubit 0 maps to partition A
- Qubit 1 maps to partition A
- Qubit 2 maps to partition B
- Qubit 3 maps to partition B
Any two-qubit gate that acts across the A-B boundary (for example, cx(1, 2)) gets automatically identified and cut by partition_problem. Gates acting within a single partition (like cx(0, 1) within A or cx(2, 3) within B) remain intact.
Choosing Good Partition Labels
The quality of your partition choice directly affects performance. The goal is to minimize the number of gates crossing partition boundaries, because each cross-boundary gate becomes a cut with multiplicative overhead.
To identify the best partition boundary:
- List all two-qubit gates in your circuit.
- For each possible partition boundary, count how many two-qubit gates cross it.
- Choose the boundary with the fewest crossings.
For example, in a linear chain of CNOT gates on qubits 0-1-2-3, the gate pattern is: cx(0,1), cx(1,2), cx(2,3). Partitioning as "AABB" puts one gate (cx(1,2)) on the boundary. Partitioning as "ABBA" would put two gates on boundaries. The first choice is better.
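The counting procedure can be sketched in a few lines of plain Python. The helpers below are hypothetical (they are not part of the addon): gates are given as pairs of qubit indices, and only contiguous two-way splits are searched.

```python
def count_crossings(two_qubit_gates, partition_labels):
    """Count gates whose two qubits carry different partition labels."""
    return sum(
        1
        for q0, q1 in two_qubit_gates
        if partition_labels[q0] != partition_labels[q1]
    )


def best_contiguous_split(two_qubit_gates, num_qubits, max_size):
    """Try every 'A...AB...B' split that respects max_size; return (labels, crossings).

    Raises ValueError if no split fits within max_size qubits per partition.
    """
    candidates = []
    for k in range(1, num_qubits):
        if k <= max_size and num_qubits - k <= max_size:
            labels = "A" * k + "B" * (num_qubits - k)
            candidates.append((count_crossings(two_qubit_gates, labels), labels))
    crossings, labels = min(candidates)
    return labels, crossings


chain = [(0, 1), (1, 2), (2, 3)]  # cx(0,1), cx(1,2), cx(2,3)
print(count_crossings(chain, "AABB"))  # one gate on the boundary
print(count_crossings(chain, "ABBA"))  # two gates on the boundary
print(best_contiguous_split(chain, num_qubits=4, max_size=2))
```

For circuits whose qubits are not laid out in a line, a contiguous search is too restrictive, and a graph-partitioning heuristic over the qubit interaction graph would be the natural next step.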
Example: Gate Cutting a 4-Qubit Circuit
This complete example creates a 4-qubit circuit, cuts it into two 2-qubit subcircuits using gate cutting, runs the subcircuits, reconstructs the expectation values, and verifies the results against the uncut circuit.
Step 1: Build the Circuit and Define Observables
import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import SparsePauliOp
# Build a 4-qubit circuit with one cross-partition CX gate
circuit = QuantumCircuit(4)
circuit.h(0)
circuit.h(1)
circuit.cx(0, 1) # Within partition A
circuit.cx(1, 2) # Crosses A-B boundary: this gate will be cut
circuit.cx(2, 3) # Within partition B
circuit.ry(0.4, 0)
circuit.ry(0.8, 1)
circuit.ry(1.2, 2)
circuit.ry(1.6, 3)
print("Original circuit:")
print(circuit.draw(output="text"))
# Define observables to measure
observable = SparsePauliOp(["ZZII", "IZZI", "IIZZ"])
Step 2: Partition the Problem
from qiskit_addon_cutting import (
    partition_problem,
    generate_cutting_experiments,
    reconstruct_expectation_values,
)
# Partition: qubits 0,1 -> A, qubits 2,3 -> B
# The cx(1,2) gate crosses the boundary and will be decomposed
partitioned_problem = partition_problem(
    circuit=circuit,
    partition_labels="AABB",
    observables=observable.paulis,
)
subcircuits = partitioned_problem.subcircuits
subobservables = partitioned_problem.subobservables
bases = partitioned_problem.bases
print(f"Number of subcircuits: {len(subcircuits)}")
for label, subcirc in subcircuits.items():
    print(f"\nPartition {label} ({subcirc.num_qubits} qubits):")
    print(subcirc.draw(output="text"))
The partition_problem function returns a PartitionedCuttingProblem named tuple with three fields:
- subcircuits: a dictionary mapping partition labels to their quantum circuits
- subobservables: a dictionary mapping partition labels to the local observables each partition must measure
- bases: the QPD bases used for the decomposition
Step 3: Generate Subcircuit Experiments
# Generate all subcircuit experiments needed for reconstruction
# num_samples=np.inf means exact decomposition (all basis combinations)
subexperiments, coefficients = generate_cutting_experiments(
    circuits=subcircuits,
    observables=subobservables,
    num_samples=np.inf,
)
# Count the experiments
for label, expts in subexperiments.items():
    print(f"Partition {label}: {len(expts)} subcircuit experiments")
print(f"Total experiments: {sum(len(e) for e in subexperiments.values())}")
With num_samples=np.inf, the function generates all possible basis combinations for the QPD. This gives exact reconstruction (up to shot noise) but produces the maximum number of subcircuit experiments. For production use on real hardware with many cuts, you can set num_samples to a finite integer to stochastically sample a subset of basis combinations, trading reconstruction accuracy for fewer experiments.
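To see what stochastic sampling of a QPD means, consider the toy estimator below. It is a sketch in plain Python, not the addon's implementation: the six coefficients of magnitude 1/2 are illustrative (they give gamma = 3, as for a CNOT cut), and the per-term values are made-up numbers standing in for exact subcircuit results. Each sample picks term i with probability |c_i| / gamma and contributes gamma * sign(c_i) * value_i.

```python
import random


def sampled_estimate(coeffs, term_values, num_samples, seed=0):
    """Monte Carlo estimate of sum_i c_i * term_values[i] by sampling QPD terms."""
    rng = random.Random(seed)
    gamma = sum(abs(c) for c in coeffs)
    weights = [abs(c) / gamma for c in coeffs]
    picks = rng.choices(range(len(coeffs)), weights=weights, k=num_samples)
    total = 0.0
    for i in picks:
        sign = 1 if coeffs[i] > 0 else -1
        total += gamma * sign * term_values[i]
    return total / num_samples


coeffs = [0.5, 0.5, 0.5, -0.5, 0.5, -0.5]       # illustrative, gamma = 3
term_values = [0.8, 0.2, 0.5, 0.1, -0.3, 0.4]   # hypothetical per-term results

exact = sum(c * v for c, v in zip(coeffs, term_values))
estimate = sampled_estimate(coeffs, term_values, num_samples=100_000)
print(f"exact weighted sum: {exact:.4f}")
print(f"sampled estimate:   {estimate:.4f}")
```

The estimator is unbiased, so it converges to the exact weighted sum, but each sample is scaled by gamma. That scaling is the source of the gamma^2 growth in variance, and it is why a finite num_samples trades accuracy for fewer experiments.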
Step 4: Run Subcircuit Experiments
from qiskit_aer.primitives import SamplerV2
# Create a sampler for running subcircuit experiments
sampler = SamplerV2()
# Run each partition's experiments
# Each partition's experiments are independent and can be submitted as a batch
results = {
    label: sampler.run(subsystem_subexpts, shots=4096).result()
    for label, subsystem_subexpts in subexperiments.items()
}
Step 5: Reconstruct Expectation Values
# Reconstruct the full expectation values from subcircuit results
reconstructed_expval_terms = reconstruct_expectation_values(
    results,
    coefficients,
    subobservables,
)
# Combine terms weighted by observable coefficients
reconstructed_expval = np.dot(reconstructed_expval_terms, observable.coeffs)
print(f"Reconstructed expectation value: {np.real(reconstructed_expval):.6f}")
Step 6: Verify Against the Full Circuit
from qiskit_aer.primitives import EstimatorV2
# Run the original uncut circuit for comparison
estimator = EstimatorV2()
exact_result = estimator.run([(circuit, observable)]).result()
exact_expval = exact_result[0].data.evs
print(f"Exact expectation value: {exact_expval:.6f}")
print(f"Reconstructed expectation value: {np.real(reconstructed_expval):.6f}")
print(f"Absolute error: {abs(np.real(reconstructed_expval) - exact_expval):.6f}")
For an ideal (noiseless) simulation with sufficient shots, the reconstructed value should match the exact value to within shot noise. If you see large deviations, check that your partition labels correctly identify the cross-boundary gates.
Wire Cutting with the Move Instruction
Wire cutting uses a different mechanism: the Move instruction. This instruction represents the physical operation of transferring a qubit’s state from one register to another, which the cutting toolbox decomposes into measure-prepare pairs during experiment generation.
from qiskit_addon_cutting.instructions import Move
# Create a circuit that explicitly marks a wire cut with Move
qc_wire = QuantumCircuit(5) # 5 qubits: 3 in partition A, 2 in partition B
qc_wire.h(0)
qc_wire.cx(0, 1)
qc_wire.cx(1, 2)
# Move qubit 2's state to qubit 3 (crossing the partition boundary)
qc_wire.append(Move(), [2, 3])
qc_wire.cx(3, 4)
print("Circuit with Move instruction:")
print(qc_wire.draw(output="text"))
# Partition: qubits 0,1,2 -> A, qubits 3,4 -> B
wire_partitioned = partition_problem(
    circuit=qc_wire,
    partition_labels="AAABB",
    observables=SparsePauliOp("IZZII").paulis,
)
The Move instruction tells the cutting toolbox exactly where to insert the wire cut. During experiment generation, it gets replaced by the eight measure-prepare pairs needed for the QPD of the identity channel.
Multiple Cuts: Scaling to Three Partitions
When a circuit needs to be split into more than two pieces, you use multiple cuts. Each additional cut multiplies the shot overhead.
Example: 6-Qubit Circuit with Two Cuts
# A 6-qubit circuit split into three 2-qubit partitions
circuit_6q = QuantumCircuit(6)
circuit_6q.h(range(6))
circuit_6q.cx(0, 1) # Within partition A
circuit_6q.cx(1, 2) # Cut 1: crosses A-B boundary
circuit_6q.cx(2, 3) # Within partition B
circuit_6q.cx(3, 4) # Cut 2: crosses B-C boundary
circuit_6q.cx(4, 5) # Within partition C
circuit_6q.ry(0.3, range(6))
# Three partitions: A (qubits 0,1), B (qubits 2,3), C (qubits 4,5)
partitioned_6q = partition_problem(
    circuit=circuit_6q,
    partition_labels="AABBCC",
    observables=SparsePauliOp(["ZZIIII", "IIZZII", "IIIIZZ"]).paulis,
)
print(f"Partitions: {list(partitioned_6q.subcircuits.keys())}")
for label, subcirc in partitioned_6q.subcircuits.items():
    print(f" Partition {label}: {subcirc.num_qubits} qubits")
Overhead Analysis for Multiple Cuts
With two CNOT gate cuts, the total shot overhead is:
total_overhead = gamma_1^2 * gamma_2^2 = 9 * 9 = 81x
With two wire cuts, the overhead is:
total_overhead = 16 * 16 = 256x
Here is how the overhead scales with the number of CNOT gate cuts (9x each); wire cuts grow faster, at 16x each:
| Number of CNOT gate cuts | Shot overhead | Practical? |
|---|---|---|
| 1 | 9x | Yes |
| 2 | 81x | Yes, with sufficient shot budget |
| 3 | 729x | Marginal, requires large shot budget |
| 4 | 6,561x | Rarely practical |
| 5 | 59,049x | Impractical for most applications |
The rule of thumb: budget for at most 2-3 cuts. Beyond that, the exponential overhead makes reconstruction impractically noisy unless you have access to very large shot budgets (millions of shots per subcircuit experiment).
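The rule of thumb can be turned around: given a shot budget, how many cuts fit? The helper below is a hypothetical sketch in plain Python; overhead_per_cut is the gamma^2 value of the cut type and is supplied by the caller.

```python
def max_affordable_cuts(shot_budget, target_shots, overhead_per_cut):
    """Largest k such that target_shots * overhead_per_cut**k <= shot_budget."""
    cuts = 0
    required = target_shots
    while required * overhead_per_cut <= shot_budget:
        required *= overhead_per_cut
        cuts += 1
    return cuts


budget = 1_000_000   # total shots available
target = 1_000       # shots that would suffice for the uncut circuit
print(max_affordable_cuts(budget, target, overhead_per_cut=9))
print(max_affordable_cuts(budget, target, overhead_per_cut=16))
```

With these illustrative numbers, a one-million-shot budget accommodates three cuts at 9x each but only two at 16x each, which matches the 2-3 cut guideline.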
Subcircuit Parallelization
One of the primary benefits of circuit cutting is that subcircuits can run in parallel on separate QPUs or separate simulator instances. The subcircuit experiments for partition A and partition B are completely independent.
Parallel Execution on Simulators
from concurrent.futures import ThreadPoolExecutor
sampler = SamplerV2()
def run_partition(label_and_experiments):
    label, experiments = label_and_experiments
    result = sampler.run(experiments, shots=4096).result()
    return label, result
# Run all partitions in parallel
with ThreadPoolExecutor(max_workers=len(subexperiments)) as executor:
    futures = executor.map(run_partition, subexperiments.items())
    parallel_results = dict(futures)
Parallel Execution on Multiple QPUs
On real quantum hardware, you submit each partition’s experiments to a different backend:
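from qiskit.transpiler.preset_passmanagers import generate_preset_pass_manager
from qiskit_ibm_runtime import SamplerV2, QiskitRuntimeService
service = QiskitRuntimeService()
# Assign each partition to a different backend
backend_assignments = {
    "A": service.backend("ibm_brisbane"),
    "B": service.backend("ibm_kyoto"),
}
# Submit jobs to separate backends
jobs = {}
for label, experiments in subexperiments.items():
    backend = backend_assignments[label]
    # Real backends only accept circuits transpiled to their native gates and connectivity
    pass_manager = generate_preset_pass_manager(optimization_level=1, backend=backend)
    isa_experiments = pass_manager.run(experiments)
    sampler = SamplerV2(mode=backend)
    jobs[label] = sampler.run(isa_experiments, shots=4096)
# Collect results (jobs run concurrently on different hardware)
results = {label: job.result() for label, job in jobs.items()}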
Because the jobs run concurrently, wall-clock time can drop by a factor of up to the number of partitions (queue times permitting), which can offset some of the shot overhead from cutting.
Classical Reconstruction Overhead
Reconstruction is a purely classical post-processing step that combines subcircuit results using the quasi-probability coefficients. Its computational cost scales as:
classical_operations = O(N_configs * M)
where N_configs is the number of distinct subcircuit configurations (basis combinations) and M is the number of observable terms.
For k CNOT gate cuts (six terms each) using exact decomposition (num_samples=np.inf):
| Cuts (k) | Configs per cut | Total configs | With 5 observables |
|---|---|---|---|
| 1 | 6 | 6 | 30 operations |
| 2 | 6 | 36 | 180 operations |
| 3 | 6 | 216 | 1,080 operations |
| 4 | 6 | 1,296 | 6,480 operations |
Even for 4 cuts, the classical reconstruction amounts to a few thousand multiply-add operations, which is negligible on a modern CPU. The classical overhead is not the bottleneck at this scale. The quantum shot overhead (gamma^2 per cut, so 9^k times more shots for k CNOT gate cuts) is the real cost.
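The O(N_configs * M) cost is visible in a toy version of the reconstruction loop. This sketch is plain Python with made-up numbers. It is not how reconstruct_expectation_values is implemented (the addon works from raw sampler output), but it shows the same weighted-sum structure: for every configuration i and observable term m, multiply the two partitions' local values and weight by c_i.

```python
def reconstruct_expvals(coeffs, expvals_a, expvals_b):
    """Combine per-configuration local expectation values into full ones.

    expvals_a[i][m] is partition A's value for configuration i and
    observable term m; expvals_b is the same for partition B.
    """
    num_terms = len(expvals_a[0])
    result = [0.0] * num_terms
    for c, row_a, row_b in zip(coeffs, expvals_a, expvals_b):  # N_configs iterations
        for m in range(num_terms):                             # M iterations each
            result[m] += c * row_a[m] * row_b[m]
    return result


# Hypothetical two-configuration decomposition with two observable terms
coeffs = [1.5, -0.5]
expvals_a = [[1.0, 0.5], [0.0, 1.0]]
expvals_b = [[0.5, 0.5], [1.0, -1.0]]
print(reconstruct_expvals(coeffs, expvals_a, expvals_b))
```

The inner loop body is a single multiply-add, so even thousands of configurations are cheap compared with the time spent collecting shots.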
Hardware Noise and Circuit Cutting Interaction
On real hardware, circuit cutting creates an interesting tension between two opposing effects.
The Benefit: Shorter Subcircuits
Subcircuits have fewer qubits and lower depth than the original circuit. Fewer two-qubit gates means less accumulated decoherence and gate error. If the original circuit has 80 CNOT gates and the subcircuits each have 35, the per-subcircuit error rate is significantly lower.
The Cost: Noise Amplification During Reconstruction
The quasi-probability coefficients used in reconstruction include negative values. When you multiply noisy results by these coefficients and sum them, the noise gets amplified. Specifically, the variance of the reconstructed expectation value scales as gamma^2 times the per-subcircuit variance. This is the same factor that requires more shots in the noiseless case, but with hardware noise, the effect compounds: you need even more shots to overcome both the QPD variance and the hardware noise.
The Net Effect
Whether circuit cutting helps on noisy hardware depends on the balance:
- Circuit cutting helps when the original circuit is deep (many two-qubit gates), the cross-partition entanglement is weak (few cuts needed), and the hardware has moderate gate error rates. A rule of thumb: cutting is beneficial when the original circuit has more than 50 two-qubit gates and requires only 1-2 cuts.
- Circuit cutting hurts when the original circuit is shallow (few two-qubit gates) or requires many cuts. In these cases, the noise amplification from reconstruction outweighs the benefit of shorter subcircuits.
Practical Guideline
Before committing to circuit cutting on real hardware, run a noise simulation comparing:
- The full circuit on a noisy simulator matching your target backend.
- The cut-and-reconstructed result on the same noisy simulator.
If the cut version has lower error, proceed. If not, consider whether a larger backend or error mitigation alone would be more effective.
Combining Circuit Cutting with Error Mitigation
Circuit cutting works alongside Qiskit’s error mitigation primitives, but the order of operations matters.
Correct Approach: Mitigate Per Subcircuit, Then Reconstruct
Apply error mitigation (such as zero-noise extrapolation, ZNE) to each subcircuit independently, then feed the mitigated results into reconstruct_expectation_values. This is correct because each subcircuit is a self-contained circuit with its own noise profile.
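from qiskit_ibm_runtime import EstimatorV2
# Configure ZNE for subcircuit execution
# `backend` is assumed to be an IBM backend obtained from QiskitRuntimeService
estimator = EstimatorV2(mode=backend)
estimator.options.resilience_level = 2 # Enables ZNE
# Use the resilience-enabled Estimator for each subcircuit
# The mitigated results then go into reconstruct_expectation_values
# Note: the reconstruction examples in this tutorial pass Sampler results,
# so Estimator output would need to be adapted to that format first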
Incorrect Approach: Mitigate After Reconstruction
Do not apply ZNE to the reconstructed expectation value. The reconstructed value is a weighted linear combination of subcircuit results, and applying noise extrapolation to this combination does not correctly account for the quasi-probability structure. The result would be meaningless.
Entanglement Forging: A Related Technique
Entanglement forging is a specialized circuit cutting technique for circuits where the quantum state can be written as a Schmidt decomposition with a small number of terms:
|psi⟩ = sum_i lambda_i * |phi_i⟩_A ⊗ |chi_i⟩_B
When the Schmidt rank is low (few terms in the sum), entanglement forging can reconstruct expectation values with lower overhead than general circuit cutting. Instead of decomposing arbitrary gates, it exploits the product-state structure directly.
The tradeoff: entanglement forging requires the circuit to produce a state with this specific structure, which limits its applicability. For circuits with high entanglement across the partition boundary, general gate or wire cutting is the only option.
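For the smallest possible case, one qubit on each side of the boundary, the Schmidt coefficients can be computed in closed form. The sketch below is plain Python and purely illustrative: it treats the amplitudes [a00, a01, a10, a11] as a 2x2 matrix and uses the facts that the squared Schmidt coefficients sum to the squared norm and multiply to |det|^2. The function names and tolerance are assumptions.

```python
import math


def schmidt_coefficients(amplitudes):
    """Schmidt coefficients (lambda_1 >= lambda_2) of a two-qubit pure state.

    amplitudes = [a00, a01, a10, a11], first index for partition A.
    """
    a00, a01, a10, a11 = amplitudes
    norm_sq = sum(abs(x) ** 2 for x in amplitudes)   # lambda_1^2 + lambda_2^2
    det_sq = abs(a00 * a11 - a01 * a10) ** 2         # (lambda_1 * lambda_2)^2
    disc = math.sqrt(max(norm_sq ** 2 - 4 * det_sq, 0.0))
    lam1 = math.sqrt((norm_sq + disc) / 2)
    lam2 = math.sqrt(max((norm_sq - disc) / 2, 0.0))
    return lam1, lam2


def schmidt_rank(amplitudes, tol=1e-6):
    """Number of Schmidt coefficients larger than tol."""
    return sum(1 for lam in schmidt_coefficients(amplitudes) if lam > tol)


s = 1 / math.sqrt(2)
bell = [s, 0, 0, s]        # (|00> + |11>) / sqrt(2)
product = [s, 0, s, 0]     # |+>_A |0>_B

print(schmidt_coefficients(bell), schmidt_rank(bell))
print(schmidt_coefficients(product), schmidt_rank(product))
```

The product state has a single non-zero coefficient, so nothing needs to be forged across the boundary, while the Bell state has two equal coefficients, the maximum for one qubit per side. For larger partitions the same quantities come from a singular value decomposition of the amplitude matrix.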
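Entanglement forging is not part of the qiskit-addon-cutting package. It was available in earlier releases of the Circuit Knitting Toolbox, the project from which the addon originated; see that project's documentation for implementation details and examples.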
Verification Workflow: Comparing Cut vs Uncut Results
Always verify your cutting implementation before deploying on real hardware. The procedure is straightforward:
import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import SparsePauliOp
from qiskit_aer.primitives import SamplerV2, EstimatorV2
from qiskit_addon_cutting import (
    partition_problem,
    generate_cutting_experiments,
    reconstruct_expectation_values,
)
# 1. Build circuit and observable
qc = QuantumCircuit(4)
qc.h(0)
qc.cx(0, 1)
qc.cx(1, 2)
qc.cx(2, 3)
qc.rz(0.7, range(4))
observable = SparsePauliOp(["ZZII", "IIZZ", "ZIZI"])
# 2. Run the full circuit (reference)
estimator = EstimatorV2()
exact_expval = estimator.run([(qc, observable)]).result()[0].data.evs
print(f"Exact expectation value: {exact_expval:.6f}")
# 3. Run the cut circuit
partitioned = partition_problem(
    circuit=qc,
    partition_labels="AABB",
    observables=observable.paulis,
)
subexperiments, coefficients = generate_cutting_experiments(
    circuits=partitioned.subcircuits,
    observables=partitioned.subobservables,
    num_samples=np.inf,
)
sampler = SamplerV2()
results = {
    label: sampler.run(expts, shots=8192).result()
    for label, expts in subexperiments.items()
}
reconstructed_terms = reconstruct_expectation_values(
    results, coefficients, partitioned.subobservables
)
reconstructed_expval = np.dot(reconstructed_terms, observable.coeffs)
# 4. Compare
print(f"Reconstructed: {np.real(reconstructed_expval):.6f}")
print(f"Exact: {exact_expval:.6f}")
deviation = abs(np.real(reconstructed_expval) - exact_expval)
print(f"Deviation: {deviation:.6f}")
# For ideal simulation with 8192 shots, deviation should be < 0.05
assert deviation < 0.1, f"Deviation too large: {deviation}"
print("Verification passed.")
If verification fails, check:
- That your partition labels correctly capture the cross-boundary gates
- That you passed observables=observable.paulis (a PauliList) to partition_problem, not the SparsePauliOp itself
- That all subcircuit experiments completed successfully
- That your shot count is sufficient (increase shots to reduce statistical noise)
Common Mistakes
1. Cutting High-Entanglement Boundaries
Placing cuts on boundaries with many crossing gates does not reduce quantum resource requirements efficiently. If four CNOT gates cross the A-B boundary, you need four cuts, giving an overhead of 9^4 = 6,561x for gate cuts; four wire cuts would cost 16^4 = 65,536x. Instead, rearrange your partition labels to minimize cross-boundary gates, or restructure the circuit so that entanglement across the boundary is concentrated in fewer gates.
2. Using num_samples=np.inf on Real Hardware
Setting num_samples=np.inf generates all possible basis combinations for exact QPD. This is correct for simulation-based verification, but on real hardware with finite shot budgets, you often want to set num_samples to a finite integer. The function then draws that many samples from the quasi-probability distribution, picking each basis combination with probability proportional to the magnitude of its coefficient. This reduces the number of subcircuit experiments at the cost of some reconstruction accuracy.
# For simulation/verification: exact decomposition
subexperiments, coefficients = generate_cutting_experiments(
    circuits=subcircuits,
    observables=subobservables,
    num_samples=np.inf,
)
# For real hardware: stochastic sampling with fewer experiments
subexperiments, coefficients = generate_cutting_experiments(
    circuits=subcircuits,
    observables=subobservables,
    num_samples=1000, # Sample 1000 basis combinations
)
3. Underestimating the Shot Budget
The overhead from cutting is multiplicative. If you want the precision that 4,096 shots would give on the uncut circuit and you have one CNOT gate cut (gamma^2 = 9), the total shot count across all subcircuit experiments becomes:
required_total_shots = 9 * 4096 = 36,864
For two CNOT gate cuts:
required_total_shots = 81 * 4096 = 331,776
Forgetting to scale the shot count by gamma^2 gives a reconstructed expectation value with much higher variance than expected. The result may look wrong, but it is simply undersampled.
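A shot budget can also be derived from a target precision. The sketch below uses a worst-case bound: each single-shot estimate of a Pauli observable under a QPD has magnitude at most gamma, so the standard error after N shots is at most gamma / sqrt(N). The function name is hypothetical, and gamma = 3 per CNOT gate cut is an input assumption.

```python
import math


def shots_for_precision(target_stderr, gammas=()):
    """Worst-case total shots so that gamma_total / sqrt(N) <= target_stderr."""
    gamma_total = math.prod(gammas)  # equals 1 when there are no cuts
    return math.ceil((gamma_total / target_stderr) ** 2)


target = 1 / 64  # standard error of about 0.0156
print(shots_for_precision(target))              # uncut circuit
print(shots_for_precision(target, [3]))         # one CNOT gate cut
print(shots_for_precision(target, [3, 3]))      # two CNOT gate cuts
```

A target of 1/64 corresponds to 4,096 shots without cuts, and each CNOT gate cut multiplies that figure by 9. Actual variance is often lower than the bound, so these numbers are conservative.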
4. Applying Error Mitigation at the Wrong Level
As discussed in the error mitigation section: apply ZNE or other mitigation techniques to each subcircuit experiment individually, before passing results to reconstruct_expectation_values. Applying mitigation after reconstruction produces incorrect results because the quasi-probability weighting interacts nonlinearly with the extrapolation.
When to Use Circuit Cutting (Decision Framework)
Use circuit cutting when:
- Your circuit exceeds the qubit count of any available QPU.
- Your circuit’s cross-partition entanglement is low (1-2 gates cross the boundary).
- You have access to multiple QPUs and want to parallelize.
- The depth reduction from cutting significantly improves subcircuit fidelity.
Avoid circuit cutting when:
- A QPU with enough qubits is available and the circuit depth is manageable.
- The circuit requires more than 3 cuts (overhead becomes impractical).
- The circuit has dense cross-partition entanglement with no clean boundary.
Always benchmark first: run both the full circuit and the cut version on a noisy simulator, then compare accuracy. Circuit cutting is a tool, not a universal improvement.
Key Points
- Circuit cutting splits large circuits into smaller subcircuits using gate cutting or wire cutting, both relying on quasi-probability decomposition.
- Gate cutting a CNOT costs 9x shots per cut. A Move-based wire cut costs 16x shots per cut. Overhead compounds exponentially with the number of cuts.
- Choose partition boundaries that minimize cross-boundary two-qubit gates.
- Subcircuits run independently and can be parallelized across QPUs.
- Apply error mitigation per subcircuit before reconstruction.
- Verify your implementation by comparing cut vs uncut results on a noiseless simulator.
- For circuits with low Schmidt rank across the partition boundary, entanglement forging offers lower overhead than general cutting.
- Practical limit: 1-3 cuts. Beyond that, the shot overhead dominates any benefit.