Getting Started with Superstaq

Superstaq is a cross-platform quantum compiler and optimizer from Infleqtion (formerly ColdQuanta). Instead of writing separate code for each hardware provider, you write one circuit and Superstaq compiles it to the native gate set of whatever backend you target. It supports IonQ, Quantinuum, IBM, Rigetti, AQT, and Infleqtion’s own neutral atom hardware.

This tutorial walks through installation, authentication, running a Bell state on the IonQ simulator, and then submitting the same circuit to Quantinuum’s emulator to see what hardware-aware compilation actually does. Along the way, you will learn how to use Superstaq with both Qiskit and Cirq, compare compiled circuits across multiple backends, build more complex circuits like GHZ states, and develop a strategy for choosing the right hardware for your workload.

Why Cross-Platform Compilation Matters

Quantum hardware is fragmented. Each manufacturer builds qubits using different physics, and each platform exposes a different set of native gates. If you write a circuit using standard textbook gates (H, CNOT, T, S), the hardware cannot execute those gates directly. Every circuit must be decomposed into the specific gate set that the target hardware supports.

Here is a concrete look at the problem. Four major quantum hardware platforms each expose a different native gate set:

IonQ uses trapped ytterbium ions. The native gates are:

GPi(phi): a single-qubit rotation around an axis in the XY plane, parameterized by the angle phi
GPi2(phi): a half-rotation (pi/2) around the same axis
MS: the Mølmer-Sørensen gate, which is IonQ’s native two-qubit entangling operation

Quantinuum uses trapped ytterbium ions (with barium ions for sympathetic cooling). The native gates are:

Rz(theta): rotation around the Z axis
PhasedX(theta, phi) (also called U1q): a rotation in the XY plane with a phase
ZZPhase(theta): the native two-qubit interaction, which applies a ZZ rotation

IBM uses superconducting transmon qubits. The native gates are:

RZ(theta): Z rotation (implemented virtually, zero error)
SX: the square root of X (a pi/2 rotation around X)
X: a pi rotation around X
CZ: the controlled-Z gate, the native two-qubit operation on IBM’s current Heron processors (the retired Eagle generation used the ECR gate instead)

Rigetti also uses superconducting qubits, but with a different gate set:

RZ(theta): Z rotation
RX(theta): X rotation
CZ: the controlled-Z gate, Rigetti’s native two-qubit operation

Consider a simple Bell state circuit. In textbook notation it requires just two gates: a Hadamard (H) and a CNOT. But none of these platforms offers H or CNOT as a native operation. The same logical circuit looks very different once compiled for each target:

On IonQ, the H gate becomes a sequence of GPi and GPi2 rotations, and the CNOT becomes a combination of GPi2 gates wrapped around an MS gate. On IBM, the H decomposes into RZ and SX gates, while the CNOT becomes a CZ gate with surrounding single-qubit rotations. On Quantinuum, the H becomes Rz and PhasedX operations, and the CNOT becomes a ZZPhase gate flanked by single-qubit rotations. On Rigetti, the H decomposes into RZ and RX, and the CNOT must be rewritten as a CZ gate with surrounding rotations.

Without a cross-platform compiler, you would need to:

Learn each platform’s native gate set
Write (or find) decomposition rules for every standard gate into every native gate set
Handle qubit connectivity constraints (superconducting chips have limited connectivity, requiring SWAP gates)
Stay current as calibration data changes and gate sets evolve

Superstaq handles all of this. You write one circuit using standard gates, pick a target backend, and Superstaq produces an optimized circuit using the correct native gates with routing and scheduling tailored to that specific hardware.

Superstaq Architecture

Understanding what happens inside Superstaq helps you use it more effectively. The compilation pipeline has four main stages:

1. Circuit ingestion. Superstaq accepts circuits in Qiskit (QuantumCircuit) or Cirq (cirq.Circuit) format. You do not need to learn a new circuit description language. The provider classes (qiskit_superstaq.SuperstaqProvider and cirq_superstaq.Service) handle serialization and transmission to the Superstaq cloud service.

2. Internal intermediate representation (IR). The input circuit is converted to Superstaq’s internal IR. This is a hardware-agnostic representation that preserves the logical structure of your circuit while enabling analysis and transformation.

3. Hardware-specific optimization passes. This is where Superstaq adds the most value. The optimizer applies multiple passes:

Gate fusion: adjacent single-qubit gates are combined into a single rotation, reducing gate count and circuit depth.
Native gate decomposition: each gate is rewritten using the target’s native gate set. Superstaq uses mathematically optimal (or near-optimal) decompositions rather than generic textbook formulas.
Qubit routing: for hardware with limited connectivity (like IBM’s heavy-hex topology or Rigetti’s lattice), Superstaq inserts SWAP gates to move qubit states to physically adjacent positions. It uses calibration data to choose routes through the highest-fidelity connections.
Scheduling: gates are ordered to minimize idle time and reduce decoherence. On platforms where T1 and T2 times vary across qubits, Superstaq accounts for this.

4. Output. The optimized circuit is returned in the same format you submitted (Qiskit or Cirq). If you submitted the circuit for execution, Superstaq forwards it to the target hardware’s API and returns a job handle.
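The gate fusion pass from stage 3 can be illustrated with plain matrix arithmetic. The sketch below is not Superstaq’s implementation; `fuse`, `matmul2`, `rz`, and `rx` are hypothetical helpers written for this example. It multiplies a run of adjacent single-qubit gates into one 2x2 unitary, which a compiler could then emit as a single native rotation.

```python
import cmath
import math

def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rz(theta):
    """Z rotation: diag(e^{-i theta/2}, e^{+i theta/2})."""
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]

def rx(theta):
    """X rotation by angle theta."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]

def fuse(gates):
    """Fuse gates listed in time order into one matrix.

    Later gates act after earlier ones, so they multiply on the left.
    """
    total = [[1, 0], [0, 1]]
    for gate in gates:
        total = matmul2(gate, total)
    return total

def max_deviation(a, b):
    """Largest element-wise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

# Three adjacent Z rotations collapse into one Rz with the summed angle.
fused_z = fuse([rz(0.3), rz(0.8), rz(-0.1)])
z_error = max_deviation(fused_z, rz(1.0))

# The same holds for rotations about X.
fused_x = fuse([rx(0.5), rx(0.7)])
x_error = max_deviation(fused_x, rx(1.2))

print(f"Rz fusion error: {z_error:.1e}, Rx fusion error: {x_error:.1e}")
```

Rotations about different axes also fuse into a single unitary, but recovering the native rotation angles from it takes an extra decomposition step that this sketch leaves out.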

A key differentiator is that Superstaq uses live calibration data from each backend. Hardware characteristics drift over time: gate error rates change, certain qubit pairs become noisier, and T1/T2 times fluctuate. Superstaq pulls recent calibration snapshots and factors them into routing and scheduling decisions. This means a circuit compiled today may differ from the same circuit compiled next week, because the optimizer adapts to current hardware conditions.
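To make the idea of calibration-aware routing concrete, here is a toy model. It is only a sketch under stated assumptions: the coupling map and fidelities are invented, `best_path` is a hypothetical helper, and Superstaq’s actual router is not documented here. Maximizing the product of edge fidelities is the same as minimizing the sum of -log(fidelity), so ordinary Dijkstra search applies.

```python
import heapq
import math

def best_path(edges, start, goal):
    """Return (path, fidelity) maximizing the product of edge fidelities."""
    graph = {}
    for (a, b), fid in edges.items():
        cost = -math.log(fid)  # high fidelity -> low cost
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path, math.exp(-cost)
        if node in visited:
            continue
        visited.add(node)
        for neighbor, step in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + step, neighbor, path + [neighbor]))
    return None, 0.0

# Made-up two-qubit fidelities on a 5-qubit ring. NOT real calibration data.
calibration = {
    (0, 1): 0.990, (1, 2): 0.950,                  # short route with one noisy edge
    (0, 4): 0.995, (4, 3): 0.995, (3, 2): 0.995,   # longer but cleaner route
}

path, fidelity = best_path(calibration, 0, 2)
print(path, round(fidelity, 4))
```

With these invented numbers the longer route wins because the noisy edge costs more than the extra hop. A real router also has to account for the gates each hop costs, which this model ignores.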

What Superstaq Does Differently

Most quantum SDKs give you a hardware-agnostic circuit model, then each provider handles compilation separately. The result is often mediocre: generic decompositions that work but are not optimal for the target hardware.

Superstaq sits between your circuit and the hardware. It knows the native gate sets, connectivity, and calibration characteristics of each backend, and it uses that knowledge to compile circuits that run faster and with fewer errors than a generic compiler would produce.

The Qiskit and Cirq integrations mean you do not have to learn a new circuit language. You write circuits the way you already do, then use Superstaq to target and optimize.

Installation

pip install qiskit-superstaq

If you prefer working in Cirq:

pip install cirq-superstaq

You can install both side by side if you want to use both frameworks:

pip install qiskit-superstaq cirq-superstaq

Each package pulls in its framework (Qiskit or Cirq) and the shared general-superstaq client as dependencies.

This tutorial uses the Qiskit integration for most examples and includes a dedicated Cirq section so you can see both workflows.

Getting an API Key

Sign up at superstaq.infleqtion.com to get an API key. Once you have it, set it as an environment variable so you do not hard-code credentials in your scripts:

export SUPERSTAQ_API_KEY="your-api-key-here"

Or set it in Python before creating the provider:

import os
os.environ["SUPERSTAQ_API_KEY"] = "your-api-key-here"

For persistent use, add the export line to your shell profile (~/.bashrc, ~/.zshrc, or similar) so it is available in every terminal session.
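A missing key tends to surface later as a confusing authentication error, so it can help to check for it up front. The helpers below are hypothetical (they are not part of any Superstaq package) and use only the standard library; the variable name SUPERSTAQ_API_KEY is the one used above.

```python
import os

def require_api_key(var_name="SUPERSTAQ_API_KEY"):
    """Return the key from the environment, or raise a readable error."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Export it in your shell before running this script."
        )
    return key

def masked(key):
    """Hide all but the last four characters so the key is safer to log."""
    return "*" * max(len(key) - 4, 0) + key[-4:]
```

A script would call `masked(require_api_key())` once at startup and print the result, which confirms the key is present without exposing it.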

Creating a Provider

import qiskit_superstaq

provider = qiskit_superstaq.SuperstaqProvider()

# See all available backends
backends = provider.backends()
for b in backends:
    print(b.name)  # name is a property on current Qiskit backends; older releases used b.name()

You will see a list that includes IonQ hardware and simulators, Quantinuum hardware and emulators, IBM backends, Rigetti, AQT, and Infleqtion’s own neutral atom machines.

Understanding Native Gate Sets

Before running circuits through Superstaq, it helps to understand what native gates look like on each platform. This section shows how the same Bell state (the simplest entangled circuit) is represented in each hardware’s native language.

IonQ Native Gates

IonQ trapped-ion systems use three native gates:

GPi(phi): rotates a single qubit by pi radians around an axis in the XY plane at angle phi. Mathematically: GPi(phi) = [[0, e^(-i*phi)], [e^(i*phi), 0]].
GPi2(phi): rotates a single qubit by pi/2 radians around the same axis. This is the “half rotation” gate.
MS: the Mølmer-Sørensen gate, a maximally entangling two-qubit gate native to trapped-ion systems. It creates entanglement by coupling the ions’ motional and electronic states using laser pulses.

A Bell state on IonQ looks approximately like this in native gates (the exact decomposition depends on calibration):

q_0: ──GPi2(0)───●MS●──GPi2(π/2)──
                  │
q_1: ────────────●MS●──GPi2(π/2)──

The Hadamard becomes a GPi2 rotation, and the CNOT is rewritten as an MS gate surrounded by single-qubit corrections.
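The relationship between the two single-qubit gates can be checked numerically. The sketch below uses the GPi matrix given above; the GPi2 matrix is not stated in this tutorial and is taken from IonQ’s published definition, so treat it as an assumption. It confirms that two GPi2 rotations about the same axis equal one GPi up to a global phase.

```python
import cmath
import math

def gpi(phi):
    """GPi(phi): pi rotation about an axis in the XY plane at angle phi."""
    return [[0, cmath.exp(-1j * phi)], [cmath.exp(1j * phi), 0]]

def gpi2(phi):
    """GPi2(phi): pi/2 rotation about the same axis (assumed IonQ convention)."""
    s = 1 / math.sqrt(2)
    return [[s, -1j * s * cmath.exp(-1j * phi)],
            [-1j * s * cmath.exp(1j * phi), s]]

def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def equal_up_to_phase(a, b, tol=1e-9):
    """True if a == p * b for some complex p with |p| = 1."""
    for i in range(2):
        for j in range(2):
            if abs(b[i][j]) > tol:
                p = a[i][j] / b[i][j]
                return abs(abs(p) - 1) < tol and all(
                    abs(a[r][c] - p * b[r][c]) < tol
                    for r in range(2) for c in range(2))
    return False

phi = 0.4
two_halves = matmul2(gpi2(phi), gpi2(phi))
halves_make_whole = equal_up_to_phase(two_halves, gpi(phi))

pauli_x = [[0, 1], [1, 0]]
gpi_zero_is_x = equal_up_to_phase(gpi(0.0), pauli_x)

print(halves_make_whole, gpi_zero_is_x)
```

The second check shows that GPi(0) is the familiar Pauli X gate, which gives a useful anchor for reading native IonQ circuits.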

Quantinuum Native Gates

Quantinuum’s trapped-ion (ytterbium) systems use:

Rz(theta): rotation around the Z axis by angle theta.
PhasedX(theta, phi) (U1q): a single-qubit rotation that combines X and Y rotations with a phase parameter.
ZZPhase(theta): the native two-qubit interaction that applies exp(-i * (theta/2) * Z⊗Z) with theta in radians (some toolkits express the angle in half-turns instead). This is a continuous-angle gate, meaning theta can take any value.

A Bell state on Quantinuum uses a ZZPhase gate for entanglement, with Rz and PhasedX gates handling the single-qubit portion.
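As a worked check of how a ZZ interaction can stand in for a textbook entangling gate, the sketch below shows that ZZPhase(pi/2) followed by Rz(-pi/2) on each qubit equals CZ up to a global phase. It assumes the convention ZZPhase(theta) = exp(-i * (theta/2) * Z⊗Z) with theta in radians; the function names are hypothetical. All the matrices involved are diagonal, so only the diagonals are tracked.

```python
import cmath
import math

def zzphase_diag(theta):
    """Diagonal of exp(-i * theta/2 * Z⊗Z), ordered |00>, |01>, |10>, |11>."""
    zz_eigenvalues = [1, -1, -1, 1]
    return [cmath.exp(-1j * theta / 2 * z) for z in zz_eigenvalues]

def rz_rz_diag(alpha):
    """Diagonal of Rz(alpha)⊗Rz(alpha), with Rz = diag(e^{-i a/2}, e^{+i a/2})."""
    single = [cmath.exp(-1j * alpha / 2), cmath.exp(1j * alpha / 2)]
    return [single[a] * single[b] for a in range(2) for b in range(2)]

entangler = zzphase_diag(math.pi / 2)
correction = rz_rz_diag(-math.pi / 2)
combined = [c * z for c, z in zip(correction, entangler)]

# Divide out the global phase so the first entry is exactly 1.
normalized = [x / combined[0] for x in combined]
cz_like = [round(x.real) for x in normalized]
print(cz_like)  # [1, 1, 1, -1], the diagonal of CZ
```

A CNOT then follows by wrapping the CZ in Hadamard-equivalent rotations on the target qubit, which is where the Rz and PhasedX gates come in.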

IBM Native Gates

IBM superconducting transmon systems use:

RZ(theta): virtual Z rotation (applied by adjusting the phase of subsequent pulses, so it has zero error).
SX: the square root of X gate, a pi/2 rotation around the X axis.
X: a pi rotation around X.
CZ: the controlled-Z gate, the native two-qubit entangling operation on IBM’s current Heron processors. (The retired Eagle generation used ECR; CNOT has not been a native gate on IBM hardware for several generations.)

A Bell state on IBM requires rewriting the CNOT as a CZ with Hadamard-equivalent rotations on the target:

q_0: ──RZ(π/2)──SX──RZ(π/2)──●──
                              │
q_1: ──RZ(π/2)──SX──RZ(π/2)──CZ──RZ(π/2)──SX──RZ(π/2)──

The Hadamard decomposes into RZ(π/2) - SX - RZ(π/2), and the CNOT becomes a CZ sandwiched between those Hadamard-equivalent sequences.
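The Hadamard decomposition can be verified with a few lines of matrix arithmetic. This is a standalone check using only the standard library; the helper names are made up for the example. It multiplies RZ(π/2), SX, and RZ(π/2) and confirms the product equals H up to a global phase, which has no physical effect.

```python
import cmath
import math

def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rz(theta):
    """Z rotation: diag(e^{-i theta/2}, e^{+i theta/2})."""
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]

# Square root of X.
SX = [[(1 + 1j) / 2, (1 - 1j) / 2],
      [(1 - 1j) / 2, (1 + 1j) / 2]]

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]

def global_phase_between(a, b, tol=1e-9):
    """Return p with a == p * b and |p| = 1, or None if no such p exists."""
    for i in range(2):
        for j in range(2):
            if abs(b[i][j]) > tol:
                p = a[i][j] / b[i][j]
                ok = abs(abs(p) - 1) < tol and all(
                    abs(a[r][c] - p * b[r][c]) < tol
                    for r in range(2) for c in range(2))
                return p if ok else None
    return None

half_pi = math.pi / 2
sequence = matmul2(rz(half_pi), matmul2(SX, rz(half_pi)))
phase = global_phase_between(sequence, H)
print("matches H up to global phase:", phase is not None)
```

Because the first and last gates are the same RZ(π/2), the sequence reads the same in time order and in matrix order, so there is no ambiguity about which convention the diagram uses.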

Rigetti Native Gates

Rigetti superconducting systems use:

RZ(theta): Z rotation.
RX(theta): X rotation. Rigetti’s published native set calibrates RX at multiples of pi/2; other angles are built from these plus RZ.
CZ: the controlled-Z gate, Rigetti’s native two-qubit operation.

A Bell state on Rigetti requires converting the CNOT into a CZ with surrounding Hadamard-equivalents:

q_0: ──RZ(π/2)──RX(π/2)──RZ(π/2)──●───
                                    │
q_1: ──RZ(π/2)──RX(π/2)──RZ(π/2)──CZ──RZ(π/2)──RX(π/2)──RZ(π/2)──

The point is clear: the same two-gate textbook circuit becomes a very different sequence of physical operations on each platform. Superstaq handles all of these translations automatically.
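The CZ-based rewrite used in the IBM and Rigetti diagrams rests on one identity: a CNOT equals a CZ with a Hadamard on the target before and after. The sketch below checks that identity with 4x4 matrices, using only the standard library; `kron` and `matmul` are small helpers written for this example.

```python
import math

def kron(a, b):
    """Kronecker product of two 2x2 matrices (first factor is the first qubit)."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

s = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
H = [[s, s], [s, -s]]
CZ = [[1, 0, 0, 0],
      [0, 1, 0, 0],
      [0, 0, 1, 0],
      [0, 0, 0, -1]]
# Control is the first qubit, target is the second.
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

h_on_target = kron(I2, H)
built = matmul(h_on_target, matmul(CZ, h_on_target))
max_diff = max(abs(built[i][j] - CNOT[i][j]) for i in range(4) for j in range(4))
print(f"max deviation from CNOT: {max_diff:.1e}")
```

Substituting each platform’s three-gate Hadamard equivalent for H gives the native sequences drawn above.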

Your First Circuit: Bell State

Build a Bell state using standard Qiskit. Nothing Superstaq-specific yet.

from qiskit import QuantumCircuit

qc = QuantumCircuit(2, 2)
qc.h(0)         # Hadamard on qubit 0
qc.cx(0, 1)     # CNOT: entangle qubits 0 and 1
qc.measure([0, 1], [0, 1])

print(qc.draw())
#      ┌───┐     ┌─┐
# q_0: ┤ H ├──■──┤M├───
#      └───┘┌─┴─┐└╥┘┌─┐
# q_1: ─────┤ X ├─╫─┤M├
#           └───┘ ║ └╥┘
# c: 2/═══════════╩══╩═
#                 0  1

This is a plain Qiskit circuit. You can simulate it locally with Qiskit’s AerSimulator if you want, but the goal here is to run it on real (or simulated) hardware via Superstaq.

Submitting to the IonQ Simulator

import qiskit_superstaq

provider = qiskit_superstaq.SuperstaqProvider()
backend = provider.get_backend("ionq_ion_simulator")

job = backend.run(qc, shots=1024)
print("Job ID:", job.job_id())

# Block until the job completes
result = job.result()
counts = result.get_counts()
print(counts)
# {'00': 512, '11': 512}  -- approximate 50/50 split

Superstaq automatically compiles the circuit for IonQ before sending it. IonQ’s native gate set does not include H or CX directly. Superstaq rewrites those into GPi, GPi2, and MS gates (IonQ’s actual native operations) before submission.
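The counts will rarely be exactly 512 and 512, because each shot is a random draw. A quick way to judge whether a result is consistent with an ideal Bell state is to compare the deviation against the binomial standard deviation. This is a minimal sketch for an ideal (noiseless) simulator; `bell_counts_plausible` is a hypothetical helper and the four-sigma threshold is an arbitrary choice.

```python
import math

def bell_counts_plausible(counts, n_sigma=4.0):
    """Check counts against the ideal 50/50 split between '00' and '11'."""
    shots = sum(counts.values())
    # On an ideal simulator, '01' and '10' should never appear.
    stray = sum(v for k, v in counts.items() if k not in ("00", "11"))
    sigma = math.sqrt(shots * 0.5 * 0.5)  # binomial std dev for p = 0.5
    deviation = abs(counts.get("00", 0) - shots / 2)
    return stray == 0 and deviation <= n_sigma * sigma

sigma_1024 = math.sqrt(1024 * 0.25)
print(sigma_1024)  # 16.0

print(bell_counts_plausible({"00": 530, "11": 494}))  # within a few sigma
print(bell_counts_plausible({"00": 700, "11": 324}))  # far outside
```

With 1024 shots, a spread of a few tens of counts around 512 is ordinary sampling noise. On a noisy emulator or real hardware a small number of '01' and '10' outcomes is expected, so the stray-count check would need loosening.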

Checking What Superstaq Compiled

You can inspect the compiled circuit before submitting. This is one of the most useful features for understanding what Superstaq actually does.

# Compile only -- do not submit yet
compiled_result = provider.compile(qc, target="ionq_ion_simulator")

print(compiled_result.circuit.draw())

The output will show native IonQ gates instead of H and CX. The exact decomposition depends on the backend’s calibration data. On IonQ hardware, the Mølmer-Sørensen (MS) gate is the native two-qubit operation, so every CX gets rewritten as a sequence involving MS.

# Count gates before and after
print("Original CX count:", qc.count_ops().get('cx', 0))
print("Compiled gate counts:", compiled_result.circuit.count_ops())

For a Bell state the gate count stays roughly the same (there is not much to optimize in a two-gate circuit). On larger circuits the differences become significant.

Submitting the Same Circuit to Quantinuum

This is where cross-platform value becomes clear. Switch the backend string, and Superstaq handles the rest. The Quantinuum native gate set is completely different from IonQ’s, but your circuit does not change.

# Target the Quantinuum H2-1 emulator (no hardware access required)
quantinuum_backend = provider.get_backend("qtm_h2-1e_simulator")

job_q = quantinuum_backend.run(qc, shots=500)
result_q = job_q.result()
print(result_q.get_counts())
# {'00': ~250, '11': ~250}

Superstaq compiled the same Bell state circuit to Quantinuum’s native gate set (which uses ZZ interactions and U1q single-qubit rotations) automatically. You did not change a line of circuit code.

To confirm, inspect the compiled Quantinuum circuit:

compiled_q = provider.compile(qc, target="qtm_h2-1e_simulator")
print(compiled_q.circuit.draw())
# Gates will be different from the IonQ compiled version

Cirq Integration

If you prefer Google’s Cirq framework, Superstaq offers a parallel integration through the cirq-superstaq package. The workflow mirrors the Qiskit version closely: create a service object (instead of a provider), build a Cirq circuit, and submit or compile.

Setting Up the Cirq Service

import cirq
import cirq_superstaq

# Create the Superstaq service (reads SUPERSTAQ_API_KEY from environment)
service = cirq_superstaq.Service()

# List available backends
backends = service.get_targets()
for target in backends:
    print(target)

Building a Bell State in Cirq

import cirq
import cirq_superstaq

# Define qubits
q0, q1 = cirq.LineQubit.range(2)

# Build the Bell state circuit
circuit = cirq.Circuit([
    cirq.H(q0),           # Hadamard on qubit 0
    cirq.CNOT(q0, q1),    # CNOT: entangle qubits 0 and 1
    cirq.measure(q0, q1, key="result"),  # Measure both qubits
])

print(circuit)
# 0: ───H───@───M('result')───
#           │   │
# 1: ───────X───M─────────────

Submitting via Cirq

service = cirq_superstaq.Service()

# Submit to IonQ simulator
job = service.create_job(
    circuit,  # positional, so it works whether the parameter is named circuit or circuits
    target="ionq_ion_simulator",
    repetitions=1024,
)

print("Job ID:", job.job_id())

# Get results
result = job.counts()
print(result)
# {'00': 512, '11': 512}  -- approximate

Compiling via Cirq

# Compile for the Quantinuum emulator
compiled = service.compile(
    circuit,  # positional, so it works whether the parameter is named circuit or circuits
    target="qtm_h2-1e_simulator",
)

print(compiled.circuit)

The key difference between the Qiskit and Cirq interfaces:

In Qiskit, you use SuperstaqProvider() and call provider.get_backend() to get a backend object, then call backend.run().
In Cirq, you use Service() and call service.create_job() directly with a target string.

Both produce the same compiled circuits and submit to the same backends. Choose whichever framework you are more comfortable with.

Compile-Only Workflow: Comparing Backends Before Running

One of the most practical Superstaq workflows is compiling your circuit for multiple backends without running it. This lets you compare gate counts and circuit depths to make an informed decision about where to run, before spending money on hardware shots.

from qiskit import QuantumCircuit
import qiskit_superstaq

provider = qiskit_superstaq.SuperstaqProvider()

# Build a Bell state
qc = QuantumCircuit(2, 2)
qc.h(0)
qc.cx(0, 1)
qc.measure([0, 1], [0, 1])

# Compile for three different backends
targets = [
    "ionq_ion_simulator",
    "qtm_h2-1e_simulator",
    "ibmq_fez_qpu",
]

print(f"{'Backend':<30} {'Total Gates':<15} {'2Q Gates':<12} {'Depth':<8}")
print("-" * 65)

for target in targets:
    compiled = provider.compile(qc, target=target)
    cc = compiled.circuit

    # Remove measurement gates from the count
    ops = cc.count_ops()
    total_gates = sum(v for k, v in ops.items() if k != 'measure')

    # Count two-qubit gates (common names for native 2Q gates)
    two_qubit_names = {'cx', 'cz', 'ms', 'zzphase', 'zz', 'ecr', 'rzz'}
    two_q_count = sum(v for k, v in ops.items() if k.lower() in two_qubit_names)

    depth = cc.depth()

    print(f"{target:<30} {total_gates:<15} {two_q_count:<12} {depth:<8}")

This comparison table shows you which backend requires the fewest two-qubit gates (the noisiest operations) and the shallowest circuit depth. For a simple Bell state the differences are modest, but for larger circuits the variation across backends can be dramatic.

Three-Qubit GHZ State

The GHZ (Greenberger-Horne-Zeilinger) state extends entanglement to three or more qubits. Where a Bell state entangles two qubits into (|00⟩ + |11⟩)/√2, a three-qubit GHZ state produces (|000⟩ + |111⟩)/√2. This is a fundamental resource state for quantum error correction, quantum secret sharing, and multiparty entanglement experiments.
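The GHZ state can be reproduced with a tiny statevector simulation, which makes the role of each gate visible without any SDK. This is a from-scratch sketch using only the standard library; `apply_h` and `apply_cx` are helpers written for this example, and qubit 0 is the least significant bit of the state index.

```python
import math

def apply_h(state, q):
    """Apply a Hadamard to qubit q of a statevector."""
    s = 1 / math.sqrt(2)
    new = [0j] * len(state)
    for idx, amp in enumerate(state):
        if amp == 0:
            continue
        bit = (idx >> q) & 1
        i0 = idx & ~(1 << q)   # same index with qubit q set to 0
        i1 = idx | (1 << q)    # same index with qubit q set to 1
        new[i0] += s * amp
        new[i1] += s * amp * (-1 if bit else 1)
    return new

def apply_cx(state, control, target):
    """Apply a CNOT: flip the target bit wherever the control bit is 1."""
    new = [0j] * len(state)
    for idx, amp in enumerate(state):
        if (idx >> control) & 1:
            idx ^= 1 << target
        new[idx] += amp
    return new

state = [0j] * 8
state[0] = 1 + 0j              # start in |000>
state = apply_h(state, 0)      # superposition on qubit 0
state = apply_cx(state, 0, 1)  # entangle qubit 0 with qubit 1
state = apply_cx(state, 1, 2)  # extend entanglement to qubit 2

probs = {format(i, "03b"): round(abs(a) ** 2, 3)
         for i, a in enumerate(state) if abs(a) > 1e-12}
print(probs)  # {'000': 0.5, '111': 0.5}
```

Only the all-zeros and all-ones outcomes survive, each with probability one half, which is the signature to look for in the measured counts.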

Building the GHZ Circuit

from qiskit import QuantumCircuit
import qiskit_superstaq

# Three-qubit GHZ state
ghz = QuantumCircuit(3, 3)
ghz.h(0)          # Put qubit 0 in superposition
ghz.cx(0, 1)      # Entangle qubit 0 with qubit 1
ghz.cx(1, 2)      # Extend entanglement to qubit 2
ghz.measure([0, 1, 2], [0, 1, 2])

print(ghz.draw())
#      ┌───┐          ┌─┐
# q_0: ┤ H ├──■───────┤M├──────
#      └───┘┌─┴─┐     └╥┘┌─┐
# q_1: ─────┤ X ├──■───╫─┤M├───
#           └───┘┌─┴─┐ ║ └╥┘┌─┐
# q_2: ──────────┤ X ├─╫──╫─┤M├
#                └───┘ ║  ║ └╥┘
# c: 3/════════════════╩══╩══╩═
#                      0  1  2

Compiling for IonQ and Quantinuum

provider = qiskit_superstaq.SuperstaqProvider()

# Compile for IonQ
compiled_ionq = provider.compile(ghz, target="ionq_ion_simulator")
print("=== IonQ compiled circuit ===")
print(compiled_ionq.circuit.draw())
print("Gate counts:", compiled_ionq.circuit.count_ops())

# Compile for Quantinuum
compiled_q = provider.compile(ghz, target="qtm_h2-1e_simulator")
print("\n=== Quantinuum compiled circuit ===")
print(compiled_q.circuit.draw())
print("Gate counts:", compiled_q.circuit.count_ops())

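The GHZ circuit has two CNOT gates, and a three-qubit GHZ state cannot be prepared with fewer than two entangling gates, so each backend needs at least two native two-qubit gates. What Superstaq can optimize is the overhead around them: on platforms where the two-qubit gate has specific symmetries, it can fuse or reorder the surrounding single-qubit gates to reduce gate count and depth.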

On IonQ, each CNOT becomes an MS gate with surrounding single-qubit corrections. Because trapped-ion systems have all-to-all connectivity, no SWAP gates are needed even though qubit 0 must interact with qubit 1, and qubit 1 with qubit 2. The compiler can place these operations directly.

On Quantinuum, the CNOT chain becomes ZZPhase gates with Rz and PhasedX corrections. Quantinuum’s architecture also supports all-to-all connectivity, so no routing overhead is required.

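On superconducting platforms (IBM, Rigetti), the story can be different. This particular GHZ chain only couples qubits 0-1 and 1-2, so it maps onto any three physical qubits in a line without SWAPs. Circuits whose interacting qubits cannot all be placed next to each other (for example, one that also couples qubits 0 and 2 on a device whose coupling graph has no triangles) force the compiler to insert SWAP gates, increasing the two-qubit gate count. Superstaq’s routing algorithm selects the physical qubit mapping that minimizes this overhead by considering the chip’s topology and current calibration data.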

Gate Counting and Optimization Metrics

When evaluating circuit quality on near-term quantum hardware, three metrics matter most:

Total gate count: fewer gates means less accumulated error.
Two-qubit gate count: this is usually the single most important metric. Two-qubit gates have error rates roughly 10x to 100x higher than single-qubit gates on most platforms. All else being equal, a circuit with 5 two-qubit gates will give noticeably better results than one with 15.
Circuit depth: the number of sequential time steps. Shallower circuits finish faster, giving qubits less time to decohere. Gates that can execute in parallel (on different qubits) share a time step.
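Circuit depth is the least intuitive of the three metrics, so here is a small sketch of how it can be computed. Each gate is placed in the earliest time step after all of its qubits are free. The function `circuit_depth` is written for this example and mirrors the usual definition; it is not taken from any SDK.

```python
def circuit_depth(gates, n_qubits):
    """Depth of a circuit given as a list of qubit-index tuples in program order."""
    busy_until = [0] * n_qubits  # last occupied time step per qubit
    for qubits in gates:
        layer = max(busy_until[q] for q in qubits) + 1
        for q in qubits:
            busy_until[q] = layer
    return max(busy_until, default=0)

bell = [(0,), (0, 1)]                 # H on qubit 0, then CNOT(0, 1)
parallel = [(0,), (1,), (2,), (3,)]   # four gates on four different qubits
two_pairs = [(0, 1), (2, 3), (1, 2)]  # the first two gates share a time step

print(circuit_depth(bell, 2), circuit_depth(parallel, 4), circuit_depth(two_pairs, 4))
# 2 1 2
```

The third example shows why gate count and depth differ: three gates fit into two time steps because the first two act on disjoint qubits.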

Measuring Optimization on a Larger Circuit

To see meaningful optimization, you need a circuit with enough gates that the compiler has room to improve. Here is a 4-qubit circuit with several CNOT gates:

from qiskit import QuantumCircuit
import qiskit_superstaq

# Build a 4-qubit circuit with multiple CNOT layers
qc = QuantumCircuit(4, 4)

# Layer 1: single-qubit gates
qc.h(0)
qc.h(1)
qc.rx(0.5, 2)
qc.ry(1.2, 3)

# Layer 2: entangling gates
qc.cx(0, 1)
qc.cx(2, 3)

# Layer 3: more single-qubit gates
qc.rz(0.8, 0)
qc.rx(1.1, 1)
qc.h(2)
qc.s(3)

# Layer 4: more entangling gates
qc.cx(1, 2)
qc.cx(0, 3)

# Layer 5: final entangling layer
qc.cx(3, 0)
qc.cx(2, 1)

# Layer 6: a few more single-qubit gates before measurement
qc.h(0)
qc.rz(0.3, 1)
qc.rx(0.7, 2)
qc.h(3)

qc.measure([0, 1, 2, 3], [0, 1, 2, 3])

# Print original circuit stats
original_ops = qc.count_ops()
original_2q = sum(v for k, v in original_ops.items() if k in {'cx', 'cz', 'ecr'})
print(f"Original circuit: {sum(v for k, v in original_ops.items() if k != 'measure')} gates, "
      f"{original_2q} two-qubit gates, depth {qc.depth()}")

# Compile for multiple backends and compare
provider = qiskit_superstaq.SuperstaqProvider()

targets = [
    ("ionq_ion_simulator", "IonQ"),
    ("qtm_h2-1e_simulator", "Quantinuum"),
    ("ibmq_fez_qpu", "IBM Fez"),
]

print(f"\n{'Backend':<20} {'Total Gates':<15} {'2Q Gates':<12} {'Depth':<8} {'Reduction'}")
print("-" * 70)

for target, label in targets:
    compiled = provider.compile(qc, target=target)
    cc = compiled.circuit
    ops = cc.count_ops()

    total = sum(v for k, v in ops.items() if k != 'measure')
    two_qubit_names = {'cx', 'cz', 'ms', 'zzphase', 'zz', 'ecr', 'rzz'}
    two_q = sum(v for k, v in ops.items() if k.lower() in two_qubit_names)
    depth = cc.depth()

    print(f"{label:<20} {total:<15} {two_q:<12} {depth:<8} "
          f"{'+' if two_q > original_2q else ''}{two_q - original_2q} 2Q gates")

On trapped-ion backends (IonQ, Quantinuum), you often see the two-qubit gate count stay the same or decrease, because these platforms have all-to-all connectivity and need no SWAP gates. On superconducting backends, the count may increase due to routing overhead, but Superstaq minimizes the increase by choosing optimal qubit layouts.

The key takeaway: always check the two-qubit gate count after compilation. If one backend requires significantly fewer two-qubit gates for your circuit, that backend will likely give you higher-fidelity results, provided that other factors such as raw gate fidelity are similar.
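A rough way to turn gate counts into an expected success rate is to multiply the per-gate success probabilities. The sketch below assumes independent errors and uses illustrative error rates (0.1% per single-qubit gate, 1% per two-qubit gate) that are not measurements of any backend; `estimated_success` is a hypothetical helper.

```python
def estimated_success(n_1q, n_2q, err_1q=0.001, err_2q=0.01):
    """Crude success estimate: product of per-gate success probabilities."""
    return (1 - err_1q) ** n_1q * (1 - err_2q) ** n_2q

# Same single-qubit overhead, different two-qubit counts.
few = estimated_success(n_1q=20, n_2q=5)
many = estimated_success(n_1q=20, n_2q=15)
print(f"5 two-qubit gates:  {few:.3f}")
print(f"15 two-qubit gates: {many:.3f}")
```

The model ignores decoherence during idle time, measurement error, and correlated noise, so it is only useful for ranking compiled circuits, not for predicting absolute results.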

Checking Job Status Without Blocking

For longer jobs (hardware queues can be minutes to hours), check status without blocking:

job = backend.run(qc, shots=1024)

# Check status non-blocking
status = job.status()
print(status)  # e.g. JobStatus.QUEUED

# Later, when you are ready to get results
result = job.result()  # blocks until done

Job Management for Hardware Runs

When you submit circuits to real quantum hardware, jobs can spend minutes or hours in a queue. Your Python session might time out, your laptop might go to sleep, or you might simply want to check results the next day. Superstaq assigns each job a unique ID that you can use to retrieve results later.

Saving and Retrieving Job IDs

import json
import qiskit_superstaq

provider = qiskit_superstaq.SuperstaqProvider()
backend = provider.get_backend("ionq_aria-1_qpu")

# Submit a job
job = backend.run(qc, shots=1024)
job_id = job.job_id()
print(f"Job submitted: {job_id}")

# Save the job ID to a file for later retrieval
job_record = {
    "job_id": job_id,
    "backend": "ionq_aria-1_qpu",
    "shots": 1024,
    "circuit_description": "Bell state",
}

with open("superstaq_jobs.json", "w") as f:
    json.dump(job_record, f, indent=2)

print("Job ID saved to superstaq_jobs.json")

Retrieving a Job Later

import json
import qiskit_superstaq

# Load the saved job record
with open("superstaq_jobs.json", "r") as f:
    job_record = json.load(f)

# Reconnect to Superstaq
provider = qiskit_superstaq.SuperstaqProvider()
backend = provider.get_backend(job_record["backend"])

# Retrieve the job using the saved ID
job = backend.retrieve_job(job_record["job_id"])

# Check if it is done
status = job.status()
print(f"Job status: {status}")

if status.name == "DONE":
    result = job.result()
    print("Results:", result.get_counts())
else:
    print(f"Job is still {status.name}. Check back later.")

Polling Status Without Blocking

For scripts that need to wait for completion without locking up your terminal, use a polling loop with a reasonable interval:

import time
import qiskit_superstaq

provider = qiskit_superstaq.SuperstaqProvider()
backend = provider.get_backend("ionq_aria-1_qpu")
job = backend.run(qc, shots=1024)

# Poll every 60 seconds
max_wait_minutes = 120
polls = 0

while polls < max_wait_minutes:
    status = job.status()
    print(f"[{polls} min] Status: {status}")

    if status.name == "DONE":
        result = job.result()
        print("Results:", result.get_counts())
        break
    elif status.name in ("ERROR", "CANCELLED"):
        print(f"Job ended with status {status.name}. Check the Superstaq dashboard for details.")
        break

    time.sleep(60)
    polls += 1

if polls >= max_wait_minutes:
    print(f"Job did not complete within {max_wait_minutes} minutes.")
    print(f"Save job ID {job.job_id()} and check later.")
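A fixed 60-second interval wastes time on fast simulator jobs and polls more often than needed on long hardware queues. One alternative is exponential backoff. The helper below is a hypothetical sketch, not part of qiskit-superstaq: it takes any callable that returns a status name, so it can be exercised without a network connection.

```python
import time

def wait_for(get_status, done=("DONE",), failed=("ERROR", "CANCELLED"),
             timeout_s=7200.0, first_delay_s=5.0, max_delay_s=120.0,
             sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() with exponential backoff.

    Returns the final status name, or "TIMEOUT" if the deadline would pass
    before the next poll.
    """
    deadline = clock() + timeout_s
    delay = first_delay_s
    while True:
        name = get_status()
        if name in done or name in failed:
            return name
        if clock() + delay > deadline:
            return "TIMEOUT"
        sleep(delay)
        delay = min(delay * 2, max_delay_s)  # 5 s, 10 s, 20 s, ... capped

# Offline demonstration with a scripted sequence of statuses.
scripted = iter(["QUEUED", "RUNNING", "DONE"])
final = wait_for(lambda: next(scripted), sleep=lambda seconds: None)
print(final)
```

With a real job, the callable would be something like `lambda: job.status().name`, matching the status checks used in the loop above.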

Handling Failed Jobs

Hardware jobs can fail for various reasons: calibration cycles, hardware downtime, or queue timeouts. When a job fails, resubmit the circuit and update your saved record so it points at the new job:

status = job.status()
if status.name == "ERROR":
    print("Job failed.")
    print("Resubmitting to the same backend...")

    # Resubmit the circuit
    new_job = backend.run(qc, shots=1024)
    print(f"New job ID: {new_job.job_id()}")

    # Update your saved records
    job_record["job_id"] = new_job.job_id()
    with open("superstaq_jobs.json", "w") as f:
        json.dump(job_record, f, indent=2)
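Overwriting a single record loses the history of failed attempts. If you want to keep every submission, one option is an append-only log stored as a JSON list. This is a sketch with a hypothetical helper and a separate, made-up file name (superstaq_job_log.json) so it does not clash with the single-record file used above; the job IDs are placeholders.

```python
import json
import tempfile
from pathlib import Path

def append_job_record(path, record):
    """Append one record to a JSON list on disk and return the full list."""
    path = Path(path)
    records = json.loads(path.read_text()) if path.exists() else []
    records.append(record)
    path.write_text(json.dumps(records, indent=2))
    return records

# Demonstration in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "superstaq_job_log.json"
    append_job_record(log, {"job_id": "placeholder-1", "status": "ERROR"})
    records = append_job_record(log, {"job_id": "placeholder-2", "status": "QUEUED"})

print(len(records), records[-1]["job_id"])
```

The most recent entry is always the last element, so retrieving the current job only needs `records[-1]`.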

Real Hardware vs Simulators

Target	Type	Notes
`ionq_ion_simulator`	Ideal simulator	No noise, instant, free tier available
`ionq_aria-1_qpu`	Real hardware	25 qubits, queue time, per-shot cost
`ionq_forte-1_qpu`	Real hardware	36 qubits, queue time, per-shot cost
`qtm_h2-1e_simulator`	Noise emulator	Models H2-1 hardware noise
`qtm_h2-1_qpu`	Real hardware	56 qubits, Quantinuum H2 series
`cq_sqale_qpu`	Real hardware	Infleqtion neutral atom (Cs atoms)

Start with simulators. They are faster, cheaper, and let you verify your circuit logic before spending budget on hardware time.

Backend Selection Strategy

Choosing the right backend for your task depends on what you are trying to accomplish. Here is a structured approach:

Step 1: Are You Debugging or Running for Real?

If you are still developing and testing your circuit, use a simulator. Simulators are free (or very cheap), return results instantly, and let you iterate quickly. Good choices:

ionq_ion_simulator for ideal (noiseless) simulation
qtm_h2-1e_simulator for noisy simulation that models real hardware behavior

Only move to real hardware after your circuit produces correct results on a simulator.

Step 2: Do You Need Noise-Free Results or Realistic Results?

Ideal simulators (like ionq_ion_simulator) give you perfect, noiseless results. Use these to verify that your circuit logic is correct.
Noise emulators (like qtm_h2-1e_simulator) model the actual noise characteristics of hardware. Use these to estimate how your circuit will perform on real hardware before spending money.

Step 3: Choosing Hardware

When you are ready for real hardware, consider these factors:

For maximum accuracy, choose the backend where your compiled circuit has the fewest two-qubit gates. Use the compile-only workflow from the previous section to check this. Trapped-ion platforms (IonQ, Quantinuum) often win for circuits that need many distant qubit interactions, because their all-to-all connectivity avoids SWAP overhead.

For fastest results, check queue availability. Hardware queues vary by time of day and day of week. Superstaq’s dashboard or API can show current queue depths. If one platform has a 2-hour queue and another has a 20-minute queue, the faster queue might be worth a small fidelity trade-off for prototyping.

For lowest cost, consider the pricing model. IonQ charges per shot, Quantinuum bills in HQCs (H-System Quantum Credits) that scale with circuit complexity and shot count, and IBM offers a free tier with a limited monthly runtime allowance. For prototyping, free-tier simulators are always the cheapest option.

For large qubit counts, check the backend’s qubit count. As of mid-2026, IonQ Aria has 25 qubits, IonQ Forte has 36, and Quantinuum H2 has 56 (the 20-qubit H1 was retired in October 2025). If your circuit needs 30 qubits, some backends simply cannot run it.

Decision Summary

The decision flow in plain text:

Circuit still in development? Use ionq_ion_simulator (ideal, free).
Want to estimate real hardware performance? Use qtm_h2-1e_simulator (noisy, free/cheap).
Ready for hardware and need best accuracy? Compile to all backends, pick the one with fewest 2Q gates.
Ready for hardware and need fast turnaround? Check queue lengths, pick shortest.
Need many qubits? Check backend qubit counts, eliminate those that are too small.
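The decision flow above can be written down as a small filter. This is a sketch only: `shortlist` and `pick_fewest_two_qubit` are hypothetical helpers, the qubit counts are copied from the table in this tutorial and should be checked against current documentation, and the two-qubit gate counts in the example are invented.

```python
# Qubit counts copied from this tutorial; None means no limit is listed here.
BACKENDS = [
    {"target": "ionq_ion_simulator", "kind": "ideal_simulator", "qubits": None},
    {"target": "qtm_h2-1e_simulator", "kind": "noise_emulator", "qubits": None},
    {"target": "ionq_aria-1_qpu", "kind": "hardware", "qubits": 25},
    {"target": "ionq_forte-1_qpu", "kind": "hardware", "qubits": 36},
    {"target": "qtm_h2-1_qpu", "kind": "hardware", "qubits": 56},
]

KIND_FOR_STAGE = {
    "develop": "ideal_simulator",   # circuit still in development
    "estimate": "noise_emulator",   # estimating real hardware performance
    "hardware": "hardware",         # ready to run for real
}

def shortlist(backends, needed_qubits, stage):
    """Targets of the right kind that are large enough for the circuit."""
    kind = KIND_FOR_STAGE[stage]
    return [b["target"] for b in backends
            if b["kind"] == kind
            and (b["qubits"] is None or b["qubits"] >= needed_qubits)]

def pick_fewest_two_qubit(two_qubit_counts):
    """Given {target: compiled 2Q gate count}, return the target with the fewest."""
    return min(two_qubit_counts, key=two_qubit_counts.get)

candidates = shortlist(BACKENDS, needed_qubits=30, stage="hardware")
print(candidates)

# Invented compiled gate counts, standing in for the compile-only comparison.
best = pick_fewest_two_qubit({"ionq_forte-1_qpu": 42, "qtm_h2-1_qpu": 38})
print(best)
```

Queue length and cost are left out because they change from hour to hour; in practice they would be extra fields checked just before submission.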

How Infleqtion’s Own Hardware Fits In

Infleqtion builds neutral atom quantum computers using cesium atoms. Superstaq is Infleqtion’s compiler, so it has particularly good support for compiling to Infleqtion’s own hardware (cq_ targets). If you have access to Infleqtion hardware, use the same workflow:

infleqtion_backend = provider.get_backend("cq_sqale_qpu")
job = infleqtion_backend.run(qc, shots=200)
result = job.result()
print(result.get_counts())

Neutral Atom Advantages

Neutral atom quantum computers work on fundamentally different physics from superconducting and trapped-ion systems, and those differences create practical advantages for certain workloads.

The physics. Individual cesium atoms are held in place by optical tweezers: tightly focused laser beams that create microscopic potential wells. Each atom acts as a qubit, with two internal energy levels encoding |0⟩ and |1⟩. To perform a two-qubit gate, the atoms are excited to high-energy Rydberg states, where they interact strongly over relatively long distances.

All-to-all connectivity. Because the optical tweezers can physically move atoms, any qubit can be brought close enough to any other qubit to perform a direct two-qubit gate. This is a major advantage over superconducting chips, where qubits are fixed in place and can only interact with their nearest neighbors. On a superconducting chip, connecting two distant qubits requires a chain of SWAP gates; on a neutral atom system, you just move the atoms.

Mid-circuit measurement and qubit reuse. Neutral atom systems support measuring individual qubits in the middle of a circuit (not just at the end) and then reusing those qubits for further computation. This enables advanced protocols like quantum error correction, repeat-until-success circuits, and dynamic circuit patterns.

Connectivity Advantage in Practice

Consider a circuit where qubit 0 needs to interact with every other qubit:

from qiskit import QuantumCircuit
import qiskit_superstaq

# A "star" pattern: qubit 0 entangles with all others
star = QuantumCircuit(5)  # measure_all() below adds its own classical register
star.h(0)
star.cx(0, 1)
star.cx(0, 2)
star.cx(0, 3)
star.cx(0, 4)
star.measure_all()

print("Original circuit:")
print(f"  CNOT gates: {star.count_ops().get('cx', 0)}")
print(f"  Depth: {star.depth()}")

On a neutral atom system (or any all-to-all connected platform), this circuit needs exactly 4 two-qubit gates, because qubit 0 can interact directly with qubits 1, 2, 3, and 4.

On a superconducting chip with linear connectivity (each qubit couples only to its immediate neighbors), qubit 0 cannot reach qubits 2, 3, and 4 directly, so the compiler must insert SWAP gates. The compiled circuit might need 7 or more two-qubit gates instead of 4. Because two-qubit gates are typically the dominant error source, that is nearly double the two-qubit error budget.
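
The SWAP overhead on a line of qubits can be estimated with a toy router. The sketch below is not Superstaq's routing algorithm. It assumes a simple linear chain, moves the control qubit toward the target one SWAP at a time, and counts CNOTs two ways: a naive count (3 CNOTs per SWAP), and a count in which a SWAP that directly follows a CNOT on the same physical pair costs only one extra CNOT, because a CNOT followed by a SWAP on the same pair simplifies to two CNOTs. The function name `route_on_line` is hypothetical.

```python
def route_on_line(cx_pairs, num_qubits):
    """Return (naive_cnots, fused_cnots) for CNOTs routed on a linear chain."""
    pos = list(range(num_qubits))  # pos[logical qubit] = physical position
    at = list(range(num_qubits))   # at[physical position] = logical qubit
    naive = fused = 0
    last_cx = None  # physical pair used by the most recent CNOT, if still adjacent in time

    for control, target in cx_pairs:
        # Insert SWAPs until control and target sit on neighboring positions.
        while abs(pos[control] - pos[target]) > 1:
            step = 1 if pos[target] > pos[control] else -1
            p, q = pos[control], pos[control] + step
            naive += 3
            # CNOT followed by SWAP on the same pair is 2 CNOTs, so 1 extra.
            fused += 1 if frozenset((p, q)) == last_cx else 3
            last_cx = None
            a, b = at[p], at[q]
            at[p], at[q] = b, a
            pos[a], pos[b] = q, p
        naive += 1
        fused += 1
        last_cx = frozenset((pos[control], pos[target]))
    return naive, fused


star_pairs = [(0, 1), (0, 2), (0, 3), (0, 4)]
naive_count, fused_count = route_on_line(star_pairs, num_qubits=5)
print(f"naive: {naive_count} CNOTs, with CNOT+SWAP fusion: {fused_count} CNOTs")
```

For the star circuit this toy model gives 13 CNOTs with naive SWAP insertion and 7 when each SWAP is merged with the preceding CNOT, which is where a figure like 7 comes from. A real compiler may do better or worse depending on the device's actual coupling map.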

provider = qiskit_superstaq.SuperstaqProvider()

# Compile for a neutral atom target
compiled_na = provider.compile(star, target="cq_sqale_qpu")
na_ops = compiled_na.circuit.count_ops()
print(f"\nNeutral atom compiled: {na_ops}")

# Compare with a superconducting target
compiled_ibm = provider.compile(star, target="ibmq_fez_qpu")
ibm_ops = compiled_ibm.circuit.count_ops()
print(f"IBM compiled: {ibm_ops}")

The connectivity advantage grows with circuit size. For algorithms like QAOA (Quantum Approximate Optimization Algorithm), where the circuit structure mirrors a problem graph that may have many long-range connections, neutral atom and trapped-ion platforms can require dramatically fewer physical gates than superconducting alternatives.

Pulse-Level Optimization

Standard compilation works at the gate level: it decomposes your circuit into native gates and optimizes the sequence. But there is a deeper level of optimization available.

At the pulse level, instead of treating each gate as a fixed operation, Superstaq can fuse multiple gates into a single shaped microwave or laser pulse. This pulse performs the same unitary transformation as the gate sequence, but in less time and often with higher fidelity.

Why does this help? Each gate has overhead: the pulse must ramp up, maintain a calibrated amplitude, and ramp down. When you chain multiple gates together, the gaps between pulses contribute to decoherence. A single fused pulse eliminates those gaps.

Superstaq exposes pulse-level optimization through the method parameter in compile calls:

import qiskit_superstaq

provider = qiskit_superstaq.SuperstaqProvider()

# Standard gate-level compilation
compiled_gates = provider.compile(qc, target="ionq_ion_simulator")

# Request pulse-level (optimal control) compilation
compiled_pulses = provider.compile(
    qc,
    target="ionq_ion_simulator",
    method="optimized_compilation",
)

print("Gate-level compiled gates:", compiled_gates.circuit.count_ops())
print("Pulse-optimized compiled gates:", compiled_pulses.circuit.count_ops())

The pulse-optimized version may show fewer gates because multi-gate sequences have been fused into single operations. The actual pulse shapes are handled internally by Superstaq; you see the result as a circuit with fewer, potentially custom, gates.

Pulse-level optimization is most impactful for:

Circuits with many consecutive single-qubit gates on the same qubit
Circuits where gate cancellation is possible (e.g., an RZ followed by another RZ can be fused into a single rotation)
Performance-critical circuits where every microsecond of execution time matters
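
The RZ example in the list above can be made concrete. The sketch below works purely at the gate level on a hypothetical list of `(name, qubit, angle)` tuples. It is an illustration of rotation merging, not Superstaq's optimizer and not a pulse-level technique. It only merges rotations that are adjacent in the list, and it drops a merged rotation whose angle is a multiple of 2π, since that equals the identity up to a global phase.

```python
import math


def merge_rz(gates):
    """Merge list-adjacent rz gates on the same qubit; drop rotations by multiples of 2*pi."""
    merged = []
    for name, qubit, angle in gates:
        previous = merged[-1] if merged else None
        if name == "rz" and previous and previous[0] == "rz" and previous[1] == qubit:
            # Rotations about the same axis add: RZ(a) then RZ(b) equals RZ(a + b).
            merged[-1] = ("rz", qubit, previous[2] + angle)
        else:
            merged.append((name, qubit, angle))

    def is_trivial(gate):
        return gate[0] == "rz" and math.isclose(
            math.remainder(gate[2], 2 * math.pi), 0.0, abs_tol=1e-12
        )

    return [gate for gate in merged if not is_trivial(gate)]


gates = [
    ("rz", 0, math.pi / 4),
    ("rz", 0, math.pi / 4),
    ("h", 0, None),
    ("rz", 0, math.pi),
    ("rz", 0, math.pi),
]
optimized = merge_rz(gates)
print(optimized)
```

Here five gates shrink to two: the first pair of rotations becomes a single RZ(π/2), and the last pair sums to 2π and disappears.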

Note that pulse-level optimization is not available for all backends and may require specific access permissions. Check the Superstaq documentation for current backend support.

Common Mistakes

When working with Superstaq for the first time, these are the most frequent issues people encounter.

1. Forgetting to Set the API Key Before Creating the Provider

The Superstaq provider reads the SUPERSTAQ_API_KEY environment variable at initialization time. If you set it after creating the provider, it will not pick up the key.

# WRONG: setting the key after creating the provider
import qiskit_superstaq
provider = qiskit_superstaq.SuperstaqProvider()  # Fails here
import os
os.environ["SUPERSTAQ_API_KEY"] = "your-key"

# CORRECT: set the key first
import os
os.environ["SUPERSTAQ_API_KEY"] = "your-key"

import qiskit_superstaq
provider = qiskit_superstaq.SuperstaqProvider()  # Works
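
One way to catch this early is to check for the key yourself before constructing the provider. The helper below is a hypothetical convenience function, not part of qiskit_superstaq. It only inspects an environment mapping, so it assumes you supply the key through SUPERSTAQ_API_KEY rather than some other mechanism.

```python
import os


def require_superstaq_key(env=None):
    """Return the API key from the environment mapping, or fail with a clear message."""
    env = os.environ if env is None else env
    key = env.get("SUPERSTAQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SUPERSTAQ_API_KEY is not set; export it before creating the provider"
        )
    return key


# Demonstration with explicit mappings instead of the real environment.
print(require_superstaq_key({"SUPERSTAQ_API_KEY": "your-key"}))
try:
    require_superstaq_key({})
except RuntimeError as error:
    print(f"caught: {error}")
```

Calling `require_superstaq_key()` with no argument checks the real process environment, so placing it just before the provider is created turns a confusing authentication failure into an explicit message.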

2. Using Backend Names Incorrectly

Backend names must be spelled exactly right. Superstaq does not fuzzy-match backend names. A small typo will raise an error.

# WRONG: these will fail
backend = provider.get_backend("ionq-simulator")     # hyphen instead of underscore
backend = provider.get_backend("IonQ_Simulator")      # wrong capitalization
backend = provider.get_backend("qtm_h2-1e")           # missing _simulator suffix

# CORRECT: use exact names
backend = provider.get_backend("ionq_ion_simulator")
backend = provider.get_backend("qtm_h2-1e_simulator")

When in doubt, list all backends first:

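for b in provider.backends():
    # `name` is a method on older backend classes and a plain attribute on newer ones
    print(b.name() if callable(b.name) else b.name)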
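
Because Superstaq does not fuzzy-match names, you can add that step on the client side. The sketch below uses Python's standard `difflib` module to suggest close matches from a list of valid names. `check_backend_name` is a hypothetical helper, and the `available` list is hard-coded here for illustration; in real use you would build it from the backend listing.

```python
import difflib


def check_backend_name(name, available):
    """Return `name` if it is valid; otherwise raise with close-match suggestions."""
    if name in available:
        return name
    # Compare case-insensitively so a capitalization slip still yields a suggestion.
    lowered = {candidate.lower(): candidate for candidate in available}
    close = difflib.get_close_matches(name.lower(), list(lowered), n=3, cutoff=0.6)
    hint = ", ".join(lowered[match] for match in close) or "no close match"
    raise ValueError(f"unknown backend {name!r}; did you mean: {hint}?")


available = ["ionq_ion_simulator", "qtm_h2-1e_simulator"]

print(check_backend_name("ionq_ion_simulator", available))
for typo in ("ionq-simulator", "IonQ_Simulator"):
    try:
        check_backend_name(typo, available)
    except ValueError as error:
        print(error)
```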

3. Submitting to Hardware Without Testing on a Simulator First

Hardware runs cost money and can wait in long queues. Always verify your circuit on a simulator before moving to hardware.

# Step 1: test on simulator (fast, free)
sim_backend = provider.get_backend("ionq_ion_simulator")
sim_job = sim_backend.run(qc, shots=1024)
sim_result = sim_job.result()
print("Simulator results:", sim_result.get_counts())

# Step 2: only after confirming correct behavior, submit to hardware
hw_backend = provider.get_backend("ionq_aria-1_qpu")
hw_job = hw_backend.run(qc, shots=1024)

4. Expecting Noiseless Results from Emulators

The Quantinuum emulator targets (device names with an e suffix, like qtm_h2-1e_simulator) are noise emulators, not ideal simulators. They model the actual noise characteristics of the corresponding hardware. Your results will include noise-induced errors, just like real hardware.

# This emulator includes noise -- results will not be perfect
emulator = provider.get_backend("qtm_h2-1e_simulator")
job = emulator.run(qc, shots=1000)
result = job.result()
counts = result.get_counts()
# You might see: {'00': 480, '11': 475, '01': 22, '10': 23}
# The 01 and 10 counts are noise, not bugs in your circuit

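If you want noiseless results for debugging, use an ideal simulator such as ionq_ion_simulator. Even then, expect small statistical fluctuations between runs, because each job samples a finite number of shots.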
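
To put a number on the noise, you can compute the fraction of shots that landed on outcomes an ideal Bell state never produces. This is a small sketch using the illustrative counts from the comment above; `unexpected_fraction` is a hypothetical helper name, and the counts are the example values, not real emulator output.

```python
def unexpected_fraction(counts, expected):
    """Fraction of shots whose outcome is not in the set of expected bitstrings."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("counts is empty")
    unexpected = sum(n for outcome, n in counts.items() if outcome not in expected)
    return unexpected / total


# Illustrative counts for a Bell state on a noisy emulator.
counts = {"00": 480, "11": 475, "01": 22, "10": 23}
fraction = unexpected_fraction(counts, expected={"00", "11"})
print(f"Unexpected outcomes: {fraction:.1%}")
```

If this fraction is small and stable across runs, noise is the likely cause. If it is large, or the unexpected outcomes dominate, recheck the circuit on an ideal simulator.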

5. Modifying Compiled Circuits Before Submission

When you compile a circuit with Superstaq, the result is optimized for a specific backend’s native gate set and qubit topology. If you manually modify the compiled circuit (adding gates, removing gates, or changing qubit assignments), you can break the optimization and produce a circuit that does not run correctly on the target hardware.

compiled = provider.compile(qc, target="ionq_ion_simulator")

# DO NOT do this:
compiled.circuit.h(0)  # H is not a native IonQ gate, so the circuit is no longer in the target's gate set

# Instead, modify your ORIGINAL circuit and recompile:
qc.h(0)
compiled = provider.compile(qc, target="ionq_ion_simulator")
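
A quick sanity check makes the problem visible: compare the operation names in a circuit against the target's native gate set. The sketch below operates on a plain dictionary of operation counts, such as the one `count_ops()` returns. The gate-name spellings (`gpi`, `gpi2`, `ms`) and the example counts are assumptions for illustration, so print the real names from your compiled circuit before relying on them.

```python
def non_native_ops(op_counts, native_gates, ignore=("measure", "barrier")):
    """Return the operations in `op_counts` that are outside the native gate set."""
    allowed = set(native_gates) | set(ignore)
    return {name: count for name, count in op_counts.items() if name not in allowed}


# Assumed name spellings for IonQ's native gates (GPi, GPi2, MS).
ionq_native = {"gpi", "gpi2", "ms"}

# Made-up operation counts standing in for a compiled circuit.
compiled_counts = {"gpi2": 3, "gpi": 1, "ms": 1, "measure": 2}
# The same counts after someone appended an H gate by hand.
edited_counts = dict(compiled_counts, h=1)

print(non_native_ops(compiled_counts, ionq_native))
print(non_native_ops(edited_counts, ionq_native))
```

An empty result means every operation is native. Anything else, like the hand-added `h` here, signals that the circuit needs to go back through the compiler.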

6. Not Saving Job IDs for Long-Running Hardware Jobs

Hardware jobs can sit in a queue for hours. If your Python session ends before the job completes and you have not recorded the job ID, you have no handle left for retrieving the results from code.

# Always save the job ID immediately after submission
job = backend.run(qc, shots=1024)
job_id = job.job_id()
print(f"SAVE THIS: {job_id}")

# Better yet, write it to a file (see the Job Management section above)

Make saving job IDs to a file a habit, as shown in the Job Management section. It matters most for expensive jobs with many shots or complex circuits.

What to Try Next

Use the compile method to study how gate decompositions differ between backends before spending on hardware shots
Experiment with larger circuits (5+ qubits) to see more dramatic differences in compiled gate counts across backends
Look at Superstaq’s support for optimal control pulse-level compilation for deeper optimization
Try the QAOA (Quantum Approximate Optimization Algorithm) workflow, where cross-platform compilation can make a significant difference in circuit fidelity
See the Superstaq Reference for the full API and backend list
Compare with tket, another cross-platform compiler with a different approach to optimization