Quantum Resource Estimation: How Many Qubits Do You Actually Need?
How to estimate the physical qubit count, T-gate count, and runtime needed for fault-tolerant quantum algorithms, using the Azure Quantum Resource Estimator and manual calculations for Shor's algorithm and quantum chemistry.
Before investing in quantum hardware or building a business case around quantum advantage, you need to answer a concrete question: how many physical qubits and what runtime does my algorithm actually require on a fault-tolerant quantum computer?
The gap between “this algorithm is exponentially faster than classical” and “this algorithm can run on hardware you can buy” is measured in orders of magnitude. Resource estimation bridges that gap.
Why Resource Estimates Are Sobering
A first instinct is to think: if a quantum algorithm needs N logical qubits, and current hardware has N physical qubits, you are ready. This is wrong by several orders of magnitude.
Logical vs physical qubits. Fault-tolerant quantum computers encode each logical qubit in many physical qubits using an error-correcting code. The surface code with distance d uses approximately 2d^2 physical qubits per logical qubit. For a code distance needed to run Shor’s algorithm on RSA-2048 (d ~ 27), that is about 1,500 physical qubits per logical qubit.
T-gate overhead. Clifford gates (H, CNOT, S) can be applied directly to logical qubits fault-tolerantly. T gates (and any non-Clifford gate) require magic state distillation: a factory that consumes many physical qubits and multiple rounds of error correction to produce a single high-fidelity T state. A single T-state factory occupies roughly 1,000-10,000 physical qubits (depending on the distillation protocol and code distance) and tens of error-correction rounds per output T state.
The dominant cost is T-gate factories. For algorithms with many T gates, the distillation factories often need more qubits than the algorithm itself. The algorithm that “needs 100 logical qubits” may actually need 50,000-500,000 physical qubits when you add distillation overhead.
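The arithmetic behind these overheads is worth making concrete. The sketch below uses the ~2d² surface-code cost quoted above; the factory count and per-factory footprint are illustrative assumptions, not figures for any specific machine:

```python
def physical_per_logical(d: int) -> int:
    """Surface code: ~2*d^2 physical qubits per logical qubit."""
    return 2 * d * d

# Code distance in the RSA-2048 ballpark quoted above
print(physical_per_logical(27))  # 1458 physical qubits per logical qubit

# An algorithm with 100 logical qubits plus a handful of T-state factories
logical = 100
factories = 10
factory_size = 5_000  # assumed mid-range factory footprint
total = logical * physical_per_logical(27) + factories * factory_size
print(f"{total:,}")  # 195,800 physical qubits
```

Even before any distillation overhead, the error-correction encoding alone turns 100 logical qubits into ~150,000 physical qubits; the factories push the total toward the 50,000-500,000 range cited above.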
Resources Required for Shor’s Algorithm
Factoring an n-bit number with Shor’s algorithm requires approximately:
| Component | Formula | RSA-2048 (n=2048) |
|---|---|---|
| Logical data qubits | ~2n | ~4,096 |
| Ancilla qubits | ~n | ~2,048 |
| T-gate count | ~n^3 | ~8.6 billion |
| Physical qubits (surface code d ~ 27) | ~3n * 2d^2 | ~9 million |
| Runtime (1 microsecond gates) | ~T-count * round_time | ~weeks to months |
This is why cryptographers are not panicking yet. Running Shor’s algorithm on RSA-2048 requires millions of physical qubits with error rates well below current hardware capabilities.
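The table's entries follow from plugging n = 2048 into the rough formulas; counting both data and ancilla qubits, a quick check looks like this (all numbers are order-of-magnitude estimates, not a rigorous count):

```python
n = 2048                   # RSA-2048 modulus size in bits
logical_data = 2 * n       # ~4,096 logical data qubits
ancilla = n                # ~2,048 ancilla qubits
t_count = n ** 3           # ~8.6 billion T-gates

d = 27                     # surface code distance
physical = (logical_data + ancilla) * 2 * d ** 2

print(f"T-count: {t_count:,}")           # 8,589,934,592
print(f"Physical qubits: {physical:,}")  # 8,957,952 (~9 million)
```

Published estimates vary by a factor of a few depending on the circuit construction and how routing and ancilla overhead are counted, but the conclusion is stable: millions of physical qubits.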
Estimation with Azure Quantum Resource Estimator
Microsoft’s Azure Quantum Resource Estimator (QRE) is the most user-friendly tool for this. It takes a Qiskit or Q# circuit as input and outputs:
- Total physical qubit count
- Breakdown: algorithm qubits vs T-factory qubits
- Runtime
- Required physical error rate
Python Example: Estimating Resources for a Circuit
```python
# Install: pip install "azure-quantum[qiskit]"
from azure.quantum import Workspace
from azure.quantum.qiskit import AzureQuantumProvider
from qiskit import QuantumCircuit
from qiskit.circuit.library import QFT

# A 10-qubit quantum Fourier transform circuit
n_qubits = 10
qc = QuantumCircuit(n_qubits)
qc.compose(QFT(n_qubits), inplace=True)
qc.measure_all()

# Configure the resource estimator target
workspace = Workspace(
    resource_id="...",  # your Azure Quantum workspace resource ID
    location="eastus",  # your workspace region
)
provider = AzureQuantumProvider(workspace)
backend = provider.get_backend("microsoft.estimator")

# Submit the estimation job; the estimator handles decomposition itself
job = backend.run(qc)
result = job.result()

# Print summary (runtime is reported in nanoseconds)
data = result.data()
print(f"Physical qubits: {data['physicalCounts']['physicalQubits']:,}")
print(f"Runtime (ns): {data['physicalCounts']['runtime']:,}")
print(f"rQOPS: {data['physicalCounts']['rqops']:,}")  # reliable quantum operations per second
```
Without Azure credentials, you can use the resource estimator via the Azure Quantum portal with a free account.
Manual Resource Estimation
For rough estimates without the full toolchain, use this framework:
Step 1: Count T-gates
Decompose your circuit into Clifford + T gates. The T-count dominates cost. Useful approximations:
- Arbitrary single-qubit rotation to precision epsilon: ~ 3 * log2(1/epsilon) T-gates
- Toffoli gate: 7 T-gates (with ancilla)
- Controlled rotation: ~same as arbitrary rotation
- CNOT, H, S: 0 T-gates (Clifford)
For a VQE circuit with p parameters and L layers:
- Each RZ(theta) rotation costs ~3 * log2(1/epsilon) T-gates
- Total T-count: ~3 * p * L * log2(1/epsilon)
- At epsilon = 0.001, 100 parameters, and 10 layers: ~30,000 T-gates
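These counting rules are easy to mechanize. A minimal sketch using the approximations listed above (the helper names `rotation_t_cost` and `vqe_t_count` are mine, for illustration):

```python
import math

def rotation_t_cost(epsilon: float) -> int:
    """~3 * log2(1/epsilon) T-gates per arbitrary rotation (synthesis cost)."""
    return math.ceil(3 * math.log2(1 / epsilon))

def vqe_t_count(n_params: int, n_layers: int, epsilon: float) -> int:
    """Every parameterized rotation synthesized to precision epsilon."""
    return n_params * n_layers * rotation_t_cost(epsilon)

TOFFOLI_T = 7  # standard Clifford+T decomposition with ancilla

print(rotation_t_cost(1e-3))       # 30 T-gates per rotation
print(vqe_t_count(100, 10, 1e-3))  # 30000 T-gates
```

Note how weakly the cost depends on precision: tightening epsilon from 10^-3 to 10^-6 only doubles the per-rotation T-count.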
Step 2: Choose Code Distance
The logical error rate per T-gate must be well below 1/T_count so the whole circuit succeeds with high probability. For T_count = 10^9:
- Required logical error rate: < 10^-10 per gate
- Surface code at physical error rate p with threshold p_th: epsilon_L ~ 100 * (p/p_th)^((d+1)/2)
- Set this < 10^-10 and solve for d: d ~ 23-25 at p = 0.001; better hardware (smaller p/p_th) shrinks d
Step 3: Count Physical Qubits
```python
import math

def estimate_resources(
    n_logical: int,                 # logical data qubits
    t_count: int,                   # total T-gate count
    physical_error: float = 0.001,  # physical error rate p
    code_threshold: float = 0.01,   # surface code threshold p_th (~1%)
) -> dict:
    # Budget ~1/3 total failure probability across all T-gates:
    # require epsilon_L ~ 100 * (p/p_th)^((d+1)/2) < 1 / (3 * t_count)
    target_logical_error = 1 / (3 * t_count)
    r = physical_error / code_threshold
    d = 3  # surface code distance is odd; start at the minimum
    while 100 * r ** ((d + 1) / 2) > target_logical_error:
        d += 2
    physical_per_logical = 2 * d**2  # surface code overhead
    algorithm_qubits = n_logical * physical_per_logical
    # T-factory: one small factory (~1,500 physical qubits) used sequentially.
    # In practice many factories run in parallel (space-time tradeoff).
    factory_qubits = 1500
    total_qubits = algorithm_qubits + factory_qubits
    # Runtime: a single factory delivers one T state per ~10*d
    # error-correction cycles of ~1 microsecond each
    runtime_s = t_count * 10 * d / 1e6
    return {
        "code_distance": d,
        "algorithm_physical_qubits": algorithm_qubits,
        "factory_physical_qubits": factory_qubits,
        "total_physical_qubits": total_qubits,
        "runtime_seconds": runtime_s,
    }

# VQE for a small molecule (10 logical qubits, ~100,000 T-gates)
r1 = estimate_resources(n_logical=10, t_count=100_000)
print("VQE (small molecule):")
for k, v in r1.items():
    print(f"  {k}: {v:,}")
print()

# Shor's algorithm for RSA-2048
r2 = estimate_resources(n_logical=6_000, t_count=8_600_000_000)
print("Shor RSA-2048:")
for k, v in r2.items():
    print(f"  {k}: {v:,}")
```

Sample output:

```
VQE (small molecule):
  code_distance: 15
  algorithm_physical_qubits: 4,500
  factory_physical_qubits: 1,500
  total_physical_qubits: 6,000
  runtime_seconds: 15.0

Shor RSA-2048:
  code_distance: 25
  algorithm_physical_qubits: 7,500,000
  factory_physical_qubits: 1,500
  total_physical_qubits: 7,501,500
  runtime_seconds: 2,150,000.0
```

With a single T-factory, the Shor estimate works out to roughly 25 days of runtime; running factories in parallel shortens this at the cost of more qubits.
Reducing Resource Requirements
Reduce T-count
- T-count optimization: compiler passes such as ZX-calculus-based rewriting (e.g., in PyZX or TKET) can reduce T-counts by 10-50% for many circuits
- Alternative decompositions: some rotations can be implemented with fewer T-gates using ancilla qubits (Repeat-Until-Success circuits)
- Algorithm redesign: some algorithms have high-T-count variants and low-T-count variants (e.g., Hamiltonian simulation can use either LCU or Trotterization)
Reduce Qubit Count via Space-Time Tradeoffs
Running more T-factories in parallel reduces runtime at the cost of more qubits. Running fewer factories reduces qubit count but increases runtime. The optimal balance depends on which resource is more constrained.
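A sketch of this tradeoff, assuming each factory occupies ~1,500 physical qubits and produces one T state per 250 microseconds (illustrative numbers; the function `tradeoff` and its parameters are mine):

```python
def tradeoff(t_count: int, n_factories: int,
             factory_qubits: int = 1_500,      # assumed per-factory footprint
             t_state_time_us: float = 250.0):  # assumed per-T-state latency
    """More factories: more qubits, proportionally less T-state wall time."""
    qubits = n_factories * factory_qubits
    hours = t_count * t_state_time_us / n_factories / 1e6 / 3600
    return qubits, hours

t = 8_600_000_000  # Shor RSA-2048 ballpark T-count
for k in (1, 10, 100):
    q, h = tradeoff(t, k)
    print(f"{k:>4} factories: {q:>8,} factory qubits, {h:>9,.1f} hours")
```

With these assumptions, going from 1 to 100 parallel factories cuts the T-state wall time from roughly 600 hours to 6, at the cost of 150,000 factory qubits. Whether that trade is worth it depends on whether qubits or calendar time is the scarcer resource.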
Better Codes for Specific Hardware
- Biased noise hardware: cat qubits or XZZX surface code can achieve the same logical error rate with smaller distance
- Color codes: implement all Clifford gates transversally, reducing Clifford overhead for some architectures
- Concatenated codes: may be competitive with surface codes for certain error models
What Resource Estimation Tells You
A resource estimate tells you:
- Whether your algorithm is relevant for near-term hardware (< 10 years) or requires large-scale fault-tolerant systems (10-30+ years)
- Which component dominates cost (T-factories vs data qubits vs routing)
- How sensitive the estimate is to physical error rate improvements
- Whether there are architectural choices that can reduce the requirements
For most practical quantum chemistry problems (molecules up to 100 electrons), current estimates require millions of physical qubits with error rates 10x better than today’s best hardware. The timeline to practical quantum advantage on those problems is measured in decades, not years.
For smaller, well-structured optimization problems, the estimate is more optimistic: with error rates 5-10x better than current hardware, problems with 50-100 logical qubits and moderate T-counts could run on systems with 100,000-500,000 physical qubits; potentially achievable in 5-10 years.