Quantum Phase Estimation in Qiskit

What Quantum Phase Estimation Does

Given a unitary operator U and one of its eigenstates |psi>, quantum phase estimation (QPE) finds the phase theta such that:

U|psi> = e^{2*pi*i*theta} |psi>

The value theta is a real number between 0 and 1. Classical computers cannot efficiently compute this phase for large unitaries. QPE uses quantum parallelism to extract it by encoding theta into the amplitudes of an ancilla register, then reading it out via an inverse quantum Fourier transform.

QPE is the engine behind two of the most important quantum algorithms: Shor’s algorithm (which uses it to find the order of a modular exponentiation) and quantum chemistry simulations (which use it to extract ground state energies through phase kickback).

Circuit Structure

The QPE circuit has three stages:

1. Ancilla register in superposition. Prepare n ancilla qubits in the uniform superposition |+>^n by applying a Hadamard to each. These n bits determine the precision of the phase estimate: you recover theta to n binary places, meaning the error is at most 1/2^n.

2. Controlled-U gates. Apply controlled-U^(2^k) for k = 0, 1, …, n-1, where each ancilla qubit k controls the k-th power of U acting on the target register. This encodes the phase into the ancilla via phase kickback:

controlled-U^(2^k) (|+> |psi>) = (|0> + e^{2*pi*i*2^k*theta} |1>) |psi>
                                               (up to normalization)

3. Inverse QFT and measurement. The ancilla register now holds a state whose phases are the binary fractions of theta. Applying the inverse QFT converts these phases into computational basis amplitudes, so measuring the ancilla yields the binary representation of theta.

Full QPE circuit:

|0>^n  --[H^n]-- [ctrl-U^1] [ctrl-U^2] ... [ctrl-U^{2^{n-1}}] --[QFT+]-- [Measure]
|psi>  -----------------------------------------------U^(2^k) gates-----------------

Implementing QPE for the T Gate

The T gate has eigenvalue e^{ipi/4} = e^{2pii(1/8)}, so theta = 1/8 = 0.125 = 0.001 in binary. With 3 ancilla qubits, QPE should recover this exactly.

from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator
from qiskit.circuit.library import QFT
import numpy as np

def build_qpe_t_gate(n_ancilla=3):
    """
    QPE for the T gate: eigenvalue e^{i*pi/4}, phase = 1/8.
    Target register: 1 qubit initialized to |1> (eigenstate of T with phase 1/8).
    Ancilla register: n_ancilla qubits (precision bits).
    """
    n_total = n_ancilla + 1
    qc = QuantumCircuit(n_total, n_ancilla)

    # Target qubit is q[0]; ancilla is q[1..n_ancilla]
    # Initialize target in eigenstate |1>
    qc.x(0)

    # Stage 1: Hadamard on all ancilla qubits
    for k in range(1, n_ancilla + 1):
        qc.h(k)

    qc.barrier()

    # Stage 2: Controlled-T^(2^k) gates
    # Ancilla qubit k (index k in circuit, ancilla index k-1) controls T^(2^{k-1})
    # T^m = Rz(m * pi/4)
    for k in range(1, n_ancilla + 1):
        power = 2 ** (k - 1)
        angle = power * np.pi / 4
        # Controlled-Rz(angle) controlled on ancilla qubit k, targeting qubit 0
        qc.cp(angle, k, 0)

    qc.barrier()

    # Stage 3: Inverse QFT on ancilla register
    # QFT acts on qubits 1..n_ancilla; we need inverse
    qft_inv = QFT(n_ancilla, inverse=True)
    qc.append(qft_inv, range(1, n_ancilla + 1))

    qc.barrier()

    # Measure ancilla register into classical bits
    qc.measure(range(1, n_ancilla + 1), range(n_ancilla))

    return qc

qc = build_qpe_t_gate(n_ancilla=3)
print(qc.decompose().draw(fold=80))

sim = AerSimulator()
compiled = transpile(qc, sim)
result = sim.run(compiled, shots=1024).result()
counts = result.get_counts()
print(counts)

The dominant measurement result should be 001 (reading the classical bits in order). Converting this to a phase:

def bitstring_to_phase(bitstring):
    """
    Convert a measured bitstring to the estimated phase theta in [0, 1).
    The first bit is the most significant bit of the binary fraction.
    """
    n = len(bitstring)
    value = 0
    for i, bit in enumerate(bitstring):
        value += int(bit) * (2 ** -(i + 1))
    return value

# Most likely measurement
top_result = max(counts, key=counts.get)
estimated_phase = bitstring_to_phase(top_result)
print(f"Measured bitstring: {top_result}")
print(f"Estimated phase:    {estimated_phase:.4f}")
print(f"True phase (1/8):   {0.125:.4f}")
# Output:
# Measured bitstring: 001
# Estimated phase:    0.1250
# True phase (1/8):   0.1250

The bitstring 001 means 0/2 + 0/4 + 1/8 = 0.125, which is an exact match because 1/8 is exactly representable in 3 binary digits.

Precision and Ancilla Count

The number of ancilla qubits n determines both precision and circuit depth:

Ancilla qubits	Phase precision	Max error
3	1/8 = 0.125	0.0625
4	1/16 = 0.0625	0.03125
8	1/256	~0.004
12	1/4096	~0.0002

For chemistry applications, you typically need 10 to 20 ancilla qubits to recover energies in chemical accuracy (1 kcal/mol ~ 1.6 mHartree). More ancilla also means more controlled-U gates, which are often expensive to decompose.

If the phase is not exactly representable in n bits, the measurement outcome is probabilistic across neighboring binary fractions. The inverse QFT concentrates probability mass on the two nearest representable fractions.

Running on Statevector Simulator

For exact amplitudes without shot noise, use the statevector simulator:

from qiskit_aer import AerSimulator
from qiskit import transpile

# Build without measurements for statevector
def build_qpe_no_measure(n_ancilla=3):
    qc = QuantumCircuit(n_ancilla + 1)
    qc.x(0)
    for k in range(1, n_ancilla + 1):
        qc.h(k)
    for k in range(1, n_ancilla + 1):
        power = 2 ** (k - 1)
        angle = power * np.pi / 4
        qc.cp(angle, k, 0)
    qft_inv = QFT(n_ancilla, inverse=True)
    qc.append(qft_inv, range(1, n_ancilla + 1))
    qc.save_statevector()
    return qc

sv_sim = AerSimulator(method="statevector")
qc_sv = build_qpe_no_measure()
result = sv_sim.run(transpile(qc_sv, sv_sim)).result()
sv = result.get_statevector()

# Print the probability of each ancilla state (summed over target qubit).
# In Qiskit's statevector, qubit 0 is the LSB of the index.
# Qubit 0 is the target; qubits 1..n_ancilla are the ancilla.
# ancilla_idx = idx >> 1 gives bits from position 1 onward (qubits 1,2,3).
# The measurement maps qubit k -> creg[k-1], so creg[0]=qubit1 is the LSB
# of the printed bitstring. Reverse ancilla_idx bits to get measurement order.
import numpy as np
n_ancilla = 3
probs = np.abs(sv.data) ** 2
print("Ancilla state probabilities (ancilla is upper register):")
for ancilla_idx in range(2 ** n_ancilla):
    prob_state = sum(probs[(ancilla_idx << 1) | q0] for q0 in [0, 1])
    if prob_state > 0.001:
        # Convert ancilla_idx to measurement bitstring: bit 0 of ancilla_idx
        # corresponds to qubit 1 = creg[0] (LSB of bitstring).
        meas_bits = format(ancilla_idx, f"0{n_ancilla}b")[::-1]
        print(f"  |{meas_bits}> : {prob_state:.4f}")

Connection to Shor’s Algorithm

Shor’s algorithm for factoring N reduces to finding the period r of f(x) = a^x mod N. The period is encoded as the phase 1/r of a modular exponentiation unitary. QPE extracts this phase, and continued fraction expansion converts the fractional approximation into an exact rational p/q, giving the period r = q.

The key bottleneck is implementing controlled-U^k = controlled-(a^{2^k} mod N) as a quantum circuit. This requires modular arithmetic circuits, which is why Shor’s algorithm needs thousands of logical qubits for practically useful key sizes.

Connection to VQE

The Variational Quantum Eigensolver estimates ground state energies using a variational approach that avoids the deep circuits required by QPE. QPE is the more accurate alternative when fault-tolerant hardware is available: it directly extracts the eigenphase corresponding to the ground state energy. This is why QPE is considered a fault-tolerant algorithm and VQE a NISQ algorithm.

Limitations

Deep circuits. Each controlled-U^(2^k) gate may itself require many elementary gates. For chemistry, these circuits are thousands to millions of gates deep, far beyond current hardware.

State preparation. QPE requires the target register to be initialized close to the eigenstate |psi>. If your initial state has low overlap with the target eigenstate, QPE succeeds only with proportionally low probability.

Phase ambiguity. QPE recovers theta mod 1. If there are multiple eigenvalues, the measurement collapses to one of them with probability equal to the overlap of the initial state with the corresponding eigenstate.

No mid-circuit reuse without reset. Each ancilla qubit must remain coherent for the duration of the full circuit, which grows with n.

Summary

Property	Value
Problem	Estimate eigenphase theta of unitary U
Key subroutines	Controlled-U^k, inverse QFT
Precision	1/2^n for n ancilla qubits
Circuit depth	O(n) QFT + O(n) controlled-U layers
Applications	Shor’s algorithm, quantum chemistry, quantum simulation
Hardware requirement	Fault-tolerant (deep circuits)

QPE sits at the heart of fault-tolerant quantum computing. Mastering it is the gateway to understanding both Shor’s algorithm and the long-term promise of quantum chemistry simulation.