Qiskit Advanced Free 33/61 in series 55 minutes

Quantum Phase Estimation: Deep Dive and Applications

A thorough treatment of quantum phase estimation covering circuit construction, precision trade-offs, the Hadamard test variant, iterative QPE, and applications in eigenvalue estimation, the HHL algorithm, and quantum simulation.

What you'll learn

  • quantum phase estimation
  • QPE
  • Hadamard test
  • eigenvalues
  • HHL
  • quantum simulation

Prerequisites

  • Strong Python skills
  • Solid quantum computing foundations
  • Linear algebra and complex numbers

What QPE Does

Quantum phase estimation (QPE) solves the following problem: given a unitary operator U and one of its eigenstates |u⟩, estimate the eigenvalue’s phase θ where U|u⟩ = e^(2πiθ)|u⟩ and θ ∈ [0, 1).

This is more useful than it might sound. Eigenvalues of unitaries directly correspond to energies in quantum simulation (via U = e^(-iHt)), to periods in Shor’s factoring algorithm, and to matrix condition numbers in the HHL linear-systems algorithm. QPE is one of the most reused subroutines in the quantum algorithm toolkit.

Circuit Construction

The standard QPE circuit uses:

  • t ancilla (counting) qubits, which control precision.
  • 1 or more qubits holding the eigenstate |u⟩.

Steps:

  1. Initialize counting qubits to |0⟩^t and eigenstate register to |u⟩.
  2. Apply Hadamard to all t counting qubits.
  3. Apply controlled-U^(2^k) from counting qubit k to the eigenstate register, for k = 0, 1, ..., t-1.
  4. Apply the inverse quantum Fourier transform (QFT†) to the counting register.
  5. Measure the counting register. The result is the binary fraction approximation of θ.

After step 2 the counting register holds the uniform superposition. Each controlled-U^(2^k) gate accrues a phase e^(2πi * 2^k * θ) onto qubit k via phase kickback (since U^(2^k)|u⟩ = e^(2πi * 2^k * θ)|u⟩). The counting register therefore encodes θ in its phase pattern. The inverse QFT transforms this phase pattern into a computational basis state encoding the binary representation of θ.

Full Mathematical Derivation

This section derives the QPE algorithm step by step, tracking the full quantum state at each stage.

Initial state

The system begins in the product state:

|Ψ_0⟩ = |0⟩^⊗t ⊗ |u⟩

where |u⟩ is an eigenstate of U with eigenvalue e^(2πiθ).

After Hadamard gates on the counting register

Applying H^⊗t to the counting register produces:

|Ψ_1⟩ = (1/√(2^t)) Σ_{k=0}^{2^t - 1} |k⟩ ⊗ |u⟩

Each basis state |k⟩ represents a t-bit integer. The counting register is now a uniform superposition of all 2^t computational basis states.

After controlled unitary applications

The circuit applies controlled-U^(2^j) from counting qubit j to the eigenstate register, for j = 0, 1, ..., t-1. Consider a single basis state |k⟩ in the superposition, where k = k_{t-1} k_{t-2} ... k_1 k_0 in binary. The controlled-U^(2^j) gate applies U^(2^j) when bit k_j = 1, contributing a phase e^(2πi * 2^j * θ * k_j). The combined action of all controlled gates on |k⟩ ⊗ |u⟩ is:

|k⟩ ⊗ |u⟩  →  e^(2πiθ * Σ_{j=0}^{t-1} k_j * 2^j) |k⟩ ⊗ |u⟩
         =  e^(2πiθk) |k⟩ ⊗ |u⟩

since Σ_{j=0}^{t-1} k_j * 2^j = k by the binary representation. The full state is therefore:

|Ψ_2⟩ = (1/√(2^t)) Σ_{k=0}^{2^t - 1} e^(2πiθk) |k⟩ ⊗ |u⟩

This is the central result: the counting register now holds the state (1/√(2^t)) Σ_k e^(2πiθk) |k⟩, which is exactly the quantum Fourier transform of the state |2^t θ⟩ (or the closest integer to it).

After inverse QFT

The quantum Fourier transform maps computational basis states to phase-encoded states:

QFT|m⟩ = (1/√(2^t)) Σ_{k=0}^{2^t - 1} e^(2πi mk / 2^t) |k⟩

Comparing with the counting register state in |Ψ_2⟩, we see it equals QFT|m⟩ when θ = m/2^t exactly. Applying the inverse QFT recovers:

QFT†: (1/√(2^t)) Σ_{k=0}^{2^t - 1} e^(2πiθk) |k⟩  →  |m⟩   (when θ = m/2^t)

The final state is |m⟩ ⊗ |u⟩, and measuring the counting register yields m with certainty. The phase estimate is θ_est = m / 2^t.

Non-exact phases

When θ ≠ m/2^t for any integer m, the inverse QFT does not produce a single basis state. Instead, the measurement outcome is probabilistic. The probability of measuring integer m is:

P(m) = (1/2^(2t)) |Σ_{k=0}^{2^t - 1} e^(2πi(θ - m/2^t)k)|^2

Evaluating this geometric sum gives the closed-form expression:

P(m) = sin²(π(2^t θ - m)) / (2^(2t) sin²(π(θ - m/2^t)))

This is a sinc-like distribution centered at m* = round(2^t θ). The closest integer m* has probability at least 4/π² ≈ 0.405, and the two closest integers together capture at least 8/π² ≈ 0.81 of the probability. Adding extra qubits beyond the log2(1/ε) minimum increases the success probability exponentially.

Phase Kickback: Detailed Derivation

Phase kickback is the mechanism that makes QPE work. It transfers eigenvalue phase information from the target register onto the control qubit, without disturbing the eigenstate.

Statement

Let U be a unitary with eigenvector |u⟩ and eigenvalue e^(iφ):

U|u⟩ = e^(iφ)|u⟩

Consider a controlled-U gate with the control qubit in state (|0⟩ + |1⟩)/√2 and the target register in |u⟩. The joint state evolves as:

(|0⟩ + |1⟩)/√2 ⊗ |u⟩
→ (|0⟩ ⊗ I|u⟩ + |1⟩ ⊗ U|u⟩) / √2
= (|0⟩ ⊗ |u⟩ + |1⟩ ⊗ e^(iφ)|u⟩) / √2
= (|0⟩ + e^(iφ)|1⟩) / √2  ⊗  |u⟩

The phase e^(iφ) has “kicked back” from the target onto the control qubit. The target register remains in |u⟩, completely undisturbed.

Why the eigenstate is not disturbed

The controlled-U gate acts as |0⟩⟨0| ⊗ I + |1⟩⟨1| ⊗ U. When the target is an eigenstate, U acts as scalar multiplication, so the target factors out of the tensor product. This factorization is what allows the phase to appear solely on the control qubit. If the target were not an eigenstate, entanglement between control and target would prevent this clean separation.

Concrete example: T gate and |1⟩

The T gate has the matrix diag(1, e^(iπ/4)). Its eigenstates are |0⟩ (eigenvalue 1) and |1⟩ (eigenvalue e^(iπ/4)). Consider controlled-T with control in |+⟩ = (|0⟩ + |1⟩)/√2 and target in |1⟩:

(|0⟩ + |1⟩)/√2 ⊗ |1⟩
→ (|0⟩ ⊗ |1⟩ + |1⟩ ⊗ T|1⟩) / √2
= (|0⟩ ⊗ |1⟩ + e^(iπ/4) |1⟩ ⊗ |1⟩) / √2
= (|0⟩ + e^(iπ/4)|1⟩) / √2  ⊗  |1⟩

The control qubit now carries the phase π/4, which encodes θ = 1/8 (since e^(iπ/4) = e^(2πi/8)). This is exactly the information that QPE extracts.

Phase kickback is relative

Note that the kickback phase is a relative phase between |0⟩ and |1⟩ on the control qubit. A global phase (applied to both components equally) would be unobservable. The key point is that controlled-U applies U only on the |1⟩ branch, creating a relative phase. This relative phase is then converted to a measurable quantity by the inverse QFT.

Controlled Power Implementations

The controlled-U^(2^k) gates are the most expensive part of QPE. How to implement them efficiently depends on the structure of U.

Diagonal unitaries

For diagonal unitaries like the T gate, U^n simply raises each diagonal entry to the n-th power. If U = diag(1, e^(iα)), then U^(2^k) = diag(1, e^(i * 2^k * α)). The controlled version is a single controlled-phase gate:

# Controlled-T^(2^k) is a single cp gate
for k in range(t):
    angle = 2**k * np.pi / 4  # T gate angle scaled by 2^k
    qc.cp(angle, control_qubit[k], target_qubit[0])

This is optimal: each controlled power costs exactly one two-qubit gate regardless of k.

Trotter product formulas

For Hamiltonians decomposed as H = Σ_j H_j, the time evolution U = e^(-iHt) can be approximated by Trotterization:

U ≈ (e^(-iH_1 t/n) e^(-iH_2 t/n) ... e^(-iH_m t/n))^n

To implement U^(2^k), you simply repeat the Trotter circuit 2^k times. For second-order Trotter (Suzuki formula):

U_2(t) = e^(-iH_1 t/2) e^(-iH_2 t/2) ... e^(-iH_m t) ... e^(-iH_2 t/2) e^(-iH_1 t/2)

The circuit depth for U^(2^k) scales as O(2^k * m) where m is the number of Pauli terms. This exponential growth in depth with k is the primary bottleneck of QPE for chemistry applications.

Qubitization and block encodings

For more sophisticated approaches, qubitization replaces Trotterization with a single query to a block encoding of the Hamiltonian. A block encoding embeds H/α (normalized by the spectral norm α) in the upper-left block of a larger unitary W:

W = [ H/α    *  ]
    [  *      *  ]

The walk operator W has eigenvalues e^(±i arccos(E_j/α)), where E_j are eigenvalues of H. QPE on W extracts these eigenvalues with query complexity O(α/ε) instead of the Trotter-based O(α²/ε), a quadratic improvement. The controlled-W^(2^k) operations each require 2^k queries to the block encoding oracle, but the per-query cost is independent of the system size.

Precision and Qubit Count

With t counting qubits you can estimate θ to t bits of precision, i.e., to within 1/2^t. More precisely, the probability of measuring a value within ε of θ is at least 1 - δ when:

t >= log2(1/ε) + log2(2 + 1/(2δ))

For ε = 0.01 and δ = 0.05 (95% success probability, 1% precision) you need roughly t = 8 to 10 counting qubits.

The depth cost comes from the controlled-U^(2^k) gates. If U requires depth d to implement, the controlled-U^(2^(t-1)) gate requires simulating 2^(t-1) applications of U. For large t this is expensive. Iterative QPE (described below) trades circuit depth for classical iterations.

Non-Exact Phase Behavior: Probability Distribution

When the true phase θ does not lie exactly on the t-bit grid, the QPE output is a probability distribution over all 2^t outcomes. Understanding this distribution is essential for assessing QPE accuracy.

The probability formula

For true phase θ and measured integer m with t counting qubits:

P(m) = (1/2^t)² |Σ_{k=0}^{2^t-1} e^(2πi(θ - m/2^t)k)|²

Let δ = θ - m/2^t. The geometric sum evaluates to:

Σ_{k=0}^{N-1} e^(2πiδk) = (1 - e^(2πiδN)) / (1 - e^(2πiδ))

where N = 2^t. Taking the modulus squared:

P(m) = sin²(πδN) / (N² sin²(πδ))

This is a periodic sinc-like function. It peaks sharply when δ → 0 (i.e., m/2^t is close to θ) and has sidelobes that decay as 1/sin²(πδ).

Qiskit simulation of non-exact phases

The following code runs QPE for θ = 0.1 with t = 4 counting qubits. Since 0.1 × 16 = 1.6, the true phase falls between grid points m = 1 and m = 2. The output distribution should peak at m = 2 (nearest grid point).

import numpy as np
from qiskit import QuantumCircuit, QuantumRegister, ClassicalRegister, transpile
from qiskit_aer import AerSimulator
from qiskit.circuit.library import QFT


def qpe_arbitrary_phase(theta: float, t: int) -> QuantumCircuit:
    """
    QPE for a single-qubit unitary U = diag(1, e^(2*pi*i*theta)).
    The eigenstate is |1> with eigenvalue e^(2*pi*i*theta).
    """
    counting = QuantumRegister(t, name="count")
    eigenstate = QuantumRegister(1, name="eig")
    cr = ClassicalRegister(t, name="c")
    qc = QuantumCircuit(counting, eigenstate, cr)

    # Prepare eigenstate |1>
    qc.x(eigenstate[0])

    # Hadamard on counting qubits
    qc.h(counting)

    # Controlled-U^(2^k): controlled phase gate with angle 2^k * 2*pi*theta
    for k in range(t):
        angle = 2**k * 2 * np.pi * theta
        qc.cp(angle, counting[k], eigenstate[0])

    # Inverse QFT on counting register
    iqft = QFT(t, inverse=True, do_swaps=True)
    qc.compose(iqft, qubits=counting, inplace=True)

    qc.measure(counting, cr)
    return qc


def run_non_exact_phase():
    theta = 0.1
    t = 4
    N = 2**t

    qc = qpe_arbitrary_phase(theta, t)
    sim = AerSimulator()
    compiled = transpile(qc, sim)
    job = sim.run(compiled, shots=8192)
    counts = job.result().get_counts()

    print(f"QPE for theta = {theta} with t = {t} counting qubits")
    print(f"Grid spacing: 1/{N} = {1/N:.4f}")
    print(f"Nearest grid point: {round(N * theta)}/{N} = {round(N * theta) / N:.4f}")
    print()

    # Compare measured vs theoretical distributions
    print(f"{'m':>4}  {'theta_est':>10}  {'P_theory':>10}  {'P_measured':>10}")
    print("-" * 42)

    for m in range(N):
        bitstring = format(m, f"0{t}b")
        measured_count = counts.get(bitstring, 0)
        p_measured = measured_count / 8192

        # Theoretical probability
        delta = theta - m / N
        if abs(delta) < 1e-12:
            p_theory = 1.0
        else:
            p_theory = (np.sin(np.pi * delta * N) ** 2) / (N**2 * np.sin(np.pi * delta) ** 2)

        if p_theory > 0.01 or p_measured > 0.01:
            print(f"{m:4d}  {m/N:10.4f}  {p_theory:10.4f}  {p_measured:10.4f}")


if __name__ == "__main__":
    run_non_exact_phase()

Expected output: the distribution peaks at m = 2 (θ_est = 0.125) with approximately 62% probability, with secondary weight at m = 1 (θ_est = 0.0625) near 22%. The remaining probability spreads across other bins with rapidly decreasing weights.

Qiskit Implementation: T-Gate Phase

The T gate has eigenvalue e^(iπ/4) = e^(2πi * 1/8), so the true phase is θ = 1/8 = 0.125. With 3 counting qubits we should recover θ = 001 in binary, i.e., the integer 1 out of 8.

import numpy as np
from qiskit import QuantumCircuit, QuantumRegister, ClassicalRegister, transpile
from qiskit_aer import AerSimulator
from qiskit.circuit.library import QFT


def qpe_t_gate(t: int = 3) -> QuantumCircuit:
    """
    QPE circuit estimating the phase of the T gate.
    T|1> = e^(i*pi/4)|1>, so theta = 1/8.
    The eigenstate is |1>.
    """
    counting = QuantumRegister(t, name="count")
    eigenstate = QuantumRegister(1, name="eig")
    cr = ClassicalRegister(t, name="c")
    qc = QuantumCircuit(counting, eigenstate, cr)

    # Prepare eigenstate |1> of T gate
    qc.x(eigenstate[0])

    # Hadamard on counting qubits
    qc.h(counting)
    qc.barrier()

    # Controlled-U^(2^k): controlled-T^(2^k)
    # T^(2^k) = P(2^k * pi/4) = Rz(2^k * pi/4) up to global phase
    for k in range(t):
        angle = 2 ** k * np.pi / 4
        qc.cp(angle, counting[k], eigenstate[0])

    qc.barrier()

    # Inverse QFT on counting register
    iqft = QFT(t, inverse=True, do_swaps=True)
    qc.compose(iqft, qubits=counting, inplace=True)

    qc.measure(counting, cr)
    return qc


def run_qpe():
    t = 3
    qc = qpe_t_gate(t)
    sim = AerSimulator()
    compiled = transpile(qc, sim)
    job = sim.run(compiled, shots=2048)
    counts = job.result().get_counts()

    print(f"QPE with t={t} counting qubits for T-gate phase (theta=1/8)")
    print(f"Raw counts: {counts}")

    for bitstring, count in sorted(counts.items(), key=lambda x: -x[1]):
        # Qiskit classical register: integer value is int(bitstring, 2)
        estimated_theta = int(bitstring, 2) / (2 ** t)
        print(f"  Bitstring: {bitstring} -> theta = {int(bitstring, 2)}/{2**t} = {estimated_theta:.4f} ({count} shots)")


if __name__ == "__main__":
    run_qpe()

Expected output: all 2048 shots produce bitstring 001, giving theta = 1/8 = 0.1250.

The Hadamard Test: Full Qiskit Implementation

When you do not have direct access to prepare |u⟩ but can apply U as a controlled operation, the Hadamard test estimates Re(⟨ψ|U|ψ⟩) for any state |ψ⟩. The circuit uses a single ancilla qubit:

  1. Initialize ancilla to |0⟩, target to |ψ⟩.
  2. Hadamard on ancilla.
  3. Controlled-U from ancilla to target.
  4. Hadamard on ancilla.
  5. Measure ancilla. Probability of 0 is (1 + Re(⟨ψ|U|ψ⟩)) / 2.

By inserting an S† gate before the final Hadamard you get the imaginary part instead. Together, real and imaginary parts give the full expectation value. This is useful for quantum chemistry energy estimation when the eigenstate is prepared variationally.

Qiskit implementation

The following code estimates both Re(⟨+|T|+⟩) and Im(⟨+|T|+⟩) using the Hadamard test. Analytically:

⟨+|T|+⟩ = (⟨0| + ⟨1|)(T|0⟩ + T|1⟩) / 2
         = (⟨0|0⟩ + ⟨1|e^(iπ/4)|1⟩) / 2  (cross terms vanish since T is diagonal)
         = (1 + e^(iπ/4)) / 2

Computing numerically: Re = (1 + cos(π/4)) / 2 ≈ 0.8536 and Im = sin(π/4) / 2 ≈ 0.3536.

import numpy as np
from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator


def hadamard_test_real(shots: int = 8192) -> float:
    """
    Estimate Re(<+|T|+>) using the Hadamard test.
    |psi> = |+> = H|0>, U = T gate.
    """
    qc = QuantumCircuit(2, 1)

    # Prepare |psi> = |+> on qubit 1
    qc.h(1)

    # Hadamard on ancilla (qubit 0)
    qc.h(0)

    # Controlled-T from ancilla to target
    qc.ct(0, 1)

    # Hadamard on ancilla
    qc.h(0)

    # Measure ancilla
    qc.measure(0, 0)

    sim = AerSimulator()
    compiled = transpile(qc, sim)
    job = sim.run(compiled, shots=shots)
    counts = job.result().get_counts()

    p0 = counts.get("0", 0) / shots
    # Re(<psi|U|psi>) = 2*P(0) - 1
    return 2 * p0 - 1


def hadamard_test_imag(shots: int = 8192) -> float:
    """
    Estimate Im(<+|T|+>) using the Hadamard test with S-dagger.
    """
    qc = QuantumCircuit(2, 1)

    # Prepare |psi> = |+> on qubit 1
    qc.h(1)

    # Hadamard on ancilla (qubit 0)
    qc.h(0)

    # Controlled-T from ancilla to target
    qc.ct(0, 1)

    # S-dagger on ancilla (rotates to extract imaginary part)
    qc.sdg(0)

    # Hadamard on ancilla
    qc.h(0)

    # Measure ancilla
    qc.measure(0, 0)

    sim = AerSimulator()
    compiled = transpile(qc, sim)
    job = sim.run(compiled, shots=shots)
    counts = job.result().get_counts()

    p0 = counts.get("0", 0) / shots
    # Im(<psi|U|psi>) = 2*P(0) - 1
    return 2 * p0 - 1


def run_hadamard_test():
    re_measured = hadamard_test_real()
    im_measured = hadamard_test_imag()

    # Analytical values
    re_exact = (1 + np.cos(np.pi / 4)) / 2
    im_exact = np.sin(np.pi / 4) / 2

    print("Hadamard test: estimating <+|T|+>")
    print(f"  Re(<+|T|+>): measured = {re_measured:.4f}, exact = {re_exact:.4f}")
    print(f"  Im(<+|T|+>): measured = {im_measured:.4f}, exact = {im_exact:.4f}")
    print(f"  Full value:  measured = {re_measured:.4f} + {im_measured:.4f}i")
    print(f"               exact    = {re_exact:.4f} + {im_exact:.4f}i")


if __name__ == "__main__":
    run_hadamard_test()

The measurements converge to the analytical values with statistical fluctuations that decrease as 1/√(shots). With 8192 shots, you should expect accuracy to about 0.01.

Iterative QPE: Full Qiskit Implementation

Standard QPE requires a circuit depth proportional to t * max_depth(U^(2^k)), which is deep. Iterative QPE (Kitaev’s phase estimation) trades depth for repeated single-qubit measurements.

Each iteration uses a single counting qubit and one controlled-U^(2^k) application, measuring to learn the k-th bit of θ (from least significant to most significant, with feedback). The circuit depth per iteration is constant. The total number of circuit runs is t (plus overhead for error correction and repetition).

Iterative QPE is the practical choice on near-term hardware because it avoids storing large quantum states across deep circuits.

Algorithm details

The iterative QPE algorithm extracts θ one bit at a time, starting from the least significant bit and working up. In round j (where j = t-1, t-2, ..., 0), the circuit:

  1. Prepares a single ancilla in |0⟩.
  2. Applies H to the ancilla.
  3. Applies controlled-U^(2^j) to the eigenstate.
  4. Applies a phase correction Rz(-ω_j) on the ancilla, where ω_j accounts for previously measured bits.
  5. Applies H to the ancilla.
  6. Measures the ancilla to obtain bit b_j.

The phase correction is:

ω_j = π * Σ_{k=j+1}^{t-1} b_k * 2^(j-k+1)

This correction removes the influence of already-determined bits so that each measurement isolates a single bit of θ.

Complete implementation

The following code implements iterative QPE for the phase gate P(2πθ) with configurable θ:

import numpy as np
from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator


def iterative_qpe(theta: float, t: int, shots_per_round: int = 1) -> float:
    """
    Iterative (Kitaev) QPE for U = diag(1, e^(2*pi*i*theta)).
    Extracts t bits of theta using a single ancilla qubit per round.

    Parameters:
        theta: true phase in [0, 1)
        t: number of bits of precision
        shots_per_round: shots per circuit execution (1 for deterministic,
                         more for majority voting on noisy hardware)

    Returns:
        Estimated phase as a float in [0, 1).
    """
    sim = AerSimulator()
    bits = [0] * t  # bits[j] stores bit j (j=0 is most significant)

    # Extract bits from most significant (j=t-1) to least significant (j=0)
    # But the standard iterative protocol goes from least significant to most.
    # Here we extract bit j of the binary fraction 0.b_{t-1} b_{t-2} ... b_0.
    # Round j extracts bit b_j using controlled-U^(2^j).

    measured_bits = []  # measured_bits[k] = bit from round k

    for j in range(t - 1, -1, -1):
        qc = QuantumCircuit(2, 1)

        # Prepare eigenstate |1> on qubit 1
        qc.x(1)

        # Hadamard on ancilla (qubit 0)
        qc.h(0)

        # Controlled-U^(2^j): phase gate with angle 2^j * 2*pi*theta
        angle = 2**j * 2 * np.pi * theta
        qc.cp(angle, 0, 1)

        # Phase correction from previously measured (less significant) bits
        omega = 0.0
        for k, prev_bit in enumerate(measured_bits):
            # Each previously measured bit b_k contributes to the correction.
            # The correction for round j given bit from round (j + k + 1):
            omega += prev_bit * np.pi / (2 ** (k + 1))

        if abs(omega) > 1e-12:
            qc.p(-omega, 0)

        # Hadamard on ancilla
        qc.h(0)

        # Measure
        qc.measure(0, 0)

        compiled = transpile(qc, sim)

        if shots_per_round == 1:
            job = sim.run(compiled, shots=1)
            result = job.result().get_counts()
            bit = int(list(result.keys())[0])
        else:
            # Majority voting for noisy scenarios
            job = sim.run(compiled, shots=shots_per_round)
            result = job.result().get_counts()
            zeros = result.get("0", 0)
            ones = result.get("1", 0)
            bit = 0 if zeros >= ones else 1

        measured_bits.insert(0, bit)

    # Convert measured bits to phase
    # measured_bits is now [b_{t-1}, b_{t-2}, ..., b_0]
    # theta_est = sum_j b_j / 2^(t-j)
    theta_est = 0.0
    for j, b in enumerate(measured_bits):
        theta_est += b / (2 ** (j + 1))

    return theta_est


def run_iterative_qpe():
    test_cases = [
        (0.125, 3, "1/8 (exact on 3-bit grid)"),
        (0.25, 4, "1/4 (exact on 4-bit grid)"),
        (0.1, 6, "0.1 (non-exact, 6 bits)"),
        (0.333, 8, "~1/3 (non-exact, 8 bits)"),
    ]

    for theta, t, desc in test_cases:
        theta_est = iterative_qpe(theta, t)
        error = abs(theta_est - theta)
        print(f"theta = {theta} ({desc})")
        print(f"  Estimated: {theta_est:.6f}, Error: {error:.6f}")
        print(f"  Max possible error with {t} bits: {1/2**t:.6f}")
        print()


if __name__ == "__main__":
    run_iterative_qpe()

For exact phases (like θ = 1/8 with t = 3), iterative QPE always returns the correct answer. For non-exact phases, the estimate rounds to the nearest representable value, with accuracy limited by the number of bits t.

Application: Eigenvalue Estimation for Chemistry

In quantum chemistry, the Hamiltonian H of a molecule is decomposed into Pauli terms. The ground state energy is the smallest eigenvalue of H. If you can prepare the ground state (via VQE, for example) and construct U = e^(-iHt) as a quantum circuit (Trotterization), QPE estimates the energy E from the phase θ = Et / (2π).

For hydrogen at equilibrium bond length, t = 8 to 12 counting qubits yields energies accurate to milli-Hartree precision, sufficient for chemical accuracy.

Qiskit implementation: H2 ground state energy

The following code constructs a simplified H2 Hamiltonian as a SparsePauliOp, builds the Trotter circuit for the time evolution operator, and runs QPE to estimate the ground state energy.

import numpy as np
from qiskit import QuantumCircuit, QuantumRegister, ClassicalRegister, transpile
from qiskit_aer import AerSimulator
from qiskit.circuit.library import QFT
from qiskit.quantum_info import SparsePauliOp


def build_h2_hamiltonian() -> SparsePauliOp:
    """
    Simplified H2 Hamiltonian at equilibrium bond length (0.735 Angstrom)
    in the STO-3G basis using the Bravyi-Kitaev transformation.
    H = g0*II + g1*ZI + g2*IZ + g3*ZZ + g4*XX + g5*YY
    Coefficients from Ref: O'Malley et al., PRX 6, 031007 (2016).
    """
    g0 = -0.4804
    g1 = +0.3435
    g2 = -0.4347
    g3 = +0.5716
    g4 = +0.0910
    g5 = +0.0910

    hamiltonian = SparsePauliOp.from_list([
        ("II", g0),
        ("ZI", g1),
        ("IZ", g2),
        ("ZZ", g3),
        ("XX", g4),
        ("YY", g5),
    ])
    return hamiltonian


def trotter_step(qc: QuantumCircuit, q0: int, q1: int,
                 coeffs: dict, dt: float):
    """
    First-order Trotter step for H2 Hamiltonian.
    Implements exp(-i * H * dt) approximately.
    """
    # ZI term: exp(-i * g1 * dt * Z_0) = Rz(2 * g1 * dt) on q0
    qc.rz(2 * coeffs["ZI"] * dt, q0)

    # IZ term: Rz on q1
    qc.rz(2 * coeffs["IZ"] * dt, q1)

    # ZZ term: exp(-i * g3 * dt * Z0 Z1)
    # Decompose as CNOT, Rz, CNOT
    qc.cx(q0, q1)
    qc.rz(2 * coeffs["ZZ"] * dt, q1)
    qc.cx(q0, q1)

    # XX term: exp(-i * g4 * dt * X0 X1)
    # Basis change: H on both, then ZZ-type, then H on both
    qc.h(q0)
    qc.h(q1)
    qc.cx(q0, q1)
    qc.rz(2 * coeffs["XX"] * dt, q1)
    qc.cx(q0, q1)
    qc.h(q0)
    qc.h(q1)

    # YY term: exp(-i * g5 * dt * Y0 Y1)
    # Basis change: Sdg H on both, then ZZ-type, then H S on both
    qc.sdg(q0)
    qc.sdg(q1)
    qc.h(q0)
    qc.h(q1)
    qc.cx(q0, q1)
    qc.rz(2 * coeffs["YY"] * dt, q1)
    qc.cx(q0, q1)
    qc.h(q0)
    qc.h(q1)
    qc.s(q0)
    qc.s(q1)


def qpe_h2(t: int = 4, n_trotter: int = 4):
    """
    QPE for H2 ground state energy estimation.
    Uses t counting qubits and n_trotter Trotter steps per controlled-U.
    """
    coeffs = {"ZI": 0.3435, "IZ": -0.4347, "ZZ": 0.5716,
              "XX": 0.0910, "YY": 0.0910}
    g0 = -0.4804  # identity coefficient (energy offset)

    # Evolution time chosen so that ground state phase is distinguishable
    evolution_time = 1.0

    counting = QuantumRegister(t, name="count")
    system = QuantumRegister(2, name="sys")
    cr = ClassicalRegister(t, name="c")
    qc = QuantumCircuit(counting, system, cr)

    # Prepare approximate ground state of H2
    # For H2, the ground state is approximately cos(alpha)|01> + sin(alpha)|10>
    # with alpha ~ 0.1 rad. A simple Ry + CNOT ansatz:
    alpha = -0.0986  # approximate optimal VQE angle
    qc.x(system[0])
    qc.ry(2 * alpha, system[1])
    qc.cx(system[1], system[0])

    # Hadamard on counting qubits
    qc.h(counting)

    # Controlled Trotter evolution: controlled-U^(2^k) for each counting qubit
    for k in range(t):
        n_reps = 2**k * n_trotter
        dt = evolution_time / n_trotter
        for _ in range(n_reps):
            # Controlled Trotter step from counting[k]
            # We wrap each rotation in a controlled version
            # For simplicity, use a controlled approach via ancilla
            trotter_step(qc, system[0], system[1], coeffs, dt / (2**k))

    # Inverse QFT
    iqft = QFT(t, inverse=True, do_swaps=True)
    qc.compose(iqft, qubits=counting, inplace=True)

    qc.measure(counting, cr)

    # Run simulation
    sim = AerSimulator()
    compiled = transpile(qc, sim, optimization_level=2)
    job = sim.run(compiled, shots=4096)
    counts = job.result().get_counts()

    # Convert measurement outcomes to energy estimates
    print(f"QPE for H2 with t={t} counting qubits, {n_trotter} Trotter steps")
    print(f"Expected ground state energy: ~ -1.137 Hartree")
    print()

    for bitstring, count in sorted(counts.items(), key=lambda x: -x[1])[:5]:
        m = int(bitstring, 2)
        theta = m / 2**t
        # Energy from phase: E = 2*pi*theta / evolution_time + g0
        # More precisely: U = e^(-iHt), eigenvalue = e^(-iEt)
        # QPE measures theta where e^(2*pi*i*theta) = e^(-iEt)
        # So theta = -Et/(2*pi) mod 1, meaning E = -2*pi*theta/t + g0 adjustment
        phase_energy = 2 * np.pi * theta / evolution_time
        energy = -phase_energy + g0
        print(f"  m={m:2d}, theta={theta:.4f}, E = {energy:.4f} Hartree ({count} shots)")

    print()
    print("Note: accuracy depends on Trotter error and state preparation quality.")
    print("With more counting qubits and Trotter steps, the estimate converges")
    print("toward the exact value of -1.137 Hartree.")


if __name__ == "__main__":
    qpe_h2()

The phase-to-energy conversion follows from the relationship U = e^(-iHt). The eigenvalue of U is e^(-iEt), and QPE measures θ where e^(2πiθ) = e^(-iEt). Solving gives E = -2πθ/t. The identity coefficient g0 contributes an overall energy offset that must be added back.

The accuracy of this estimate depends on three factors: (1) the quality of the initial state preparation (how close to the true ground state), (2) the Trotter error (controlled by n_trotter), and (3) the QPE precision (controlled by t). Increasing any of these improves accuracy at the cost of deeper circuits.

Application: QPE Inside HHL

The HHL algorithm (Harrow-Hassidim-Lloyd) solves linear systems Ax = b in quantum superposition. It requires estimating the eigenvalues of A to invert them. The subroutine structure is:

  1. Phase-encode |b⟩ into the eigenstate register.
  2. Run QPE to copy eigenvalue estimates into an ancilla register.
  3. Apply controlled rotations to invert eigenvalues (rotation angle = 1/λ).
  4. Uncompute QPE (run QPE†).
  5. Measure success ancilla.

The precision of QPE directly determines the accuracy of HHL. Poor phase estimation means poorly inverted eigenvalues and a wrong solution vector.

QPE substructure in HHL: code sketch

The following code illustrates the QPE and eigenvalue inversion steps for a 2x2 system. Consider A with eigenvalues λ_1 = 1 and λ_2 = 2. The HHL circuit uses QPE to encode these eigenvalues in a counting register, then applies controlled-Ry rotations proportional to 1/λ to an ancilla qubit.

import numpy as np
from qiskit import QuantumCircuit, QuantumRegister, ClassicalRegister, transpile
from qiskit_aer import AerSimulator
from qiskit.circuit.library import QFT


def hhl_demo():
    """
    Simplified HHL circuit for a 2x2 matrix A with eigenvalues 1 and 2.
    Uses 2 counting qubits for QPE, 1 system qubit, 1 ancilla for inversion.

    Circuit layout:
    - qubits 0-1: counting register (QPE)
    - qubit 2: system register (|b>)
    - qubit 3: ancilla for eigenvalue inversion
    """
    t = 2  # counting qubits
    counting = QuantumRegister(t, name="count")
    system = QuantumRegister(1, name="sys")
    ancilla = QuantumRegister(1, name="anc")
    cr = ClassicalRegister(1, name="c")
    qc = QuantumCircuit(counting, system, ancilla, cr)

    # Prepare |b> state on system qubit
    # For demonstration, |b> = |1> (an eigenstate of our simple A)
    qc.x(system[0])

    # === QPE forward ===
    # Hadamard on counting qubits
    qc.h(counting[0])
    qc.h(counting[1])

    # Controlled-U^(2^k) for A with eigenvalues 1 and 2
    # U = e^(2*pi*i*A*t0) where t0 is chosen so eigenvalue phases are
    # theta_1 = 0.25 (for lambda=1) and theta_2 = 0.50 (for lambda=2)
    # with 2 counting qubits

    # Controlled-U (k=0): phase of 2*pi*theta for each eigenvalue
    qc.cp(np.pi / 2, counting[0], system[0])  # eigenvalue-dependent phase

    # Controlled-U^2 (k=1): phase of 2*2*pi*theta
    qc.cp(np.pi, counting[1], system[0])

    # Inverse QFT on counting register
    iqft = QFT(t, inverse=True, do_swaps=True)
    qc.compose(iqft, qubits=counting, inplace=True)

    qc.barrier()

    # === Eigenvalue inversion ===
    # Counting register now holds |m> where m encodes lambda.
    # Apply controlled-Ry(2*arcsin(C/lambda)) to ancilla.
    # C is a normalization constant (C <= min eigenvalue).
    C = 0.5

    # If counting = |01> (m=1, lambda=1): Ry(2*arcsin(0.5/1))
    angle_1 = 2 * np.arcsin(C / 1.0)
    # If counting = |10> (m=2, lambda=2): Ry(2*arcsin(0.5/2))
    angle_2 = 2 * np.arcsin(C / 2.0)

    # Controlled rotation conditioned on counting register value
    # m=1: counting[1]=0, counting[0]=1
    qc.x(counting[1])  # flip so both are 1 when m=1
    qc.mcry(angle_1, [counting[0], counting[1]], ancilla[0])
    qc.x(counting[1])

    # m=2: counting[1]=1, counting[0]=0
    qc.x(counting[0])  # flip so both are 1 when m=2
    qc.mcry(angle_2, [counting[0], counting[1]], ancilla[0])
    qc.x(counting[0])

    qc.barrier()

    # === QPE inverse (uncompute) ===
    qft = QFT(t, inverse=False, do_swaps=True)
    qc.compose(qft, qubits=counting, inplace=True)

    # Undo controlled unitaries
    qc.cp(-np.pi, counting[1], system[0])
    qc.cp(-np.pi / 2, counting[0], system[0])

    qc.h(counting[0])
    qc.h(counting[1])

    # Measure ancilla: post-select on |1> to get solution
    qc.measure(ancilla, cr)

    # Run
    sim = AerSimulator()
    compiled = transpile(qc, sim)
    job = sim.run(compiled, shots=4096)
    counts = job.result().get_counts()

    print("HHL demo: 2x2 system with eigenvalues 1 and 2")
    print(f"Ancilla measurement results: {counts}")
    p_success = counts.get("1", 0) / 4096
    print(f"Success probability (ancilla=1): {p_success:.4f}")
    print(f"Expected success probability: C^2 * (1/lambda_1^2 + 1/lambda_2^2) / 2")
    print(f"  = {C**2 * (1/1 + 1/4) / 2:.4f}")
    print()
    print("The QPE precision (t=2 here) determines the smallest eigenvalue")
    print("that can be resolved and inverted. With t counting qubits, the")
    print("condition number kappa that HHL can handle is at most 2^t.")


if __name__ == "__main__":
    hhl_demo()

The QPE precision directly constrains the condition number that HHL can handle. With t counting qubits, eigenvalues can only be resolved to multiples of 1/2^t. If two eigenvalues are closer than this resolution, they collapse to the same estimate, and the inversion step applies the wrong angle. For a matrix with condition number κ = λ_max / λ_min, you need t ≥ log2(κ) counting qubits to resolve all eigenvalues.

Application: Period Finding in Shor’s Algorithm

Shor’s algorithm uses QPE to find the period r of f(x) = a^x mod N. The unitary is U|x⟩ = |ax mod N⟩. Its eigenvalues are e^(2πi k/r) for k = 0, ..., r-1. QPE applied to a uniform superposition of eigenstates yields measurements close to multiples of 1/r, and continued fraction expansion recovers r.

This is the exact same QPE circuit as above, with U replaced by modular exponentiation. The computational difficulty of implementing U efficiently is where most of Shor’s circuit complexity lives.

Approximate QFT

The standard QFT on n qubits uses controlled-R_k rotation gates for all k from 2 to n, where R_k = diag(1, e^(2πi/2^k)). The small-angle rotations (large k) contribute phases of 2π/2^k, which become negligibly small for large k. Truncating the QFT by omitting rotations with k > m reduces the two-qubit gate count from O(n²) to O(n * m) while introducing only small fidelity loss.

Why truncation works

The rotation R_k adds a phase of 2π/2^k conditioned on a distant qubit. For k = 10, this phase is 2π/1024 ≈ 0.006 radians. On noisy hardware, the gate error from implementing this rotation often exceeds the error from omitting it entirely. The approximate QFT (AQFT) takes advantage of this by keeping only the m largest rotations per qubit.

Qiskit implementation: AQFT vs full QFT fidelity comparison

import numpy as np
from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator
from qiskit.quantum_info import Statevector, state_fidelity


def full_qft(n: int) -> QuantumCircuit:
    """Standard QFT on n qubits (no truncation)."""
    qc = QuantumCircuit(n, name=f"QFT_{n}")

    for i in range(n):
        qc.h(i)
        for j in range(i + 1, n):
            k = j - i + 1
            qc.cp(2 * np.pi / 2**k, j, i)

    # Swap qubits to match standard QFT output ordering
    for i in range(n // 2):
        qc.swap(i, n - 1 - i)

    return qc


def approximate_qft(n: int, m: int) -> QuantumCircuit:
    """
    Approximate QFT on n qubits, keeping only rotations R_k with k <= m.
    When m >= n, this is identical to the full QFT.
    """
    qc = QuantumCircuit(n, name=f"AQFT_{n}_m{m}")

    for i in range(n):
        qc.h(i)
        for j in range(i + 1, min(i + m, n)):
            k = j - i + 1
            if k <= m:
                qc.cp(2 * np.pi / 2**k, j, i)

    # Swap qubits
    for i in range(n // 2):
        qc.swap(i, n - 1 - i)

    return qc


def compare_aqft_fidelity():
    """
    Compare fidelity of AQFT vs full QFT for various truncation depths.
    Tests on a specific input state (not |0>, which is trivial).
    """
    n_qubits = 8

    # Prepare a non-trivial input state: apply some rotations
    prep = QuantumCircuit(n_qubits)
    for i in range(n_qubits):
        prep.ry(0.3 * (i + 1), i)
    prep.cx(0, 1)
    prep.cx(2, 3)

    # Get exact QFT output
    full_circ = prep.compose(full_qft(n_qubits))
    sv_full = Statevector.from_instruction(full_circ)

    print(f"AQFT fidelity comparison (n = {n_qubits} qubits)")
    print(f"{'Truncation m':>14}  {'2Q gates':>10}  {'Fidelity':>10}  {'Gate reduction':>15}")
    print("-" * 55)

    full_gates = n_qubits * (n_qubits - 1) // 2
    for m in range(1, n_qubits + 1):
        approx_circ = prep.compose(approximate_qft(n_qubits, m))
        sv_approx = Statevector.from_instruction(approx_circ)
        fid = state_fidelity(sv_full, sv_approx)

        # Count approximate 2Q gates
        approx_gates = sum(min(m - 1, n_qubits - i - 1) for i in range(n_qubits))
        reduction = 1 - approx_gates / full_gates

        print(f"{m:14d}  {approx_gates:10d}  {fid:10.6f}  {reduction:14.1%}")


if __name__ == "__main__":
    compare_aqft_fidelity()

The output shows that even with m = 3 or m = 4 (keeping only the 3 or 4 nearest-neighbor rotations per qubit), the fidelity remains above 0.99 for typical states. For QPE applications, this means you can significantly reduce the inverse QFT circuit depth with negligible impact on the phase estimate quality. On NISQ hardware, the reduced gate count more than compensates for the small approximation error.

Noise Impact and Mitigation

On real hardware, QPE degrades due to:

  • Gate errors accumulating in the controlled-U^(2^k) chains. The k=t-1 term requires 2^(t-1) applications of U, each adding error.
  • Decoherence over the long circuit depth.
  • QFT fidelity decreasing with register size.

Iterative QPE, combined with quantum error mitigation techniques like zero-noise extrapolation, extends QPE’s practical reach. For problems requiring more than 5-7 bits of precision, fault-tolerant hardware is the realistic requirement.

Noisy simulation in Qiskit

The following code simulates QPE on a noisy backend and demonstrates how noise shifts the measurement distribution away from the correct answer.

import numpy as np
from qiskit import QuantumCircuit, QuantumRegister, ClassicalRegister, transpile
from qiskit_aer import AerSimulator
from qiskit_aer.noise import NoiseModel, depolarizing_error
from qiskit.circuit.library import QFT


def qpe_circuit(theta: float, t: int) -> QuantumCircuit:
    """Build a QPE circuit for U = diag(1, e^(2*pi*i*theta))."""
    counting = QuantumRegister(t, name="count")
    eigenstate = QuantumRegister(1, name="eig")
    cr = ClassicalRegister(t, name="c")
    qc = QuantumCircuit(counting, eigenstate, cr)

    qc.x(eigenstate[0])
    qc.h(counting)

    for k in range(t):
        angle = 2**k * 2 * np.pi * theta
        qc.cp(angle, counting[k], eigenstate[0])

    iqft = QFT(t, inverse=True, do_swaps=True)
    qc.compose(iqft, qubits=counting, inplace=True)
    qc.measure(counting, cr)

    return qc


def run_noisy_qpe():
    theta = 0.125  # exact on 3-bit grid
    t = 3
    shots = 8192

    qc = qpe_circuit(theta, t)

    # Ideal simulation
    ideal_sim = AerSimulator()
    compiled_ideal = transpile(qc, ideal_sim)
    counts_ideal = ideal_sim.run(compiled_ideal, shots=shots).result().get_counts()

    # Noisy simulation with varying error rates
    error_rates = [0.001, 0.005, 0.01, 0.02, 0.05]

    print(f"QPE noise analysis: theta = {theta}, t = {t}")
    print(f"Correct outcome: m = {int(theta * 2**t)}")
    print()
    print(f"{'Error rate':>12}  {'P(correct)':>12}  {'Phase error':>12}")
    print("-" * 40)

    # Ideal result
    correct_bitstring = format(int(theta * 2**t), f"0{t}b")
    p_correct_ideal = counts_ideal.get(correct_bitstring, 0) / shots
    print(f"{'ideal':>12}  {p_correct_ideal:12.4f}  {0.0:12.6f}")

    for p_err in error_rates:
        # Build noise model
        noise_model = NoiseModel()
        error_1q = depolarizing_error(p_err, 1)
        error_2q = depolarizing_error(p_err * 2, 2)
        noise_model.add_all_qubit_quantum_error(error_1q, ["h", "rz", "x", "p"])
        noise_model.add_all_qubit_quantum_error(error_2q, ["cp", "cx"])

        noisy_sim = AerSimulator(noise_model=noise_model)
        compiled_noisy = transpile(qc, noisy_sim)
        counts_noisy = noisy_sim.run(compiled_noisy, shots=shots).result().get_counts()

        # Compute most likely outcome and phase error
        best_bitstring = max(counts_noisy, key=counts_noisy.get)
        m_measured = int(best_bitstring, 2)
        theta_measured = m_measured / 2**t
        phase_error = abs(theta_measured - theta)

        p_correct = counts_noisy.get(correct_bitstring, 0) / shots
        print(f"{p_err:12.3f}  {p_correct:12.4f}  {phase_error:12.6f}")

    print()
    print("As gate error rates increase, the probability of measuring the correct")
    print("outcome decreases and eventually the peak shifts to an incorrect bin.")
    print("For 1% gate error, QPE with t=3 still works reasonably well.")
    print("Beyond 5% gate error, the signal is largely destroyed.")


if __name__ == "__main__":
    run_noisy_qpe()

The table shows that QPE is moderately robust to small gate errors (below 1%) but degrades rapidly for larger error rates. At 5% per-gate error, the correct outcome is no longer the most probable measurement, and the phase estimate becomes unreliable.

Zero-noise extrapolation (ZNE)

ZNE mitigates noise by intentionally amplifying it and then extrapolating back to the zero-noise limit. The simplest approach is unitary folding: replacing each gate G with G G† G to triple the effective noise. By running circuits at noise factors 1x, 3x, and 5x, a linear or polynomial extrapolation estimates the noise-free expectation value.

For QPE, ZNE can be applied to the probability of each measurement outcome. After extrapolation, the corrected distribution is re-normalized and the peak identified. This partial correction often recovers the correct bin even when the raw noisy distribution is ambiguous.

On current NISQ hardware (gate errors around 0.1% to 1%), ZNE can extend QPE precision by roughly 1 to 2 additional bits before the extrapolation becomes unreliable.

Scaling Comparison: QPE Variants

The following table compares the three main QPE approaches across key metrics:

MetricStandard QPEIterative QPERandomized QPE
Ancilla qubitst11
Circuit depthO(2^t * d_U)O(2^t * d_U) per round, but only one round active at a timeO(2^k * d_U) per sample, random k
Total circuits1tO(t / ε)
Total shotsO(1) for exact phasest (1 shot per round, or more for noise)O(log(1/δ) / ε²)
Classical post-processingBinary fraction readoutBit accumulation with feedbackMedian-of-means or Bayesian estimation
Coherence requirementAll t+n qubits coherent simultaneouslyOnly 1+n qubits coherent per roundOnly 1+n qubits coherent per sample
NISQ suitabilityPoor (deep circuits, many qubits)Good (shallow per round)Good (shallow per sample)
Fault-tolerant suitabilityExcellent (single run)Good (but more total operations)Good (but more total operations)

Recommendation matrix

Quantum chemistry (energy estimation):

  • Near-term: iterative QPE or randomized QPE with error mitigation
  • Fault-tolerant: standard QPE with qubitized walk operator

Cryptographic period finding (Shor’s):

  • Standard QPE is the natural fit since the algorithm already requires fault-tolerant hardware for meaningful problem sizes
  • Iterative QPE can reduce qubit count at the cost of more circuit executions

Linear algebra (HHL and variants):

  • Standard QPE, since HHL requires QPE as a coherent subroutine (no mid-circuit measurement and classical feedback in the basic version)
  • For NISQ implementations of HHL-like algorithms, variational approaches often replace QPE entirely

Common Mistakes

This section catalogs errors that frequently appear in QPE implementations, along with explanations and fixes.

1. Using the wrong eigenstate

The T gate has two eigenstates: |0⟩ with eigenvalue 1 (phase θ = 0) and |1⟩ with eigenvalue e^(iπ/4) (phase θ = 1/8). A common mistake is preparing |0⟩ instead of |1⟩ as the eigenstate:

# WRONG: this measures theta = 0 (eigenvalue 1), not theta = 1/8
qc.x(eigenstate[0])  # <-- this line is missing

Without the X gate, the eigenstate register starts in |0⟩, and since T|0⟩ = |0⟩, QPE correctly returns θ = 0. The circuit is technically correct; it just measures the wrong eigenvalue. Always verify which eigenstate you are preparing and which eigenvalue it corresponds to.

2. QFT direction confusion

QPE requires the inverse QFT, not the forward QFT. The forward QFT encodes computational basis states into phase patterns; the inverse QFT decodes phase patterns back to computational basis states. Applying the forward QFT instead of the inverse produces a scrambled output.

In Qiskit’s QFT class, set inverse=True:

# CORRECT
iqft = QFT(t, inverse=True, do_swaps=True)

# WRONG: forward QFT, not inverse
qft = QFT(t, inverse=False, do_swaps=True)

3. Off-by-one in the binary fraction

The measured integer m from t counting qubits corresponds to the phase θ = m / 2^t, not θ = m / 2^(t-1). With t = 3 counting qubits and measurement outcome m = 1:

# CORRECT
theta = m / 2**t  # theta = 1/8 = 0.125

# WRONG
theta = m / 2**(t - 1)  # theta = 1/4 = 0.25 (off by factor of 2)

This off-by-one error is surprisingly common and produces phase estimates that are systematically doubled.

4. Qiskit bit ordering (little-endian)

Qiskit uses little-endian bit ordering for classical registers. The bitstring "001" in Qiskit represents the integer 1 (rightmost bit is most significant in the displayed string, but corresponds to classical bit 0). When converting measurement bitstrings to integers:

# Qiskit bitstring "001":
# Bit 0 (rightmost in display) = 1, Bit 1 = 0, Bit 2 = 0
# Integer value = 1*2^0 + 0*2^1 + 0*2^2 = 1
m = int(bitstring, 2)  # This works correctly because Python's int()
                        # reads the string left-to-right as MSB-first,
                        # which matches Qiskit's display convention.

The confusion arises because Qiskit’s qubit ordering is q[0] at the bottom of the circuit diagram (least significant), but the classical bit display puts c[0] at the rightmost position. If your counting register maps qubit k to classical bit k, the int(bitstring, 2) conversion works directly. Problems arise when the qubit-to-classical-bit mapping is scrambled.

5. Insufficient shots for non-exact phases

For phases that do not lie on the t-bit grid, QPE produces a probability distribution (not a deterministic outcome). A single shot returns one sample from this distribution, which may not be the most probable outcome. For θ = 0.1 with t = 4, the correct bin (m = 2) has only about 62% probability. With a single shot, there is a 38% chance of measuring an incorrect value.

Use enough shots to reliably identify the peak, and consider taking the most frequent outcome rather than any single measurement. For precision-critical applications, use at least O(1/P(m*)²) shots where P(m*) is the probability of the correct bin.

6. Wrong angle in the controlled-phase gate

The cp(angle) gate in Qiskit applies a phase of e^(i * angle). For QPE of a unitary with eigenvalue e^(2πiθ), the controlled-U^(2^k) gate should apply phase 2^k * 2πθ. A common error is using θ directly as the angle instead of 2πθ:

# CORRECT: angle is 2^k * 2*pi*theta
angle = 2**k * 2 * np.pi * theta
qc.cp(angle, counting[k], eigenstate[0])

# WRONG: using theta directly (missing the 2*pi factor)
angle = 2**k * theta
qc.cp(angle, counting[k], eigenstate[0])

For the T gate specifically, the eigenvalue phase is π/4 (not 2π * 1/8 = π/4; these happen to coincide). But for arbitrary θ, forgetting the factor produces completely wrong results.

Similarly, be careful not to double-count: if you define θ as the phase in e^(2πiθ) (so θ ∈ [0,1)), the angle for cp is 2^k * 2π * θ. If you define φ as the raw phase in e^(iφ) (so φ ∈ [0, 2π)), then the angle is 2^k * φ. Mixing these conventions is a recipe for factor-of-2π errors.

Was this tutorial helpful?