Quantum Fourier Transform in Qiskit

What the QFT Does

The quantum Fourier transform (QFT) applies the discrete Fourier transform to the amplitudes of a quantum state. Given a computational basis state |x> on n qubits, the QFT produces:

QFT|x> = (1 / sqrt(N)) * sum_{k=0}^{N-1} exp(2*pi*i*x*k / N) |k>

where N = 2^n. The output is a uniform superposition of all basis states, with phases encoding the frequency information of x.

This is the same operation as the classical discrete Fourier transform, but applied to quantum amplitudes rather than classical data arrays. A classical DFT takes a vector of N complex numbers and produces another vector of N complex numbers. The QFT does the same thing, but the input and output vectors are encoded as quantum amplitudes of an n-qubit register, and the entire transform is implemented as a unitary operation on those qubits.

Comparison to Classical FFT

The classical Fast Fourier Transform on N = 2^n points requires O(n * 2^n) operations. The QFT requires only O(n^2) gates.

That is an exponential improvement in circuit size. Unfortunately, you cannot directly read out all 2^n amplitudes from a quantum state without destroying it. Measurement collapses the state to a single sample. This means the QFT alone is not useful as a general-purpose Fourier transform; its speedup only shows up when it is used as a subroutine inside a larger algorithm that extracts specific frequency information through interference.

The two algorithms that exploit this most powerfully are Shor’s factoring algorithm and quantum phase estimation (QPE). Both arrange the computation so that the Fourier-transformed state has high probability concentrated on a small number of outcomes. A single measurement then reveals the desired information.

Deriving the Product Formula

The key insight behind the QFT circuit is that the Fourier transform of a basis state factorizes into a tensor product of single-qubit states. This factorization is what makes the O(n^2) gate count possible.

Start from the definition. Let omega = exp(2pii / N) where N = 2^n:

QFT|j> = (1 / sqrt(N)) * sum_{k=0}^{N-1} omega^{jk} |k>

Write k in binary as k = k_{n-1} * 2^{n-1} + k_{n-2} * 2^{n-2} + … + k_0 * 2^0. Then |k> = |k_{n-1}> x |k_{n-2}> x … x |k_0>, and the sum over k becomes a product of independent sums over each k_m in {0, 1}:

QFT|j> = (1 / sqrt(N)) * prod_{m=0}^{n-1} sum_{k_m=0}^{1} exp(2*pi*i * j * k_m * 2^m / 2^n) |k_m>

Each factor in this product has the form:

|0> + exp(2*pi*i * j / 2^{n-m}) |1>

Now use the binary fraction notation. Write j = j_{n-1} * 2^{n-1} + … + j_0 * 2^0. Then j / 2^{n-m} has an integer part (which contributes a phase of zero modulo 2*pi) and a fractional part involving only the bits j_{n-m-1}, j_{n-m-2}, …, j_0. In binary fraction notation, 0.j_{n-m-1} j_{n-m-2} … j_0.

The result is:

QFT|j> = (1/sqrt(N)) *
  (|0> + e^{2*pi*i * 0.j_0} |1>) x
  (|0> + e^{2*pi*i * 0.j_1 j_0} |1>) x
  ...
  (|0> + e^{2*pi*i * 0.j_{n-1} j_{n-2} ... j_0} |1>)

Each tensor factor depends on a different subset of the input bits. The first factor depends only on j_0. The second depends on j_1 and j_0. The m-th factor depends on j_{m-1}, j_{m-2}, …, j_0.

This is the blueprint for the circuit: a Hadamard gate on qubit m creates the (|0> + e^{…} |1>) superposition, and controlled phase rotations from the other qubits imprint the correct fractional binary phases.

Circuit Construction

The QFT on n qubits is built from two types of gates repeated in a specific pattern:

Hadamard gate: Puts one qubit into superposition, creating the (|0> + |1>) / sqrt(2) factor. The phase will then be refined by the controlled rotations that follow.

Controlled phase rotation CP(theta): For qubit j controlled by qubit k, applies a phase of exp(i * theta) to the |1> state of qubit j, conditioned on qubit k being |1>. In the QFT, theta = 2*pi / 2^(k-j+1). These rotations encode the fractional binary representation of the input.

For a 3-qubit QFT, the construction goes qubit by qubit:

q0: H -- CP(pi/2) -- CP(pi/4) --------- SWAP --
              |            |                |
q1:          ctrl    |     |    H -- CP(pi/2)  |
              |      |                |      |
q2:                 ctrl              ctrl  SWAP
                                            |
                                     H -----

The circuit processes qubit 0 first: apply H, then controlled rotations from qubits 1 and 2. Then qubit 1: apply H, then a controlled rotation from qubit 2. Finally qubit 2 gets just an H. After all rotations, SWAP gates reverse the qubit ordering.

Why the SWAPs Are Needed (Bit-Reversal)

The product formula derived above produces the output with qubits in reverse order relative to the standard binary convention. The first tensor factor (depending only on j_0) ends up on qubit 0 (the least significant bit in Qiskit’s convention), but it represents the most significant bit of the frequency output. The SWAPs correct this mismatch.

To see this concretely, consider what happens without the SWAPs for N = 8. The frequency index k should map to basis state |k>, but without bit-reversal, index k maps to |bit_reverse(k)>. The bit-reversal permutation for 3 bits is:

k (decimal)  k (binary)  bit_reverse(k) (binary)  bit_reverse(k) (decimal)
0            000         000                       0
1            001         100                       4
2            010         010                       2
3            011         110                       6
4            100         001                       1
5            101         101                       5
6            110         011                       3
7            111         111                       7

Without SWAPs, the QFT output is a valid Fourier transform but with scrambled frequency indices. Including the SWAPs produces the standard ordering that matches numpy’s FFT output directly.

import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

def qft_no_swaps(n: int) -> QuantumCircuit:
    """QFT without the final bit-reversal SWAPs."""
    qc = QuantumCircuit(n)
    for j in range(n):
        qc.h(j)
        for k in range(j + 1, n):
            angle = 2 * np.pi / (2 ** (k - j + 1))
            qc.cp(angle, k, j)
    return qc

def qft_circuit(n: int) -> QuantumCircuit:
    """Build an n-qubit QFT circuit with SWAPs."""
    qc = qft_no_swaps(n)
    for i in range(n // 2):
        qc.swap(i, n - i - 1)
    return qc

# Prepare |5> = |101>
n = 3
init = QuantumCircuit(n)
init.x(0)
init.x(2)

# QFT with SWAPs
sv_with = Statevector(init.compose(qft_circuit(n)))

# QFT without SWAPs
sv_without = Statevector(init.compose(qft_no_swaps(n)))

# Compare to numpy FFT
delta = np.zeros(2**n)
delta[5] = 1.0
numpy_fft = np.fft.fft(delta) / np.sqrt(2**n)

print("k   QFT(with swaps)          QFT(no swaps)            numpy FFT")
for k in range(2**n):
    a = sv_with.data[k]
    b = sv_without.data[k]
    c = numpy_fft[k]
    print(f"{k}   {a.real:+.4f}{a.imag:+.4f}i   "
          f"{b.real:+.4f}{b.imag:+.4f}i   "
          f"{c.real:+.4f}{c.imag:+.4f}i")

The QFT with SWAPs matches numpy’s FFT exactly (up to floating point precision). Without SWAPs, the amplitudes are the same set of values but assigned to different basis states according to the bit-reversal permutation.

Qiskit Implementation

import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

def qft_circuit(n: int) -> QuantumCircuit:
    """Build an n-qubit QFT circuit."""
    qc = QuantumCircuit(n)

    for j in range(n):
        # Hadamard on qubit j
        qc.h(j)
        # Controlled phase rotations from qubits j+1 through n-1
        for k in range(j + 1, n):
            angle = 2 * np.pi / (2 ** (k - j + 1))
            qc.cp(angle, k, j)

    # Swap qubits to correct bit ordering
    for i in range(n // 2):
        qc.swap(i, n - i - 1)

    return qc

# Build and draw the 3-qubit QFT
n = 3
qft = qft_circuit(n)
print(qft.draw(output="text"))

Gate-by-Gate Walkthrough: QFT of |6> = |110>

Abstract formulas become concrete when you trace the state through every gate. Here we apply the 3-qubit QFT to |6> (binary 110) and print the statevector after each gate.

In Qiskit’s little-endian convention, |6> = |110> means qubit 0 = 0, qubit 1 = 1, qubit 2 = 1. We set up the state accordingly.

import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

def print_state(label, sv, n):
    """Print the statevector amplitudes."""
    print(f"\n--- {label} ---")
    for k in range(2**n):
        a = sv.data[k]
        mag = abs(a)
        if mag > 1e-10:
            phase_deg = np.degrees(np.angle(a))
            print(f"  |{k:03b}> : {a.real:+.5f}{a.imag:+.5f}i  "
                  f"(mag={mag:.5f}, phase={phase_deg:+.1f} deg)")

n = 3
# Prepare |6> = qubit 1 = 1, qubit 2 = 1 (little-endian)
qc = QuantumCircuit(n)
qc.x(1)
qc.x(2)
sv = Statevector(qc)
print_state("Initial state |6> = |110>", sv, n)

# Step 1: H on qubit 0
qc.h(0)
sv = Statevector(qc)
print_state("After H on q0", sv, n)

# Step 2: CP(pi/2) controlled by q1 on target q0
qc.cp(np.pi / 2, 1, 0)
sv = Statevector(qc)
print_state("After CP(pi/2) q1->q0", sv, n)

# Step 3: CP(pi/4) controlled by q2 on target q0
qc.cp(np.pi / 4, 2, 0)
sv = Statevector(qc)
print_state("After CP(pi/4) q2->q0", sv, n)

# Step 4: H on qubit 1
qc.h(1)
sv = Statevector(qc)
print_state("After H on q1", sv, n)

# Step 5: CP(pi/2) controlled by q2 on target q1
qc.cp(np.pi / 2, 2, 1)
sv = Statevector(qc)
print_state("After CP(pi/2) q2->q1", sv, n)

# Step 6: H on qubit 2
qc.h(2)
sv = Statevector(qc)
print_state("After H on q2", sv, n)

# Step 7: SWAP q0 and q2
qc.swap(0, 2)
sv = Statevector(qc)
print_state("After SWAP q0<->q2 (final QFT output)", sv, n)

# Verify: all amplitudes should have magnitude 1/sqrt(8)
# and phases 2*pi*6*k/8 for k = 0, 1, ..., 7
print("\n--- Verification ---")
expected_mag = 1 / np.sqrt(8)
for k in range(8):
    expected_phase = 2 * np.pi * 6 * k / 8
    expected_amp = expected_mag * np.exp(1j * expected_phase)
    actual_amp = sv.data[k]
    match = np.isclose(actual_amp, expected_amp)
    print(f"  k={k}: expected phase = {np.degrees(expected_phase)%360:.1f} deg, "
          f"match = {match}")

At every step you can see the state grow from a simple two-term superposition (after the first H) into the full 8-term superposition with equal amplitudes and linearly spaced phases. The controlled phase gates progressively refine the phase of the |1> component on each qubit, and the Hadamards split each qubit into a superposition that carries those phases forward.

Applying QFT to a Basis State: |5> = |101>

import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

def qft_circuit(n: int) -> QuantumCircuit:
    """Build an n-qubit QFT circuit."""
    qc = QuantumCircuit(n)
    for j in range(n):
        qc.h(j)
        for k in range(j + 1, n):
            angle = 2 * np.pi / (2 ** (k - j + 1))
            qc.cp(angle, k, j)
    for i in range(n // 2):
        qc.swap(i, n - i - 1)
    return qc

# Apply QFT to |5> = |101> and inspect the output amplitudes
n = 3
input_state = QuantumCircuit(n)
input_state.x(0)   # Set qubit 0 to |1>
input_state.x(2)   # Set qubit 2 to |1>  (|101> = 5 in little-endian)

full_circuit = input_state.compose(qft_circuit(n))
sv = Statevector(full_circuit)

print(f"QFT of |5> on {n} qubits (N={2**n}):")
print(f"{'State':>6}  {'Amplitude':>22}  {'|Amplitude|':>12}")
for i, amp in enumerate(sv.data):
    if abs(amp) > 1e-9:
        print(f"|{i:03b}>   {amp:>22.6f}   {abs(amp):.6f}")

# Verify: all amplitudes should have equal magnitude 1/sqrt(8)
expected_mag = 1 / np.sqrt(2 ** n)
print(f"\nExpected amplitude magnitude: {expected_mag:.6f}")

All output amplitudes have the same magnitude (1/sqrt(8) = 0.354), with phases that are multiples of 2pi5/8. This is the Fourier transform of the indicator function that is 1 at position 5 and 0 everywhere else.

Visualizing the QFT Output

The QFT of a basis state |x> produces amplitudes that lie on the complex unit circle with equal spacing. Each amplitude has magnitude 1/sqrt(N) and phase angle 2pix*k/N for frequency index k. These N points form a discrete helix on the Argand (complex) plane.

This visualization makes it clear that the QFT is encoding x in the spacing between phase angles, not in the magnitudes. Measuring the magnitudes gives a uniform distribution (no information). The information lives entirely in the relative phases, which is why the QFT is only useful inside algorithms that convert those phase differences into amplitude differences through interference.

import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

def qft_circuit(n: int) -> QuantumCircuit:
    qc = QuantumCircuit(n)
    for j in range(n):
        qc.h(j)
        for k in range(j + 1, n):
            qc.cp(2 * np.pi / (2 ** (k - j + 1)), k, j)
    for i in range(n // 2):
        qc.swap(i, n - i - 1)
    return qc

n = 3
N = 2**n
x = 5

# Get QFT amplitudes from Qiskit
init = QuantumCircuit(n)
for bit in range(n):
    if (x >> bit) & 1:
        init.x(bit)
sv = Statevector(init.compose(qft_circuit(n)))
qft_amps = sv.data

# Get the DFT of the delta function at x using numpy
delta = np.zeros(N)
delta[x] = 1.0
numpy_dft = np.fft.fft(delta) / np.sqrt(N)

# Compare
print("k   QFT amplitude              numpy DFT/sqrt(N)         match?")
for k in range(N):
    q = qft_amps[k]
    d = numpy_dft[k]
    match = np.isclose(q, d)
    print(f"{k}   {q.real:+.5f}{q.imag:+.5f}i   "
          f"{d.real:+.5f}{d.imag:+.5f}i   {match}")

# Print the phase angles to see the helical structure
print(f"\nPhase angles for QFT|{x}> (N={N}):")
print(f"Expected phase spacing: 2*pi*{x}/{N} = {2*np.pi*x/N:.4f} rad "
      f"= {360*x/N:.1f} deg")
for k in range(N):
    phase = np.angle(qft_amps[k])
    expected = (2 * np.pi * x * k / N) % (2 * np.pi)
    if expected > np.pi:
        expected -= 2 * np.pi
    print(f"  k={k}: phase = {np.degrees(phase):+8.2f} deg  "
          f"(expected {np.degrees(expected):+8.2f} deg)")

The QFT amplitudes match the normalized DFT of the delta function exactly. The phase angles increase by 2pi5/8 = 225 degrees per step, wrapping around the unit circle. This discrete helix is the quantum analog of a single-frequency sinusoid in classical signal processing.

QFT on an Arbitrary Superposition

The QFT is a linear operation, so it distributes over superpositions:

QFT(alpha|a> + beta|b>) = alpha * QFT|a> + beta * QFT|b>

When the input has multiple non-zero amplitudes, the QFT outputs interfere. This interference is exactly what makes the QFT useful inside Shor’s algorithm, where the input is a periodic superposition and the interference constructively reinforces certain frequencies while canceling others.

import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

def qft_circuit(n: int) -> QuantumCircuit:
    qc = QuantumCircuit(n)
    for j in range(n):
        qc.h(j)
        for k in range(j + 1, n):
            qc.cp(2 * np.pi / (2 ** (k - j + 1)), k, j)
    for i in range(n // 2):
        qc.swap(i, n - i - 1)
    return qc

n = 3
N = 2**n

# Create the state (1/sqrt(2)) * (|1> + |5>)
# This is a two-peaked state with spacing 4 = N/2, so it has period 2
psi = np.zeros(N, dtype=complex)
psi[1] = 1.0 / np.sqrt(2)
psi[5] = 1.0 / np.sqrt(2)

# Initialize this state on a quantum circuit
qc = QuantumCircuit(n)
qc.initialize(psi, range(n))
qc.compose(qft_circuit(n), inplace=True)

sv = Statevector(qc)

print("Input: (|1> + |5>) / sqrt(2)")
print("This state has two peaks separated by 4 = N/2, implying period r=2.\n")

print("QFT output probabilities:")
probs = np.abs(sv.data)**2
for k in range(N):
    bar = "#" * int(probs[k] * 40)
    print(f"  |{k:03b}> (k={k}): prob = {probs[k]:.4f}  {bar}")

print("\nThe probability concentrates at k = 0 and k = 4 = N/r.")
print("These are multiples of N/r = 8/2 = 4, confirming the QFT")
print("detects the period of the input state.")

The two-peaked input with peaks at positions 1 and 5 (separated by 4 = N/2) has period r = 2. The QFT concentrates all probability at multiples of N/r = 4, namely k = 0 and k = 4. This is the period-finding mechanism at the heart of Shor’s algorithm.

Inverse QFT

The inverse QFT (IQFT) undoes the transform. Its circuit is the mirror image of the QFT: apply the gates in reverse order, replace each controlled phase rotation with its conjugate (negate the angle), and move SWAP gates to the front.

import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

def qft_circuit(n: int) -> QuantumCircuit:
    qc = QuantumCircuit(n)
    for j in range(n):
        qc.h(j)
        for k in range(j + 1, n):
            qc.cp(2 * np.pi / (2 ** (k - j + 1)), k, j)
    for i in range(n // 2):
        qc.swap(i, n - i - 1)
    return qc

def iqft_circuit(n: int) -> QuantumCircuit:
    """Build an n-qubit inverse QFT circuit."""
    qft = qft_circuit(n)
    return qft.inverse()

iqft = iqft_circuit(n=3)
print(iqft.draw(output="text"))

# Verify: QFT followed by IQFT is the identity
qc = QuantumCircuit(3)
qc.x(0)
qc.x(2)  # Start with |5>
qc.compose(qft_circuit(3), inplace=True)
qc.compose(iqft_circuit(3), inplace=True)

sv = Statevector(qc)
print("\nAfter QFT then IQFT on |5>:")
for k in range(8):
    prob = abs(sv.data[k])**2
    if prob > 1e-9:
        print(f"  |{k:03b}>: probability = {prob:.6f}")

The IQFT is used in both Shor’s algorithm and QPE as the final step that converts a phase-encoded register back into a computational basis state whose value encodes the period or phase.

QFT for Period Finding (Shor’s Algorithm Connection)

The core of Shor’s algorithm creates a quantum state with a periodic amplitude pattern and then uses the QFT to identify the period. Here is a direct demonstration of this mechanism.

Consider a state with period r = 2 over N = 8 positions: the state has equal amplitude at positions 0, 2, 4, and 6, and zero amplitude elsewhere.

import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

def qft_circuit(n: int) -> QuantumCircuit:
    qc = QuantumCircuit(n)
    for j in range(n):
        qc.h(j)
        for k in range(j + 1, n):
            qc.cp(2 * np.pi / (2 ** (k - j + 1)), k, j)
    for i in range(n // 2):
        qc.swap(i, n - i - 1)
    return qc

n = 3
N = 2**n

# Create a periodic state with period r
for r in [2, 4]:
    psi = np.zeros(N, dtype=complex)
    for pos in range(0, N, r):
        psi[pos] = 1.0
    psi /= np.linalg.norm(psi)  # Normalize

    # Initialize and apply QFT
    qc = QuantumCircuit(n)
    qc.initialize(psi, range(n))
    qc.compose(qft_circuit(n), inplace=True)
    sv = Statevector(qc)

    print(f"Periodic state with r={r}: non-zero at positions "
          f"{list(range(0, N, r))}")
    print(f"QFT output (N/r = {N//r}, expect peaks at multiples of {N//r}):")
    probs = np.abs(sv.data)**2
    for k in range(N):
        bar = "#" * int(probs[k] * 40)
        print(f"  k={k}: prob = {probs[k]:.4f}  {bar}")
    print()

For period r = 2, the QFT output has all probability at k = 0 and k = 4 (multiples of N/r = 4). For period r = 4, the probability concentrates at k = 0 and k = 2 (multiples of N/r = 2). In Shor’s algorithm, measuring the output register gives a value s * N/r for some integer s. From multiple measurements, the continued fraction algorithm extracts r, and from r the factors of N follow.

The pattern is simple: the QFT converts a periodic time-domain signal (positions spaced by r) into a periodic frequency-domain signal (peaks spaced by N/r). This is the same duality that the classical DFT exhibits, but the quantum version computes it in O(n^2) gates instead of O(n * 2^n) operations.

Quantum Phase Estimation Using the QFT

Quantum phase estimation (QPE) estimates the eigenvalue of a unitary operator. Given a unitary U and its eigenstate |psi> with U|psi> = exp(2pii*phi)|psi>, QPE outputs a binary approximation of phi.

The circuit uses three ingredients: a register of ancilla qubits in superposition, controlled applications of U^{2^k} to the eigenstate, and the inverse QFT on the ancilla register.

Here is a complete example. The T gate has the matrix diag(1, exp(ipi/4)), so |1> is an eigenstate with eigenvalue exp(ipi/4). The phase is phi = 1/8, which in 3-bit binary is 0.001. QPE with 3 ancilla qubits should output |001> = 1.

import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

def iqft_circuit(n: int) -> QuantumCircuit:
    """Build inverse QFT on n qubits."""
    qc = QuantumCircuit(n)
    for j in range(n):
        qc.h(j)
        for k in range(j + 1, n):
            qc.cp(2 * np.pi / (2 ** (k - j + 1)), k, j)
    for i in range(n // 2):
        qc.swap(i, n - i - 1)
    return qc.inverse()

# QPE circuit: 3 ancilla qubits (0,1,2) + 1 eigenstate qubit (3)
n_ancilla = 3
qc = QuantumCircuit(n_ancilla + 1)

# Prepare eigenstate |1> on qubit 3
qc.x(3)

# Put ancilla qubits in superposition
for i in range(n_ancilla):
    qc.h(i)

# Controlled-U^{2^k}: controlled-T^{2^k} from ancilla k to qubit 3
# T = phase(pi/4), so T^{2^0} = T, T^{2^1} = S, T^{2^2} = Z
# In Qiskit: cp(angle, control, target)
qc.cp(np.pi / 4, 0, 3)   # Controlled-T (2^0 = 1 application)
qc.cp(np.pi / 2, 1, 3)   # Controlled-S = Controlled-T^2
qc.cp(np.pi, 2, 3)        # Controlled-Z = Controlled-T^4

# Apply inverse QFT to the ancilla register
iqft = iqft_circuit(n_ancilla)
qc.compose(iqft, qubits=range(n_ancilla), inplace=True)

# Get the statevector
sv = Statevector(qc)

# Measure probabilities of the ancilla qubits (trace out qubit 3)
print("QPE results (ancilla register probabilities):")
print("Target: phi = 1/8 = 0.001 in binary, so expect |001> = 1\n")

probs = np.abs(sv.data)**2
ancilla_probs = {}
for state_idx in range(2**(n_ancilla + 1)):
    # Ancilla qubits are 0,1,2; eigenstate qubit is 3
    ancilla_val = state_idx & 0b0111  # bits 0,1,2
    eigen_val = (state_idx >> 3) & 1   # bit 3
    ancilla_probs[ancilla_val] = ancilla_probs.get(ancilla_val, 0) + probs[state_idx]

for val in range(2**n_ancilla):
    p = ancilla_probs.get(val, 0)
    if p > 1e-9:
        phi_estimate = val / (2**n_ancilla)
        print(f"  |{val:03b}> (decimal {val}): prob = {p:.4f}, "
              f"phi estimate = {val}/{2**n_ancilla} = {phi_estimate:.4f}")

print(f"\nTrue phi = 1/8 = {1/8:.4f}")

The measurement yields |001> with probability 1, giving phi = 1/8 exactly. This works because 1/8 has an exact 3-bit binary representation. When phi does not have an exact binary expansion, QPE produces the closest approximation, with probability concentrated near the correct value. Adding more ancilla qubits improves precision: t ancilla qubits give t bits of precision in phi.

Approximate QFT

The QFT circuit contains controlled phase rotations with angles as small as 2*pi/2^n. For large n, these small-angle rotations have negligible effect on the output state. The approximate QFT drops all rotations below a threshold, reducing the gate count while maintaining high fidelity.

For n qubits, the smallest rotation in the full QFT is CP(2*pi / 2^n). For n = 3, the smallest rotation is pi/4 = 45 degrees, which is significant. For n = 8, the smallest rotation is pi/128 = 1.4 degrees, which barely affects the state. For n = 20, the smallest rotation is pi/524288 = 0.0003 degrees, which contributes essentially nothing.

The approximate QFT with approximation degree d drops all rotations with angle smaller than 2*pi / 2^d. This reduces the inner loop length from n to at most d rotations per qubit.

import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import Operator

def qft_approximate(n: int, max_rotations: int) -> QuantumCircuit:
    """Build an approximate QFT that keeps at most max_rotations
    controlled phase gates per qubit."""
    qc = QuantumCircuit(n)
    for j in range(n):
        qc.h(j)
        # Only apply the closest max_rotations controlled phases
        for k in range(j + 1, min(j + 1 + max_rotations, n)):
            angle = 2 * np.pi / (2 ** (k - j + 1))
            qc.cp(angle, k, j)
    for i in range(n // 2):
        qc.swap(i, n - i - 1)
    return qc

def qft_circuit(n: int) -> QuantumCircuit:
    """Full QFT (all rotations)."""
    return qft_approximate(n, n)

# Compare fidelity and gate counts for different approximation levels
for n in [4, 6, 8]:
    full_qft = qft_circuit(n)
    full_op = Operator(full_qft)

    print(f"\n=== {n}-qubit QFT ===")
    full_ops = full_qft.count_ops()
    full_gate_count = sum(full_ops.values())
    print(f"Full QFT: {full_gate_count} gates, depth = {full_qft.depth()}")

    for max_rot in [1, 2, 3]:
        if max_rot >= n:
            continue
        approx_qft = qft_approximate(n, max_rot)
        approx_op = Operator(approx_qft)

        # Fidelity: |Tr(U_full^dag @ U_approx)|^2 / N^2
        product = full_op.adjoint().compose(approx_op)
        trace = np.trace(product.data)
        fidelity = (abs(trace) / (2**n)) ** 2

        approx_ops = approx_qft.count_ops()
        approx_gate_count = sum(approx_ops.values())

        savings = 100 * (1 - approx_gate_count / full_gate_count)
        print(f"  max_rot={max_rot}: {approx_gate_count} gates "
              f"(depth={approx_qft.depth()}), "
              f"fidelity = {fidelity:.6f}, "
              f"gate savings = {savings:.1f}%")

Even keeping only 2 rotations per qubit, the fidelity remains above 99% for moderate circuit sizes. This is why the approximate QFT is standard practice in real implementations: the small-angle rotations are the hardest to implement accurately on hardware (they require the longest pulse times and are most sensitive to calibration errors), and dropping them actually improves the output quality on noisy devices.

Using Qiskit’s Built-in QFT

Qiskit provides a built-in QFT implementation in its circuit library. It supports the full QFT, approximate QFT, and optional bit-reversal SWAPs.

import numpy as np
from qiskit import QuantumCircuit
from qiskit.circuit.library import QFT
from qiskit.quantum_info import Operator

def qft_circuit(n: int) -> QuantumCircuit:
    """Our hand-built QFT for comparison."""
    qc = QuantumCircuit(n)
    for j in range(n):
        qc.h(j)
        for k in range(j + 1, n):
            qc.cp(2 * np.pi / (2 ** (k - j + 1)), k, j)
    for i in range(n // 2):
        qc.swap(i, n - i - 1)
    return qc

# Built-in QFT with default settings
qft_builtin = QFT(num_qubits=3)
print("Built-in QFT(3):")
print(qft_builtin.decompose().draw(output="text"))

# Compare unitaries: hand-built vs built-in
hand_built = qft_circuit(3)
are_equivalent = Operator(hand_built).equiv(Operator(qft_builtin))
print(f"\nHand-built and built-in are equivalent: {are_equivalent}")

# Approximate QFT: drop the smallest rotations
# approximation_degree = number of least-significant rotation layers to drop
qft_approx = QFT(num_qubits=5, approximation_degree=2)
print(f"\nApproximate QFT(5, approx_degree=2) gate counts: "
      f"{qft_approx.decompose().count_ops()}")

# QFT without SWAPs (useful when you handle bit-reversal elsewhere)
qft_no_swap = QFT(num_qubits=3, do_swaps=False)
print(f"\nQFT(3, do_swaps=False) gate counts: "
      f"{qft_no_swap.decompose().count_ops()}")

# Inverse QFT directly from the library
iqft_builtin = QFT(num_qubits=3, inverse=True)
print(f"\nInverse QFT(3) from library:")
print(iqft_builtin.decompose().draw(output="text"))

The built-in QFT class accepts three key parameters:

num_qubits: the number of qubits
approximation_degree: how many layers of small-angle rotations to drop (default 0 = exact)
do_swaps: whether to include the final SWAP gates (default True)
inverse: whether to build the inverse QFT (default False)

For production code, prefer the built-in implementation. It handles edge cases correctly and integrates cleanly with Qiskit’s transpiler. Use your own implementation when you need to understand or modify the circuit structure.

Circuit Depth and Gate Count Analysis

For an n-qubit QFT, the circuit contains:

n Hadamard gates
n*(n-1)/2 controlled phase rotations
n//2 SWAP gates (each SWAP decomposes into 3 CX gates)

The total gate count is n + n*(n-1)/2 + n//2 = n*(n+1)/2 + n//2.

However, the circuit depth (the number of time steps when parallelizable gates run simultaneously) is smaller than the total gate count, because some gates on disjoint qubits can execute in the same clock cycle.

import numpy as np
from qiskit import QuantumCircuit
from qiskit.circuit.library import QFT

def qft_circuit(n: int) -> QuantumCircuit:
    qc = QuantumCircuit(n)
    for j in range(n):
        qc.h(j)
        for k in range(j + 1, n):
            qc.cp(2 * np.pi / (2 ** (k - j + 1)), k, j)
    for i in range(n // 2):
        qc.swap(i, n - i - 1)
    return qc

print(f"{'n':>3}  {'gates':>6}  {'depth':>6}  {'CPs':>5}  {'Hs':>4}  {'SWAPs':>6}")
print("-" * 40)
for n in [3, 4, 5, 6, 8, 10, 12]:
    qft = qft_circuit(n)
    ops = qft.count_ops()
    total_gates = sum(ops.values())
    depth = qft.depth()
    cps = ops.get("cp", 0)
    hs = ops.get("h", 0)
    swaps = ops.get("swap", 0)
    print(f"{n:>3}  {total_gates:>6}  {depth:>6}  {cps:>5}  {hs:>4}  {swaps:>6}")

The O(n^2) scaling is clear from the table. For n = 10, the circuit has 60 gates. For n = 12, it is 84 gates. The depth grows as O(n) because each “round” of controlled rotations (targeting one qubit) takes O(n) steps, but subsequent rounds partially overlap.

Hardware Transpilation of the QFT

On IBM quantum hardware, the native gate set is {CX, Rz, SX, X}. The QFT’s controlled phase gates (CP) are not native and must be decomposed. Each CP gate typically decomposes into 2 CX gates plus single-qubit rotations. This roughly triples the gate count.

import numpy as np
from qiskit import QuantumCircuit
from qiskit.transpiler.preset_passmanagers import generate_preset_pass_manager
from qiskit_ibm_runtime.fake_provider import FakeSherbrooke

def qft_circuit(n: int) -> QuantumCircuit:
    qc = QuantumCircuit(n)
    for j in range(n):
        qc.h(j)
        for k in range(j + 1, n):
            qc.cp(2 * np.pi / (2 ** (k - j + 1)), k, j)
    for i in range(n // 2):
        qc.swap(i, n - i - 1)
    return qc

backend = FakeSherbrooke()

for n in [3, 4, 5]:
    qft = qft_circuit(n)
    qft.measure_all()

    # Transpile at optimization level 3 (maximum optimization)
    pm = generate_preset_pass_manager(optimization_level=3, backend=backend)
    transpiled = pm.run(qft)

    orig_ops = qft.count_ops()
    trans_ops = transpiled.count_ops()

    print(f"\n=== {n}-qubit QFT ===")
    print(f"Original: {sum(orig_ops.values())} gates "
          f"(depth={qft.depth()})")
    print(f"  Ops: {dict(orig_ops)}")
    print(f"Transpiled: {sum(trans_ops.values())} gates "
          f"(depth={transpiled.depth()})")
    print(f"  Ops: {dict(trans_ops)}")
    cx_count = trans_ops.get("cx", 0) + trans_ops.get("ecr", 0)
    print(f"  Two-qubit gates (CX/ECR): {cx_count}")

The two-qubit gate count after transpilation is the primary bottleneck on NISQ hardware. Each two-qubit gate has an error rate roughly 10x higher than single-qubit gates (typically 0.5-1% vs 0.01-0.05%). For a 5-qubit QFT, the transpiled circuit may contain 20 or more two-qubit gates, giving a cumulative error rate that degrades the output significantly.

This is why large QFTs are impractical on current hardware without error correction. The approximate QFT helps by reducing the two-qubit gate count, but even with aggressive truncation, QFTs beyond about 10 qubits produce noisy results on today’s devices.

Running on Real Hardware

The QFT has O(n^2) gates, which gives it a manageable depth for small n. For n = 4 qubits the circuit uses 10 controlled-phase gates and 2 SWAP gates, shallow enough for current NISQ devices to run with acceptable fidelity.

from qiskit_ibm_runtime import QiskitRuntimeService, SamplerV2 as Sampler
from qiskit.transpiler.preset_passmanagers import generate_preset_pass_manager

service = QiskitRuntimeService()
backend = service.least_busy(operational=True, simulator=False)

pm = generate_preset_pass_manager(optimization_level=3, backend=backend)
isa_circuit = pm.run(full_circuit)

sampler = Sampler(backend)
job = sampler.run([isa_circuit], shots=4096)
result = job.result()
print(result[0].data.meas.get_counts())

Real hardware results will show the correct dominant state, but shot noise and gate errors will spread probability mass to nearby states. Increasing shots reduces statistical noise; nothing short of error correction reduces gate noise.

Common Mistakes

Working with the QFT involves several subtleties that trip up even experienced practitioners.

Mistake 1: Ignoring qubit ordering conventions. The QFT formula QFT|x> = (1/sqrt(N)) * sum exp(2piixk/N) |k> defines the output with the standard bit ordering. But the QFT circuit (without SWAP gates) produces the result in bit-reversed order. Different frameworks use different conventions for which qubit is the least significant bit. Qiskit uses little-endian (qubit 0 is the least significant). Always check whether your QFT implementation includes the final SWAPs, and whether you need them for your application.

Mistake 2: Using the QFT as a standalone algorithm. The QFT transforms quantum amplitudes, not classical data. To Fourier-transform a classical signal of length N = 2^n, you first need to load those N values into the amplitudes of an n-qubit state. Generic state preparation requires O(2^n) gates, which negates the exponential speedup of the QFT. The quantum advantage exists only when the input state is prepared efficiently by another quantum subroutine (as in Shor’s algorithm or QPE).

Mistake 3: Confusing QFT and IQFT. Shor’s algorithm applies the inverse QFT (IQFT) after the modular exponentiation oracle, not the forward QFT. Some textbook presentations show the forward QFT in the circuit diagram because they define the QFT with the opposite sign convention. Always check the sign of the exponent in the definition being used. In Qiskit, qc.inverse() produces the exact inverse.

Mistake 4: Using qc.inverse() when the library version is cleaner. Calling qft_circuit(n).inverse() produces a correct IQFT, but the resulting circuit has gate angles that are not simplified (they are just negated). For cleaner circuits, use QFT(n, inverse=True) from qiskit.circuit.library, which produces an optimized inverse circuit directly.

Mistake 5: Forgetting that measurement destroys phase information. After applying the QFT, all basis states have equal probability (for a basis state input). Measuring in the computational basis gives a uniformly random outcome, revealing nothing about x. The QFT is only useful when combined with additional operations that convert the phase differences into probability differences before measurement.

Connection to Shor’s Algorithm and QPE

In Shor’s algorithm, the quantum circuit creates a periodic superposition of states |x, f(x)> where f(x) = a^x mod N. The period r of this function encodes the factors of N. The QFT extracts r by converting the periodic amplitude pattern to a sharp peak in the frequency domain. Measuring that peak and applying continued fraction expansion gives r classically, and from r the factors follow directly.

In quantum phase estimation, a unitary U with eigenvalue exp(2pii*phi) acts on an eigenstate. Phase kickback imprints phi onto a register of ancilla qubits as a fractional binary number. The IQFT converts that phase-encoded register back to a computational basis state representing phi. The precision of the estimate scales linearly with the number of ancilla qubits.

Both algorithms use the same QFT subroutine. Understanding the QFT circuit is therefore the common thread that unlocks the two most important quantum algorithms in existence.

Summary

Property	Value
Gate count	O(n^2) for n qubits
Circuit depth	O(n), accounting for parallelism
Classical DFT cost	O(n * 2^n)
Output format	Amplitudes encode Fourier coefficients as phases
Direct readout?	No (measurement gives uniform distribution for basis state inputs)
Approximate QFT fidelity	> 99% with only 2-3 rotations per qubit
Key use cases	Shor’s algorithm, QPE, quantum simulation
NISQ feasible?	Yes for small n (up to ~10 qubits with noise mitigation)
Qiskit built-in	`from qiskit.circuit.library import QFT`
Key parameters	`num_qubits`, `approximation_degree`, `do_swaps`, `inverse`

The QFT is not useful in isolation. Its power comes from its role inside larger algorithms where quantum interference allows specific Fourier coefficients to be amplified and read out without enumerating all 2^n amplitudes. The derivation of the product formula explains why the circuit works. The approximate QFT shows that you can drop most of the small-angle rotations without losing accuracy. And the connection to period finding and phase estimation shows where the QFT delivers its exponential advantage in practice.