The Bernstein-Vazirani Algorithm: Implementing a Hidden String Finder in Qiskit
Implement the Bernstein-Vazirani algorithm in Qiskit: find a hidden bit string in a single query using superposition and phase kickback, versus N queries classically.
The Bernstein-Vazirani algorithm is one of the cleanest demonstrations of quantum advantage. It solves a problem that requires N queries classically in a single quantum query. Unlike Grover’s algorithm (quadratic speedup) or Shor’s algorithm (exponential speedup on a hard problem), Bernstein-Vazirani has an exponential speedup on a structured problem: you can understand exactly why from the circuit diagram.
The Problem
You have a black-box function:
f(x) = s · x (mod 2)
where s is an unknown n-bit string and · is the bitwise dot product. Given query access to f, find s.
Classically, the only way to recover s is to query f on the basis vectors one at a time:
f(100...0)gives bit 0 of sf(010...0)gives bit 1 of s- …
f(000...1)gives bit n-1 of s
That requires exactly n queries.
Quantum mechanically, one query is enough.
The Bitwise Dot Product and Linear Algebra over GF(2)
To understand why exactly n classical queries are both necessary and sufficient, we need to look at the linear algebra behind the problem. The function f(x) = s · x mod 2 is a linear function over GF(2), the finite field with two elements {0, 1} where addition is XOR and multiplication is AND.
The bitwise dot product of two n-bit strings s and x is:
s · x = s_0 * x_0 XOR s_1 * x_1 XOR ... XOR s_{n-1} * x_{n-1}
This returns a single bit: 0 or 1.
Now consider what happens when you query f on the standard basis vectors e_0, e_1, …, e_{n-1}. The basis vector e_i has a 1 in position i and 0 everywhere else:
e_0 = 100...0
e_1 = 010...0
e_2 = 001...0
...
e_{n-1} = 000...1
When you compute s · e_i, every term in the dot product is zero except the one at position i:
s · e_i = s_0 * 0 XOR s_1 * 0 XOR ... XOR s_i * 1 XOR ... XOR s_{n-1} * 0 = s_i
So f(e_i) = s_i, the i-th bit of s. Each query to a basis vector reveals exactly one bit of s, and all n bits are independent. You cannot learn s_i from any combination of queries that does not include e_i (or some vector with a 1 in position i). This is why classical BV needs exactly n queries: not n-1, not n+1, but precisely n.
Here is a classical implementation that uses this exact strategy:
def classical_bv_oracle(secret: str):
"""
Return a classical oracle function for BV.
Takes an integer x, returns s · x mod 2.
"""
n = len(secret)
s_bits = [int(b) for b in secret]
def oracle(x: int) -> int:
# Compute bitwise dot product mod 2
result = 0
for i in range(n):
result ^= s_bits[n - 1 - i] * ((x >> i) & 1)
return result
return oracle
def classical_bv_solver(oracle_fn, n: int) -> str:
"""
Classical solver for Bernstein-Vazirani.
Queries the oracle on each standard basis vector e_i.
Returns the recovered secret string.
"""
secret_bits = []
for i in range(n):
e_i = 1 << i # Standard basis vector: 1 in position i
bit = oracle_fn(e_i) # This equals s_i
secret_bits.append(str(bit))
# Reverse because we built from LSB to MSB
return ''.join(reversed(secret_bits))
# Test the classical solver
secret = '10110'
n = len(secret)
oracle = classical_bv_oracle(secret)
# Verify each basis vector query
for i in range(n):
e_i = 1 << i
print(f"f(e_{i}) = f({e_i:0{n}b}) = {oracle(e_i)}")
recovered = classical_bv_solver(oracle, n)
print(f"\nSecret: {secret}")
print(f"Recovered: {recovered}")
print(f"Match: {secret == recovered}")
Expected output:
f(e_0) = f(00001) = 0
f(e_1) = f(00010) = 1
f(e_2) = f(00100) = 1
f(e_3) = f(01000) = 0
f(e_4) = f(10000) = 1
Secret: 10110
Recovered: 10110
Match: True
Each query reveals one bit. Five queries, five bits, done. The quantum algorithm does the same thing but queries all basis vectors simultaneously through superposition.
Why It Works: Phase Kickback (Full Derivation)
The key insight is phase kickback. Let us walk through the complete mathematical derivation step by step.
Step 1: Prepare the Initial State
We start with n input qubits and 1 ancilla qubit, all in |0>:
|ψ_0> = |0>^{⊗n} ⊗ |0>
We apply X then H to the ancilla to put it in |->:
|-> = (|0> - |1>) / √2
We apply H to each of the n input qubits. The Hadamard gate maps |0> to (|0> + |1>)/√2, so applying H to n qubits in |0> produces a uniform superposition:
H^{⊗n} |0>^{⊗n} = (1/√(2^n)) Σ_x |x>
where the sum runs over all 2^n bit strings x. The combined state is:
|ψ_1> = (1/√(2^n)) Σ_x |x> ⊗ |->
Step 2: Apply the Oracle
The oracle computes |x>|y> -> |x>|y ⊕ f(x)>. When the ancilla is in |->:
|-> = (|0> - |1>) / √2
The oracle maps:
|x>|-> = |x> ⊗ (|0> - |1>) / √2
-> |x> ⊗ (|0 ⊕ f(x)> - |1 ⊕ f(x)>) / √2
If f(x) = 0: |0 ⊕ 0> - |1 ⊕ 0> = |0> - |1> = √2 |-> If f(x) = 1: |0 ⊕ 1> - |1 ⊕ 1> = |1> - |0> = -√2 |->
In both cases, the ancilla stays in |-> (up to a sign), and the sign is (-1)^f(x):
|x>|-> -> (-1)^{f(x)} |x>|->
This is phase kickback. The oracle does not change the ancilla’s state. Instead, it kicks a phase factor back onto the input register.
Step 3: The State After the Oracle
Applying the oracle to our superposition state:
|ψ_2> = (1/√(2^n)) Σ_x (-1)^{f(x)} |x> ⊗ |->
= (1/√(2^n)) Σ_x (-1)^{s·x} |x> ⊗ |->
The ancilla factors out. Focusing on the input register only:
|ψ_2^{input}> = (1/√(2^n)) Σ_x (-1)^{s·x} |x>
Step 4: Apply the Second Hadamard Layer
Now we prove that H^{⊗n} applied to this state gives |s>. We use the identity for the n-qubit Hadamard transform:
H^{⊗n} |z> = (1/√(2^n)) Σ_x (-1)^{z·x} |x>
This means that any state of the form (1/√(2^n)) Σ_x (-1)^{z·x} |x> is simply H^{⊗n} |z>. Since H^{⊗n} is its own inverse (H^{⊗n} H^{⊗n} = I), applying H^{⊗n} to this state returns |z>.
Our input register is exactly in the state H^{⊗n} |s>:
|ψ_2^{input}> = (1/√(2^n)) Σ_x (-1)^{s·x} |x> = H^{⊗n} |s>
Applying H^{⊗n} again:
|ψ_3^{input}> = H^{⊗n} H^{⊗n} |s> = I |s> = |s>
The measurement yields s with probability 1. No randomness, no repeated trials, just one query and one measurement.
Setup
from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator
import numpy as np
Building the Oracle
The oracle for f(x) = s · x is built from CNOT gates: for each bit position i where s[i] = 1, apply CX(qubit_i, ancilla).
def bv_oracle(n: int, secret: str) -> QuantumCircuit:
"""
Build the Bernstein-Vazirani oracle for secret string 's'.
Circuit acts on n data qubits (0..n-1) and one ancilla qubit (n).
For each bit position i where secret[i] == '1', adds a CNOT from qubit i to ancilla.
"""
qc = QuantumCircuit(n + 1)
for i, bit in enumerate(reversed(secret)): # reversed: qubit 0 = LSB
if bit == '1':
qc.cx(i, n)
return qc
# Test: 5-bit secret string
secret = '10110'
n = len(secret)
oracle = bv_oracle(n, secret)
print(f"Oracle for secret '{secret}':")
print(oracle.draw())
The Full Algorithm
def bernstein_vazirani(secret: str) -> QuantumCircuit:
"""Build the full Bernstein-Vazirani circuit for a given secret string."""
n = len(secret)
qc = QuantumCircuit(n + 1, n) # n data qubits + 1 ancilla, n classical bits
# Step 1: initialise ancilla in |->
qc.x(n) # flip ancilla to |1>
qc.h(n) # put ancilla in |-> = (|0> - |1>) / sqrt(2)
# Step 2: put input register in uniform superposition
qc.h(range(n))
qc.barrier()
# Step 3: apply oracle (phase kickback encodes s in phases)
oracle = bv_oracle(n, secret)
qc.compose(oracle, inplace=True)
qc.barrier()
# Step 4: apply H again to decode the phase into the computational basis
qc.h(range(n))
# Step 5: measure (ancilla not measured -- it's still in |->)
qc.measure(range(n), range(n))
return qc
secret = '10110'
qc = bernstein_vazirani(secret)
print(qc.draw())
Running the Algorithm
backend = AerSimulator()
job = backend.run(qc, shots=1024)
counts = job.result().get_counts()
print(f"Secret: '{secret}'")
print(f"Measured: {counts}")
# Should measure '10110' with probability 1 (or close to it on a noiseless simulator)
Expected output:
Secret: '10110'
Measured: {'10110': 1024}
On a noiseless simulator, the algorithm is deterministic: it always finds the secret string in a single shot. There is no probabilistic element; unlike Grover’s algorithm, no repetition is needed.
Amplitude Visualization: Tracking the State Step by Step
To build real intuition for how BV works, let us trace the amplitude of every basis state through each step of the algorithm. We use a concrete example with n=3 qubits and secret string s=‘101’.
The input register has 2^3 = 8 basis states: |000>, |001>, |010>, |011>, |100>, |101>, |110>, |111>. At each step, we track the amplitude (a complex number) of each state.
The Four Stages
Stage (a): Initial state |000>
All amplitude sits on |000>:
| State | Amplitude |
|---|---|
| 000 | 1 |
| 001 | 0 |
| 010 | 0 |
| 011 | 0 |
| 100 | 0 |
| 101 | 0 |
| 110 | 0 |
| 111 | 0 |
Stage (b): After H on all 3 input qubits
Uniform superposition with amplitude 1/sqrt(8) on every state:
| State | Amplitude |
|---|---|
| 000 | 1/sqrt(8) ≈ 0.354 |
| 001 | 1/sqrt(8) ≈ 0.354 |
| 010 | 1/sqrt(8) ≈ 0.354 |
| 011 | 1/sqrt(8) ≈ 0.354 |
| 100 | 1/sqrt(8) ≈ 0.354 |
| 101 | 1/sqrt(8) ≈ 0.354 |
| 110 | 1/sqrt(8) ≈ 0.354 |
| 111 | 1/sqrt(8) ≈ 0.354 |
Stage (c): After the oracle
The oracle applies a phase of (-1)^(s·x) to each state |x>. For s=‘101’, we compute s·x for each x:
- s·000 = 1*0 + 0*0 + 1*0 = 0, so (-1)^0 = +1
- s·001 = 1*0 + 0*0 + 1*1 = 1, so (-1)^1 = -1
- s·010 = 1*0 + 0*1 + 1*0 = 0, so (-1)^0 = +1
- s·011 = 1*0 + 0*1 + 1*1 = 1, so (-1)^1 = -1
- s·100 = 1*1 + 0*0 + 1*0 = 1, so (-1)^1 = -1
- s·101 = 1*1 + 0*0 + 1*1 = 0, so (-1)^0 = +1
- s·110 = 1*1 + 0*1 + 1*0 = 1, so (-1)^1 = -1
- s·111 = 1*1 + 0*1 + 1*1 = 0, so (-1)^0 = +1
| State | Amplitude |
|---|---|
| 000 | +1/sqrt(8) ≈ +0.354 |
| 001 | -1/sqrt(8) ≈ -0.354 |
| 010 | +1/sqrt(8) ≈ +0.354 |
| 011 | -1/sqrt(8) ≈ -0.354 |
| 100 | -1/sqrt(8) ≈ -0.354 |
| 101 | +1/sqrt(8) ≈ +0.354 |
| 110 | -1/sqrt(8) ≈ -0.354 |
| 111 | +1/sqrt(8) ≈ +0.354 |
The magnitudes are all the same: 1/sqrt(8). Only the phases have changed. If you measured now, you would get each basis state with equal probability 1/8. The secret is encoded entirely in the signs.
Stage (d): After the second H^3
Constructive and destructive interference collapse all amplitude onto |101>:
| State | Amplitude |
|---|---|
| 000 | 0 |
| 001 | 0 |
| 010 | 0 |
| 011 | 0 |
| 100 | 0 |
| 101 | 1 |
| 110 | 0 |
| 111 | 0 |
Measurement gives ‘101’ with certainty.
Verifying with Qiskit’s Statevector Simulator
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector
import numpy as np
secret = '101'
n = len(secret)
# Build the circuit without measurement so we can track statevectors
qc = QuantumCircuit(n + 1)
# Prepare ancilla in |->
qc.x(n)
qc.h(n)
# Stage (a) -> (b): Apply H to input qubits
qc_stage_b = qc.copy()
qc_stage_b.h(range(n))
# Stage (b) -> (c): Apply oracle
qc_stage_c = qc_stage_b.copy()
for i, bit in enumerate(reversed(secret)):
if bit == '1':
qc_stage_c.cx(i, n)
# Stage (c) -> (d): Apply second H layer
qc_stage_d = qc_stage_c.copy()
qc_stage_d.h(range(n))
# Extract amplitudes at each stage (tracing out ancilla by looking at input qubits)
for label, circuit in [('After H (stage b)', qc_stage_b),
('After oracle (stage c)', qc_stage_c),
('After 2nd H (stage d)', qc_stage_d)]:
sv = Statevector.from_instruction(circuit)
# The ancilla is qubit n, in |-> = (|0> - |1>)/sqrt(2).
# We extract amplitudes for the input register by looking at
# basis states where ancilla is |0>, and multiplying by sqrt(2).
# (Alternatively, trace out the ancilla.)
probs = sv.probabilities(qargs=list(range(n)))
# For phase information, look at the full statevector
data = np.array(sv.data)
print(f"\n{label}:")
for x in range(2**n):
# Amplitude of |x>|0> for the ancilla-0 branch (scale by sqrt(2))
idx_anc0 = x # ancilla is qubit n (highest index in Qiskit)
amp = data[idx_anc0] * np.sqrt(2) # scale out ancilla factor 1/sqrt(2)
state_str = format(x, f'0{n}b')
if abs(amp) > 1e-10:
print(f" |{state_str}> : {amp.real:+.4f}")
This code tracks the actual numerical amplitudes at each stage, confirming the hand calculations above.
Why the Second Hadamard Works
After phase kickback, the state of the input register is:
(1/sqrt(2^n)) * sum_x (-1)^(s·x) |x>
This is the n-qubit quantum Fourier transform of |s>. Applying H^n (Hadamard on each qubit) inverts it, returning exactly |s>. The measurement then gives s with probability 1.
This is the quantum Fourier transform at work in its simplest setting. The same structure (superposition, phase encoding, interference) underlies QPE, Shor’s algorithm, and QFT-based signal processing.
Oracle Depth Analysis
Not all BV oracles are equally deep. The circuit depth depends on the Hamming weight of the secret string, that is, the number of 1s in s.
For a secret string s with k ones out of n bits, the oracle uses exactly k CNOT gates. These k CNOTs all target the same ancilla qubit. Because they share a target, they cannot run in parallel; they must execute sequentially. The oracle depth is therefore exactly k.
Worst case: s = ‘111…1’ (all ones). The oracle has n CNOT gates, all in series, giving depth n.
Best case: s = ‘1000…0’ (a single one). The oracle has 1 CNOT gate, giving depth 1.
Zero case: s = ‘0000…0’ (all zeros). The oracle has 0 gates, giving depth 0. The algorithm returns ‘000…0’ trivially.
The total circuit depth for BV is:
Total depth = 1 (H layer) + k (oracle CNOTs) + 1 (H layer) = k + 2
This means BV circuits for low-weight secrets are shallower and more resistant to noise on real hardware.
from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator
def analyze_oracle_depth(secret: str) -> dict:
"""Analyze the circuit depth for a given BV secret string."""
n = len(secret)
qc = bernstein_vazirani(secret)
# Transpile to count depth (basis gates)
transpiled = transpile(qc, basis_gates=['cx', 'h', 'x', 'measure'], optimization_level=0)
hamming_weight = secret.count('1')
return {
'secret': secret,
'n': n,
'hamming_weight': hamming_weight,
'num_cnots': hamming_weight,
'total_depth': transpiled.depth(),
}
# Compare different secrets
secrets = ['00000', '10000', '10100', '10110', '11110', '11111']
print(f"{'Secret':>8s} {'Weight':>6s} {'CNOTs':>5s} {'Depth':>5s}")
print("-" * 35)
for s in secrets:
info = analyze_oracle_depth(s)
print(f"{info['secret']:>8s} {info['hamming_weight']:>6d} {info['num_cnots']:>5d} {info['total_depth']:>5d}")
Expected output:
Secret Weight CNOTs Depth
-----------------------------------
00000 0 0 3
10000 1 1 4
10100 2 2 5
10110 3 3 6
11110 4 4 7
11111 5 5 8
The depth grows linearly with the Hamming weight of s. On noisy hardware, lower-weight secrets produce cleaner results because fewer two-qubit gates means less accumulated error.
Connection to Fourier Analysis over GF(2)^n
The BV algorithm has a beautiful interpretation through the lens of Fourier analysis. The function f(x) = s · x mod 2 is not just any function: it is a Fourier character of the group GF(2)^n.
The Walsh-Hadamard Transform
Over the group GF(2)^n (n-bit strings under XOR), the characters are the functions:
χ_s(x) = (-1)^{s·x}
for each s in GF(2)^n. Any function g: GF(2)^n -> {+1, -1} can be written as a sum of these characters. The Walsh-Hadamard transform decomposes g into its Fourier coefficients:
g_hat(s) = (1/2^n) Σ_x g(x) · χ_s(x) = (1/2^n) Σ_x g(x) · (-1)^{s·x}
BV as Fourier Transform
When the BV oracle encodes f(x) = s · x into phases, the quantum state becomes:
(1/√(2^n)) Σ_x (-1)^{s·x} |x>
This is exactly the state you get by applying the Walsh-Hadamard transform (H^{⊗n}) to |s>. The function (-1)^{s·x} is a single Fourier character, so its Fourier transform is a delta function peaked at s.
The second H^{⊗n} layer in the algorithm is the inverse Walsh-Hadamard transform. Since the Walsh-Hadamard transform is its own inverse (it is an involution), applying it twice returns you to the original basis:
H^{⊗n} [ (1/√(2^n)) Σ_x (-1)^{s·x} |x> ] = |s>
This is why BV gives a deterministic answer. The phase pattern (-1)^{s·x} is a pure Fourier mode, and the inverse transform recovers the frequency (which is s) exactly. There is no spectral leakage, no approximation, no uncertainty.
The Big Picture
This makes BV the simplest quantum algorithm that directly implements a Fourier transform. The same idea scales up:
- In Simon’s algorithm, the function hides a subgroup structure, and the Fourier transform over GF(2)^n reveals vectors orthogonal to the hidden period.
- In Shor’s algorithm, the function hides a periodic structure over the integers, and the quantum Fourier transform over Z_N reveals the period.
- In quantum phase estimation, the function hides a phase, and the QFT reveals it.
BV is where this entire family of algorithms starts.
Scaling: Classical vs Quantum
def classical_bv(oracle_fn, n: int) -> str:
"""
Classical algorithm for Bernstein-Vazirani.
Requires n queries: probe each basis vector e_i.
"""
secret_bits = []
for i in range(n):
x = (1 << i) # bit vector with only position i set
secret_bits.append(oracle_fn(x))
return ''.join(str(b) for b in reversed(secret_bits))
# Demonstrate scaling comparison
for n in [5, 10, 20, 50, 100]:
print(f"n={n:3d}: classical needs {n:3d} queries, quantum needs 1")
n= 5: classical needs 5 queries, quantum needs 1
n= 10: classical needs 10 queries, quantum needs 1
n= 20: classical needs 20 queries, quantum needs 1
n= 50: classical needs 50 queries, quantum needs 1
n=100: classical needs 100 queries, quantum needs 1
Verification: Try Different Secrets
from qiskit_aer import AerSimulator
backend = AerSimulator()
test_secrets = ['0000', '1111', '10101010', '11001100', '01010101']
for secret in test_secrets:
qc = bernstein_vazirani(secret)
counts = backend.run(qc, shots=256).result().get_counts()
measured = max(counts, key=counts.get)
# The oracle is built with reversed(secret) so qubit 0 = LSB of secret.
# Qiskit's measurement bitstring reads classical bits MSB-first,
# matching the secret string directly.
print(f"Secret: {secret!r:12s} Measured: {measured!r:12s} Match: {secret == measured}")
Multi-Query BV for Noisy Hardware
On a perfect simulator, one shot suffices. On real hardware, noise corrupts the result. Gate errors, readout errors, and decoherence all conspire to flip bits in the output. The fix is simple: run the circuit many times and take a majority vote for each bit position.
Why Majority Voting Works
Each bit of the output string is independently correct with probability p > 0.5 (assuming the hardware is not totally broken). If you run the circuit N times and take the majority value for each bit, the probability of getting the wrong answer for that bit drops exponentially with N.
For a single bit with error rate e (probability of being wrong), the probability that the majority vote across N trials is wrong is bounded by:
P(error) ≤ exp(-2N(0.5 - e)^2)
This is a direct application of the Hoeffding inequality.
Implementation
from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator
from qiskit_aer.noise import NoiseModel, depolarizing_error
from collections import Counter
def bv_majority_vote(secret: str, shots: int = 100, error_rate: float = 0.01):
"""
Run BV multiple times and take majority vote per bit position.
Uses a noise model to simulate real hardware.
"""
n = len(secret)
# Create a simple noise model: depolarizing error on CX gates
noise_model = NoiseModel()
cx_error = depolarizing_error(error_rate, 2)
noise_model.add_all_qubit_quantum_error(cx_error, ['cx'])
# Also add single-qubit gate error (typically ~10x smaller)
sq_error = depolarizing_error(error_rate / 10, 1)
noise_model.add_all_qubit_quantum_error(sq_error, ['h', 'x'])
backend = AerSimulator(noise_model=noise_model)
qc = bernstein_vazirani(secret)
# Run with many shots
counts = backend.run(qc, shots=shots).result().get_counts()
# Majority vote: find the most common result
most_common = max(counts, key=counts.get)
confidence = counts[most_common] / shots
# Per-bit majority vote (more robust for high noise)
bit_votes = [Counter() for _ in range(n)]
for bitstring, count in counts.items():
for i, bit in enumerate(bitstring):
bit_votes[i][bit] += count
majority_result = ''.join(
votes.most_common(1)[0][0] for votes in bit_votes
)
return {
'most_common': most_common,
'majority_vote': majority_result,
'confidence': confidence,
'correct_most_common': most_common == secret,
'correct_majority': majority_result == secret,
}
# Test with different error rates
secret = '10110'
print(f"Secret: {secret}")
print(f"{'Error Rate':>12s} {'Most Common':>12s} {'Majority':>10s} {'Confidence':>10s}")
print("-" * 55)
for error_rate in [0.001, 0.005, 0.01, 0.02, 0.05, 0.10]:
result = bv_majority_vote(secret, shots=1000, error_rate=error_rate)
print(f"{error_rate:>12.3f} {result['most_common']:>12s} "
f"{result['majority_vote']:>10s} {result['confidence']:>10.1%}")
At low error rates (0.1% to 1%), both methods recover the correct string reliably. As the error rate climbs toward 5% to 10%, per-bit majority voting remains more robust than simply taking the single most common bitstring, because it independently corrects each bit position.
Success Rate Table
For a 5-bit secret with 3 CNOTs, the approximate success probability per shot depends on the two-qubit gate error rate:
| CX Error Rate | P(correct per shot) | P(correct, 100-shot majority) |
|---|---|---|
| 0.1% | ~99.7% | ~99.99% |
| 0.5% | ~98.5% | ~99.99% |
| 1% | ~97.0% | ~99.99% |
| 2% | ~94.1% | ~99.9% |
| 5% | ~85.7% | ~99% |
| 10% | ~72.9% | ~95% |
The majority vote approach makes BV extremely robust to moderate noise levels. Even with 10% CX error (far worse than current hardware), 100 shots give about 95% success.
Recursive Bernstein-Vazirani
The standard BV algorithm finds a hidden string in one query, but Bernstein and Vazirani’s original 1993 paper went further. They defined a recursive version that demonstrates an even stronger quantum advantage: a separation between quantum and classical query complexity that is provably exponential.
The Recursive Oracle Structure
In the recursive version, you have a tree of oracles. At the bottom level (level 0), there are many BV oracles, each hiding its own secret string s_i. At level 1, another BV oracle hides a string whose bits are determined by the secrets recovered from level 0. This nesting continues for d levels.
More precisely:
- Level 0: You have 2^n independent BV oracles O_{0,1}, O_{0,2}, …, each hiding a secret s_{0,i}.
- Level 1: A BV oracle O_1 where the j-th “query” to O_1 requires you to first recover the secret s_{0,j} from a level-0 oracle, then use that secret to compute the j-th bit of the level-1 answer.
- Level d: The top-level oracle, whose secret is the final answer.
Why Classical Algorithms Need Exponential Queries
A classical algorithm at level d needs n queries to solve the top-level BV problem. But each of those n queries requires solving a level-(d-1) BV problem, which in turn requires n queries at level (d-2), and so on. The total number of classical queries is:
Classical queries = n^d
This grows exponentially with the recursion depth d.
Why Quantum Algorithms Need Polynomial Queries
A quantum algorithm at each level solves BV in 1 query. But that single query involves running a level-(d-1) quantum BV circuit. The total number of oracle calls at the base level is:
Quantum queries = n * d
Wait, we need to be more careful. At each level, a single quantum query to the level-k oracle requires running the level-(k-1) circuit in superposition. The total number of base-level oracle calls is actually n (for each level, you need to evaluate the oracle on a superposition of inputs, but each evaluation chains down to level 0). The total is:
Quantum queries at level 0 = O(n * d)
This gives a separation of n^d (classical) versus O(n * d) (quantum), which is exponential in d.
Significance
This recursive construction was historically important because it provided one of the first provable exponential separations between quantum and classical query complexity. While the standard BV problem has only a linear speedup (n queries to 1), the recursive version amplifies this into an exponential gap by composing the linear advantage across d levels.
This result helped motivate the search for more natural problems with exponential quantum speedups, eventually leading to Simon’s algorithm and Shor’s algorithm.
Running on Real Hardware
On real IBM Quantum hardware, the algorithm still works well because the circuit is shallow (only k CNOT gates for a secret with Hamming weight k), which means noise has limited time to accumulate. BV serves as a useful hardware benchmark for three reasons:
(a) Fixed circuit depth. The circuit has exactly 3 layers of gates: H layer, oracle CNOTs, H layer. This makes it easy to compare across hardware platforms without worrying about compilation differences.
(b) Tunable CNOT count. The number of CNOTs is exactly the Hamming weight of s. You can test hardware at different CNOT counts simply by choosing different secrets, from 1 CNOT (s=‘10000’) up to n CNOTs (s=‘11111’).
(c) Success probability directly measures fidelity. If the hardware has CX fidelity F per gate and the secret has k ones, the probability of getting the correct answer is approximately F^k (ignoring single-qubit errors and readout errors, which are typically smaller). This gives a direct, interpretable benchmark number.
Full Circuit for n=5
secret = '10110'
n = len(secret)
qc = bernstein_vazirani(secret)
print("Full BV circuit for s='10110':")
print(qc.draw())
The circuit has 5 input qubits, 1 ancilla, and 5 classical bits. You can see three clear sections separated by barriers: the H layer, the 3 CNOT gates (for bits 1, 2, 4 where s has 1s), and the final H layer.
Interpreting Hardware Results
from qiskit_ibm_runtime import QiskitRuntimeService, SamplerV2
from qiskit.transpiler.preset_passmanagers import generate_preset_pass_manager
service = QiskitRuntimeService(channel="ibm_quantum")
backend = service.least_busy(operational=True, simulator=False, min_num_qubits=6)
secret = '10110'
qc = bernstein_vazirani(secret)
# Transpile to ISA circuit
pm = generate_preset_pass_manager(backend=backend, optimization_level=1)
isa_qc = pm.run(qc)
with SamplerV2(mode=backend) as sampler:
result = sampler.run([isa_qc], shots=4096).result()
counts = result[0].data.meas.get_counts()
# Sort by count, show top 5 results
sorted_counts = sorted(counts.items(), key=lambda x: x[1], reverse=True)
total = sum(counts.values())
print(f"Secret: {secret}")
print(f"\nTop 5 results out of {total} shots:")
for bitstring, count in sorted_counts[:5]:
marker = " <-- correct" if bitstring == secret else ""
print(f" {bitstring}: {count:4d} ({count/total:.1%}){marker}")
On current hardware, you should see the correct answer ‘10110’ dominating with 70% to 90% of the shots, depending on the device. The remaining probability spreads across other strings, especially strings that differ from ‘10110’ by one bit (Hamming distance 1). This is because single gate errors tend to flip individual qubits.
If the correct answer appears in less than 50% of shots, the device has poor fidelity for this circuit depth. If the error is dominated by a single qubit (e.g., ‘10010’ appears much more often than ‘10100’), that suggests one particular CX gate or qubit readout has higher error than the others.
Adversarial Oracle: What If the Oracle Is Wrong?
The BV algorithm assumes the oracle perfectly implements f(x) = s · x. But what if the oracle has an error? This thought experiment reveals something important about BV’s fragility.
A Perturbed Oracle
Suppose the oracle is meant to implement s = ‘101’ (on 3 qubits), but one CNOT is flipped. Instead of CNOTs on qubits 0 and 2, the oracle has CNOTs on qubits 0 and 1. This implements f(x) = s’ · x where s’ = ‘011’ instead of s = ‘101’.
from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator
def bv_oracle_perturbed(n: int) -> QuantumCircuit:
"""
Oracle intended for s='101' but with a mistake:
CNOT on qubit 1 instead of qubit 2.
Actually implements s='011'.
"""
qc = QuantumCircuit(n + 1)
qc.cx(0, n) # Correct: qubit 0 for the '1' in position 2 (LSB)
qc.cx(1, n) # Wrong: should be qubit 2, but we put qubit 1
return qc
def bv_with_custom_oracle(oracle: QuantumCircuit, n: int) -> QuantumCircuit:
"""Build BV circuit with a custom oracle."""
qc = QuantumCircuit(n + 1, n)
qc.x(n)
qc.h(n)
qc.h(range(n))
qc.barrier()
qc.compose(oracle, inplace=True)
qc.barrier()
qc.h(range(n))
qc.measure(range(n), range(n))
return qc
n = 3
oracle = bv_oracle_perturbed(n)
qc = bv_with_custom_oracle(oracle, n)
backend = AerSimulator()
counts = backend.run(qc, shots=1024).result().get_counts()
print(f"Intended secret: '101'")
print(f"Measured result: {max(counts, key=counts.get)}")
print(f"Counts: {counts}")
Expected output:
Intended secret: '101'
Measured result: 011
Counts: {'011': 1024}
BV faithfully reports what the oracle actually computes, not what it was supposed to compute. If the oracle is wrong, BV returns the wrong answer with 100% confidence. There is no self-correction mechanism.
Contrast with Grover’s Algorithm
Grover’s algorithm is more robust to small oracle perturbations. Because Grover uses multiple queries (about sqrt(N) of them) and amplifies the marked state through repeated interference, a small perturbation in the oracle results in a small perturbation in the output distribution. BV uses exactly one query, so any oracle error directly corrupts the result.
This is a general principle: algorithms that use more queries can average over oracle errors, while single-query algorithms are maximally sensitive to them. BV trades robustness for efficiency.
Generalizing to GF(2^k)
The standard BV algorithm works over GF(2), the field with two elements. Each symbol in the secret string is a single bit. We can generalize to larger fields: GF(2^k), where each symbol is a k-bit string and field operations are defined by polynomial arithmetic.
The Setup
Instead of an n-bit secret string over GF(2), consider an m-symbol secret string over GF(2^k):
s = (s_1, s_2, ..., s_m) where each s_i is in GF(2^k)
The function to learn is:
f(x) = s_1 * x_1 + s_2 * x_2 + ... + s_m * x_m (over GF(2^k))
where * and + are field multiplication and addition in GF(2^k).
Quantum Encoding
Each symbol s_i requires k qubits to represent, so the total number of qubits for the input register is m * k. The oracle applies a phase based on the GF(2^k) inner product.
The key insight: since GF(2^k) arithmetic can be decomposed into operations on k individual bits (using the polynomial representation of the field), the oracle circuit consists of CNOT gates arranged according to the multiplication tables of GF(2^k).
A Simple Example: GF(4)
GF(4) = GF(2^2) has elements {0, 1, α, α+1} where α^2 + α + 1 = 0. Each element is represented by 2 bits:
0 = 00, 1 = 01, α = 10, α+1 = 11
For a “secret” consisting of a single GF(4) symbol, you need 2 qubits per register position. A single quantum query to the generalized BV oracle reveals the 2-bit symbol, whereas a classical algorithm over GF(4) still requires standard basis queries.
from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator
def gf4_multiply(a: int, b: int) -> int:
"""
Multiply two elements of GF(4) = GF(2^2).
Elements represented as 2-bit integers.
Irreducible polynomial: x^2 + x + 1.
"""
if a == 0 or b == 0:
return 0
# GF(4) multiplication table
# Elements: 0=00, 1=01, alpha=10, alpha+1=11
table = {
(1, 1): 1, (1, 2): 2, (1, 3): 3,
(2, 1): 2, (2, 2): 3, (2, 3): 1,
(3, 1): 3, (3, 2): 1, (3, 3): 2,
}
return table.get((a, b), 0)
def gf4_trace(a: int) -> int:
"""
Trace function from GF(4) to GF(2).
Tr(a) = a + a^2 (in GF(4)).
This maps GF(4) to {0, 1}.
"""
a_sq = gf4_multiply(a, a)
return a ^ a_sq # XOR is addition in GF(2^k)
# Demonstrate: for secret symbol s=alpha (10), show that
# querying on each basis element reveals the secret via the trace
secret = 2 # alpha = 10 in binary
print(f"Secret: {secret} = {secret:02b}")
for x in range(4):
product = gf4_multiply(secret, x)
trace = gf4_trace(product)
print(f" f({x:02b}) = Tr(s * {x:02b}) = Tr({product:02b}) = {trace}")
This generalization is relevant to quantum linear algebra algorithms and quantum error correction codes defined over larger fields.
Relationship to Quantum Machine Learning
The BV algorithm’s structure (prepare a superposition, apply an oracle, interfere to extract information) appears throughout quantum machine learning. The phase kickback mechanism is directly related to how quantum kernels compute inner products.
BV as a Quantum Kernel
In classical machine learning, a kernel function K(x, y) measures similarity between data points. In quantum kernel methods, you encode classical data into quantum states and measure their overlap.
The BV oracle for secret s creates the mapping:
x -> (-1)^{s·x}
This is a binary feature map: it maps each n-bit string x to a sign based on its inner product with s. The quantum state after the oracle, (1/sqrt(2^n)) Σ_x (-1)^{s·x} |x>, is a feature state that encodes the relationship between all inputs x and the hidden parameter s.
A Minimal Quantum Kernel Example
Here is a minimal example showing how a BV-style circuit computes a quantum kernel value:
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector
import numpy as np
def bv_feature_map(data: str, n: int) -> QuantumCircuit:
"""
Create a BV-style feature map circuit.
Maps classical data string to a quantum state using phase kickback.
"""
qc = QuantumCircuit(n)
# Hadamard layer
qc.h(range(n))
# Phase encoding based on data
for i, bit in enumerate(reversed(data)):
if bit == '1':
qc.z(i) # Z gate applies (-1) phase to |1> component
# Second Hadamard layer
qc.h(range(n))
return qc
def quantum_kernel(x: str, y: str) -> float:
"""
Compute quantum kernel value |<phi(x)|phi(y)>|^2
using BV-style feature maps.
"""
n = len(x)
assert len(y) == n
# Get statevectors for both feature maps
sv_x = Statevector.from_instruction(bv_feature_map(x, n))
sv_y = Statevector.from_instruction(bv_feature_map(y, n))
# Kernel value is the squared overlap
overlap = np.abs(sv_x.inner(sv_y)) ** 2
return overlap
# Compute kernel matrix for a small dataset
data_points = ['000', '001', '010', '011', '100', '101', '110', '111']
n = 3
print("Quantum kernel matrix (BV-style feature map):")
print(f"{'':>4s}", end="")
for y in data_points:
print(f" {y:>5s}", end="")
print()
for x in data_points:
print(f"{x:>4s}", end="")
for y in data_points:
k = quantum_kernel(x, y)
print(f" {k:>5.2f}", end="")
print()
The kernel matrix reveals which data points the BV feature map considers “similar.” Points that differ in more bit positions have smaller kernel values. This is the same inner-product structure that BV exploits, now repurposed for classification.
In practice, quantum kernel methods for real machine learning tasks use more expressive feature maps than the simple BV-style map shown here. But the core mechanism (Hadamard, phase encoding, Hadamard) is the same, and understanding it through BV provides the conceptual foundation.
Common Mistakes
If you are implementing BV for the first time, watch out for these pitfalls. Each one has tripped up many people, including experienced quantum programmers.
1. Reversing the Secret String (LSB vs MSB)
Qiskit uses little-endian qubit ordering: qubit 0 is the least significant bit. When you build the oracle, you must reverse the secret string so that the first character of the string corresponds to the highest-index qubit.
# WRONG: this maps secret[0] to qubit 0 (LSB)
def bad_oracle(n, secret):
qc = QuantumCircuit(n + 1)
for i, bit in enumerate(secret): # No reversal!
if bit == '1':
qc.cx(i, n)
return qc
# RIGHT: reverse the secret so qubit 0 = LSB
def good_oracle(n, secret):
qc = QuantumCircuit(n + 1)
for i, bit in enumerate(reversed(secret)): # Reversed!
if bit == '1':
qc.cx(i, n)
return qc
# Demonstration of the bug
from qiskit_aer import AerSimulator
secret = '10110'
n = len(secret)
# With the wrong oracle
qc_bad = QuantumCircuit(n + 1, n)
qc_bad.x(n); qc_bad.h(n); qc_bad.h(range(n))
qc_bad.compose(bad_oracle(n, secret), inplace=True)
qc_bad.h(range(n)); qc_bad.measure(range(n), range(n))
# With the correct oracle
qc_good = bernstein_vazirani(secret)
backend = AerSimulator()
bad_result = max(backend.run(qc_bad, shots=100).result().get_counts(), key=lambda x: x[1])
good_result = max(backend.run(qc_good, shots=100).result().get_counts(), key=lambda x: x[1])
print(f"Secret: {secret}")
print(f"Bad oracle: {bad_result} (reversed!)")
print(f"Good oracle: {good_result} (correct)")
2. Forgetting to Prepare the Ancilla in |->
The ancilla must be in the |-> state for phase kickback to work. If you leave it in |0>, the oracle computes f(x) into the ancilla but does not kick back any phase. The input register stays in a uniform superposition, and you measure a random string.
# WRONG: ancilla stays in |0>
def bv_no_ancilla_prep(secret: str) -> QuantumCircuit:
n = len(secret)
qc = QuantumCircuit(n + 1, n)
# Missing: qc.x(n) and qc.h(n)
qc.h(range(n))
qc.barrier()
oracle = bv_oracle(n, secret)
qc.compose(oracle, inplace=True)
qc.barrier()
qc.h(range(n))
qc.measure(range(n), range(n))
return qc
# This produces random results, not the secret
backend = AerSimulator()
qc = bv_no_ancilla_prep('101')
counts = backend.run(qc, shots=1024).result().get_counts()
print("Without ancilla prep (WRONG):")
print(f" Got {len(counts)} different results instead of 1")
print(f" Top results: {sorted(counts.items(), key=lambda x: -x[1])[:4]}")
3. Measuring the Ancilla Qubit
The ancilla stays in |-> throughout the algorithm. It does not carry useful information, and measuring it is unnecessary. While measuring it does not break the algorithm (the ancilla is already factored out from the input register), it wastes a classical bit and can cause confusion when reading results.
# WRONG: measuring all qubits including ancilla
def bv_measure_all(secret: str) -> QuantumCircuit:
n = len(secret)
qc = QuantumCircuit(n + 1, n + 1) # n+1 classical bits
qc.x(n); qc.h(n); qc.h(range(n))
qc.barrier()
oracle = bv_oracle(n, secret)
qc.compose(oracle, inplace=True)
qc.barrier()
qc.h(range(n))
qc.measure(range(n + 1), range(n + 1)) # Measures ancilla too
return qc
# The result strings now have n+1 bits, and you need to strip the ancilla bit
# This is confusing and error-prone. Just measure the n input qubits.
4. Confusing Dot Product mod 2 with Regular Dot Product
The BV dot product is computed mod 2. This is XOR of the bitwise AND, not the integer dot product. For example:
s = '111', x = '111'
Integer dot product: 1*1 + 1*1 + 1*1 = 3
Bitwise dot product mod 2: (1*1) XOR (1*1) XOR (1*1) = 1 XOR 1 XOR 1 = 1
If you implement the classical oracle using regular addition instead of XOR, you get wrong results for any pair where the dot product exceeds 1:
# WRONG
def bad_classical_oracle(secret, x):
return sum(int(s) * int(xi) for s, xi in zip(secret, bin(x)[2:].zfill(len(secret))))
# RIGHT
def good_classical_oracle(secret, x):
result = 0
for s, xi in zip(secret, bin(x)[2:].zfill(len(secret))):
result ^= int(s) * int(xi) # XOR, not addition
return result
# Show the difference
secret = '111'
x = 0b111
print(f"Bad oracle (sum): f({x:03b}) = {bad_classical_oracle(secret, x)}")
print(f"Good oracle (XOR): f({x:03b}) = {good_classical_oracle(secret, x)}")
5. Running Only One Shot on Hardware
On a noiseless simulator, one shot always gives the correct answer. On real hardware, one shot might give the wrong answer due to noise. Always run at least 1000 shots and take the most frequent result.
# WRONG on real hardware:
# counts = backend.run(qc, shots=1).result().get_counts()
# This gives you one random sample, which might be wrong.
# RIGHT on real hardware:
# counts = backend.run(qc, shots=4096).result().get_counts()
# most_likely = max(counts, key=counts.get)
6. Building the Oracle for the Wrong Qubit Indices
When composing the oracle into the full circuit, make sure the CNOT control qubits line up with the correct data qubits. If your circuit has qubits [q0, q1, q2, q3, ancilla] and you build the oracle on a separate QuantumCircuit, the compose method maps by qubit index. Double-check that qubit 0 in the oracle maps to qubit 0 in the full circuit, not to the ancilla.
# The bv_oracle function above handles this correctly:
# it builds a circuit on n+1 qubits where qubits 0..n-1 are data
# and qubit n is the ancilla. When composed into the full circuit
# (which also has qubits 0..n-1 as data and qubit n as ancilla),
# the mapping is correct.
# Problems arise if you reorder qubits or use non-standard layouts.
Limitations
Bernstein-Vazirani demonstrates a clear quantum advantage, but with caveats:
The problem is contrived. The function f(x) = s · x mod 2 was designed to make the quantum algorithm shine. Real-world hidden-string problems are not structured this way.
The speedup is linear, not exponential in a practical sense. Going from n queries to 1 query is an n-fold improvement. This is less dramatic than Shor’s exponential speedup, and the overhead of maintaining a quantum computer is currently much larger than n queries for any practical n.
The oracle is a black box. In practice, you would not use BV to find a hidden string, because constructing the oracle requires knowing the string first. The algorithm’s value is pedagogical: it isolates and demonstrates phase kickback in its simplest form.
Connection to Other Algorithms
Bernstein-Vazirani is a direct precursor to:
- Simon’s algorithm: replaces the linear hidden structure with a hidden XOR subgroup, achieving exponential speedup over the best possible classical algorithm
- Grover’s algorithm: uses amplitude amplification instead of exact phase cancellation; quadratic rather than linear speedup
- Quantum Fourier Transform: the H^n transform in BV is the simplest instance of the QFT acting on the uniform superposition
Learning Bernstein-Vazirani builds the intuition needed to understand all three.
Was this tutorial helpful?