
Quantum Kernel

A kernel function computed by a quantum circuit that maps classical data to a quantum feature space, used to power kernel-based machine learning algorithms such as support vector machines.

Kernel methods are a classical machine learning workhorse. Support vector machines, Gaussian processes, and kernel PCA all rely on a kernel function k(x, x') that measures similarity between data points in some feature space without ever explicitly constructing that space. The idea behind quantum kernels is to use a quantum circuit as a feature map, encoding classical data into a quantum state, and then estimating the inner product between two such states as the kernel value. If the quantum feature space is hard to simulate classically, a quantum kernel might capture structure in data that classical kernels cannot.

This approach is attractive because it separates the question of what quantum circuits do from the question of how to train them. The training stays entirely classical (SVM optimization); the quantum device just evaluates the kernel.

The details

Given classical data x, a quantum feature map ϕ prepares the state |ϕ(x)⟩ = U(x)|0⟩ using a parameterized circuit U(x). The quantum kernel is defined as:

k(x, x') = |⟨ϕ(x')|ϕ(x)⟩|²

This is estimated on hardware by running the circuit U†(x')U(x) on |0⟩ and measuring the probability of getting back |0⟩. Repeating this for every pair of training and test points produces the kernel matrix K_ij = k(x_i, x_j), which is then passed to a classical SVM solver.

A popular choice for U(x) is an IQP (Instantaneous Quantum Polynomial) circuit, in which data-encoding Z rotations alternate with fixed entangling layers. Havlíček et al. (2019, Nature) proposed this construction and gave a complexity-theoretic argument that the resulting kernel is hard to compute classically, under assumptions related to the hardness of simulating IQP circuits.

Here is a PennyLane example computing a 2 × 2 kernel matrix for two-dimensional data:

import pennylane as qml
import numpy as np

dev = qml.device("default.qubit", wires=2)

def feature_map(x):
    # Angle-encode the data twice, with an entangling CNOT in between
    qml.AngleEmbedding(x, wires=[0, 1])
    qml.CNOT(wires=[0, 1])
    qml.AngleEmbedding(x, wires=[0, 1])

@qml.qnode(dev)
def kernel_circuit(x1, x2):
    # Prepare |phi(x1)>, then apply the inverse map for x2; the overlap
    # |<phi(x2)|phi(x1)>|^2 is the probability of measuring |00>
    feature_map(x1)
    qml.adjoint(feature_map)(x2)
    return qml.probs(wires=[0, 1])

def quantum_kernel(x1, x2):
    return kernel_circuit(x1, x2)[0]  # probability of |00>

x_train = np.array([[0.5, 1.2], [0.9, 0.3]])
K = np.array([[quantum_kernel(a, b) for b in x_train] for a in x_train])
print(K)

Why it matters for learners

Quantum kernels represent one of the most theoretically grounded proposals for near-term quantum machine learning. Unlike variational approaches such as quantum neural networks, which require gradient-based training of circuit parameters on quantum hardware, kernel methods push all optimization to the classical side. This avoids the barren plateau problem that plagues the training of deep variational circuits.

The critical open question is whether any quantum kernel provides a genuine advantage on real-world data, not just on artificially constructed datasets. Current evidence suggests that for problems where quantum kernels might excel, the datasets themselves are exponentially large or must be loaded via quantum RAM, which does not yet exist.

Understanding quantum kernels requires familiarity with quantum circuits and gates, plus the classical theory of kernel SVMs. This makes it a good bridge topic between quantum computing and classical machine learning curricula.

Common misconceptions

Misconception 1: Quantum kernels are automatically better than classical kernels. There is no known natural dataset where a quantum kernel has been shown to outperform classical kernels. The constructions that provably separate quantum from classical kernels require datasets engineered around hard problems. On standard benchmarks (MNIST, CIFAR, etc.), classical kernels perform at least as well with a fraction of the computational cost.

Misconception 2: The kernel matrix is computed once and stored cheaply. Each kernel entry requires a separate quantum circuit evaluation. For nn training points, the kernel matrix requires O(n2)O(n^2) circuit runs. For datasets of meaningful size, this hardware cost is prohibitive on current devices.
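A back-of-the-envelope count makes this cost concrete; the shot count and dataset sizes below are illustrative:

```python
def circuit_executions(n_train, n_test, shots):
    """Rough count of quantum circuit runs for a kernel-based SVM.

    Exploiting symmetry and the unit diagonal, the n x n training
    matrix needs n(n-1)/2 distinct entries; prediction adds
    n_test * n_train more. Each entry is estimated from `shots` runs.
    """
    train_entries = n_train * (n_train - 1) // 2
    test_entries = n_test * n_train
    return (train_entries + test_entries) * shots

# Even a modest dataset demands hundreds of millions of runs:
# 1,000 training points, 200 test points, 1,000 shots per entry
print(circuit_executions(1000, 200, 1000))  # 699500000
```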

Misconception 3: A quantum circuit that is hard to simulate classically gives a useful quantum kernel. Hardness of simulation and usefulness for machine learning are orthogonal properties. A kernel can be classically intractable to compute yet still fail to capture any structure relevant to a learning task. Circuit hardness is a necessary but not sufficient condition for practical quantum advantage.

See also