- Algorithms
- Also: quantum feature map
- Also: state preparation
- Also: data embedding
- Also: angle encoding
- Also: amplitude encoding
Data Encoding
Data encoding (also called a quantum feature map) refers to the methods used to embed classical data into quantum states, a critical step in quantum machine learning that determines which patterns a quantum model can represent.
Quantum machine learning requires converting classical data (numbers, vectors, images) into quantum states that a circuit can process. The choice of encoding strategy directly affects circuit depth, expressibility, and the computational resources needed.
Angle Encoding
The most common approach loads each classical feature into the angle of a rotation gate.
```python
import pennylane as qml
import numpy as np

dev = qml.device("default.qubit", wires=3)

@qml.qnode(dev)
def angle_encoding(x):
    # Each feature maps to one qubit's RY rotation angle
    qml.RY(x[0], wires=0)
    qml.RY(x[1], wires=1)
    qml.RY(x[2], wires=2)
    return qml.state()

x = np.array([0.5, 1.2, 0.8])
state = angle_encoding(x)
```
A 3-feature vector requires 3 qubits. The rotations act on separate qubits and can run in parallel, so the gate count grows as O(n) for n features while the circuit depth stays constant.
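As a sanity check on what a single rotation does to one qubit, here is a pure-NumPy sketch (no PennyLane needed; `ry_matrix` is an illustrative helper, not a library function) showing that RY angle encoding yields ⟨Z⟩ = cos(x):

```python
import numpy as np

def ry_matrix(theta):
    # Single-qubit RY rotation matrix
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

x = 0.5  # one classical feature
state = ry_matrix(x) @ np.array([1.0, 0.0])  # apply to |0>
# <Z> = |amp_0|^2 - |amp_1|^2, which works out to cos(x)
expval_z = abs(state[0]) ** 2 - abs(state[1]) ** 2
print(np.isclose(expval_z, np.cos(x)))  # True
```

The measured expectation is a simple trigonometric function of the input, which hints at why a single angle-encoding layer captures only low-order features of the data.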
Amplitude Encoding
Encodes n features into the amplitudes of a quantum state, requiring only ⌈log2(n)⌉ qubits. A vector x = (x_0, x_1, …, x_{n-1}) is normalized and loaded as:
|ψ⟩ = (1/‖x‖) Σ_i x_i |i⟩
This is exponentially qubit-efficient but requires deep state preparation circuits that are expensive in practice.
```python
dev2 = qml.device("default.qubit", wires=2)

@qml.qnode(dev2)
def amplitude_encoding(x):
    # Normalize and load 4 features into the amplitudes of 2 qubits
    # (AmplitudeEmbedding can also do this via normalize=True)
    x_normalized = x / np.linalg.norm(x)
    qml.AmplitudeEmbedding(x_normalized, wires=[0, 1])
    return qml.state()
```
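To make the normalization concrete, a small NumPy example (values chosen so the norm comes out clean) of the amplitudes a 4-feature vector produces:

```python
import numpy as np

x = np.array([1.0, 2.0, 2.0, 4.0])  # ||x|| = 5, chosen for clean numbers
amps = x / np.linalg.norm(x)         # amplitudes of the 2-qubit state
probs = amps ** 2                    # probability of measuring each |i>

print(amps)  # [0.2 0.4 0.4 0.8]
# probs sums to 1 (up to float rounding), as any valid quantum state must
```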
Basis Encoding
Encodes binary strings directly into computational basis states: the integer 5 (binary 101) maps to |101⟩. This is simple and cheap, but it uses only the basis states and ignores the continuous amplitudes that the other encodings exploit.
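A minimal sketch of the mapping (`basis_encode` is an illustrative helper, not a library function):

```python
def basis_encode(value, n_qubits):
    # Each classical bit becomes one qubit fixed to |0> or |1>
    bits = format(value, f"0{n_qubits}b")
    return "|" + bits + ">"

print(basis_encode(5, 3))  # |101>
```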
IQP Encoding
Sandwiches data-dependent diagonal rotations and entangling layers between Hadamard layers, repeating the pattern to create a highly non-linear feature map:
```python
@qml.qnode(dev)
def iqp_encoding(x):
    # Initial Hadamards: without them, RZ acting on |0> would
    # only add an unobservable global phase
    for i in range(3):
        qml.Hadamard(wires=i)
        qml.RZ(x[i], wires=i)
    # Entangling layer (chain of CNOTs)
    qml.CNOT(wires=[0, 1])
    qml.CNOT(wires=[1, 2])
    # Second data layer with a non-linear feature of the input
    for i in range(3):
        qml.RZ(x[i] ** 2, wires=i)
        qml.Hadamard(wires=i)  # make the encoded phases observable in Z
    return [qml.expval(qml.PauliZ(i)) for i in range(3)]
```
This family of encodings is common in quantum kernel methods, where the encoding circuit defines the kernel implicitly; PennyLane also ships a ready-made template for it, qml.IQPEmbedding.
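To illustrate the kernel idea on the smallest possible case, a pure-NumPy sketch of a one-qubit kernel built from RY angle encoding (`feature_state` and `kernel` are hypothetical helpers): the kernel value is the squared overlap of the two encoded states.

```python
import numpy as np

def feature_state(x):
    # One-qubit RY(x) angle-encoded state: [cos(x/2), sin(x/2)]
    return np.array([np.cos(x / 2), np.sin(x / 2)])

def kernel(x, y):
    # Quantum kernel = squared overlap |<phi(x)|phi(y)>|^2
    return abs(feature_state(x) @ feature_state(y)) ** 2

# For this encoding the kernel has a closed form: cos^2((x - y) / 2)
print(np.isclose(kernel(0.3, 1.1), np.cos((0.3 - 1.1) / 2) ** 2))  # True
```

Real quantum-kernel pipelines estimate this overlap on hardware (e.g. via a swap test or by inverting the encoding circuit) and feed the resulting Gram matrix to a classical SVM.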
Practical Tradeoffs
| Method | Qubits needed | Circuit depth | Expressibility |
|---|---|---|---|
| Angle encoding | n (one per feature) | O(1) (parallel rotations) | Moderate |
| Amplitude encoding | log2(n) | O(n) state preparation | High |
| Basis encoding | n (one per bit) | O(1) | Low |
| IQP/repeated | n | O(n · layers) | High |
For near-term hardware, angle encoding is standard due to its shallow circuit depth. Amplitude encoding is theoretically powerful but the state preparation overhead often negates the qubit savings.
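A quick back-of-the-envelope comparison of the qubit counts in the table (n = 1024 is an arbitrary example size):

```python
import math

n_features = 1024  # arbitrary example size

qubits = {
    "angle": n_features,                            # one qubit per feature
    "amplitude": math.ceil(math.log2(n_features)),  # log2(n) qubits
}
print(qubits)  # {'angle': 1024, 'amplitude': 10}
```

The exponential gap is real, but it only pays off if the O(n)-deep preparation circuit fits within the hardware's coherence budget.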
Connection to Barren Plateaus
The encoding strategy also affects trainability. Encodings that create highly entangled states can push circuits into barren-plateau regimes, where gradients vanish exponentially with system size.