Hello World with TensorFlow Quantum

TensorFlow Quantum (TFQ) is Google’s framework for hybrid quantum-classical machine learning. Cirq builds the quantum circuits; TensorFlow/Keras handles training and optimization. A parameterized quantum circuit becomes a Keras layer. Gradients flow through it via the parameter-shift rule, and you train the whole model with model.fit() like any other neural network.

What TFQ Is and Where It Fits

TFQ is built for hybrid models: circuits whose parameters are trained by gradient descent to minimize a loss function. The quantum layer outputs expectation values (real numbers) that feed into classical layers or directly into a loss.

If your team already uses TensorFlow and Keras, TFQ integrates naturally. If you use PyTorch or want to target multiple hardware backends, PennyLane is a better fit. See Comparing TFQ and PennyLane.

TFQ Architecture: Core Components

TFQ provides several layer types and utilities. Understanding when to use each one saves you from debugging mysterious shape errors and silent failures.

tfq.layers.PQC (Parameterized Quantum Circuit) is the main workhorse. It accepts a Cirq circuit with sympy.Symbol parameters and one or more observables. During the forward pass, TFQ resolves the symbols with trainable TensorFlow variables and returns expectation values. Use PQC when you want an end-to-end trainable quantum layer with automatic gradient computation.

tfq.layers.Expectation computes expectation values but gives you more control than PQC. You pass circuits, symbol names, symbol values, and operators as separate inputs. This is useful when you need to vary the circuit structure across batches or when the parameter values come from a preceding classical layer rather than from the layer’s own weights.

tfq.layers.Sample returns raw bitstring samples instead of expectation values. Each circuit evaluation produces n_shots binary strings. Use Sample for combinatorial optimization problems (like QAOA) where you need discrete solutions, not continuous gradients.

tfq.convert_to_tensor([circuit]) serializes Cirq circuits into TensorFlow string tensors. This is how quantum circuits enter the TF computational graph. The serialized form contains the circuit structure but not the parameter values; those are injected at runtime by the layer.

tfq.differentiators provides gradient computation methods. The default is ParameterShift, which computes exact analytic gradients. Other options trade accuracy for speed. We cover these in detail in the differentiator comparison section below.

Installation and Version Compatibility

TFQ requires a specific TensorFlow version, and the pairing is strict. As of mid-2026, the official install guide pins TensorFlow 2.19.1 with the tf-keras compatibility package, and you must set TF_USE_LEGACY_KERAS=1 before importing so TensorFlow uses legacy Keras instead of Keras 3. Mismatched versions are the most common install error.

pip install tensorflow==2.19.1 tf-keras==2.19.0
pip install -U tensorflow-quantum
export TF_USE_LEGACY_KERAS=1

Older tutorials show tensorflow==2.11.0 with TFQ 0.7.3; that pairing is outdated. Check tensorflow.org/quantum/install for the currently supported versions.

This tutorial assumes you can read a basic Cirq circuit. If Cirq is new to you, start with the Cirq reference first.

Step 1: Create a Parameterized Cirq Circuit

Gates in Cirq accept sympy.Symbol objects as parameters. These become the trainable weights of your quantum layer.

import cirq
import sympy

qubit = cirq.GridQubit(0, 0)
theta = sympy.Symbol('theta')

circuit = cirq.Circuit(cirq.ry(theta)(qubit))
print(circuit)
# (0, 0): ---Ry(theta)---

How TFQ Resolves Sympy Symbols

When you pass a parameterized circuit to a TFQ layer, something specific happens under the hood. TFQ extracts all sympy.Symbol objects from the circuit and creates a mapping from symbol names to TensorFlow trainable variables. During each forward pass, TFQ calls Cirq’s simulator with a parameter resolver that substitutes the current variable values for the symbolic placeholders.

You can inspect the symbols TFQ finds in a circuit:

import tensorflow_quantum as tfq

# Extract all symbol names from a circuit
symbol_names = sorted(
    tfq.util.get_circuit_symbols(circuit),
    key=lambda x: x.name
)
print(symbol_names)  # [theta]

# Internally, TFQ does something equivalent to:
# cirq.Simulator().simulate(circuit, param_resolver=cirq.ParamResolver({'theta': 0.5}))
# but with TF tensors instead of floats

This symbol resolution mechanism means you define circuits symbolically with sympy and let TFQ handle the numeric substitution. You never need to manually resolve parameters during training.

Step 2: Convert to a TFQ Tensor

TFQ serializes circuits into string tensors so TensorFlow can treat them as data.

import tensorflow_quantum as tfq

circuit_tensor = tfq.convert_to_tensor([circuit])
print(circuit_tensor.shape)  # (1,) - a batch of one circuit

An important subtlety: tfq.convert_to_tensor serializes the circuit structure without baking in parameter values. The serialized tensor contains the gate types, qubit placements, and the names of the sympy symbols, but no numeric values. The PQC layer injects its trainable weights at runtime. This means you can serialize a circuit once and reuse the tensor across many forward passes with different parameter values.

Step 3: Wrap in a PQC Layer and Build the Model

tfq.layers.PQC takes a circuit and an observable, returning expectation values. Here the observable is Pauli-Z.

# Requires: tensorflow_quantum
import tensorflow as tf

readout = cirq.Z(qubit)  # measure expectation of Pauli-Z

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(), dtype=tf.string),
    tfq.layers.PQC(circuit, readout),
])

# Forward pass before training
output = model(circuit_tensor)
print(output.numpy())  # e.g. [[1.0]] - <Z> for |0> is 1.0
# After training theta to π, state becomes |1> and <Z> approaches -1.0

Step 4: Train the Model

The input to a TFQ model is a tensor of circuits; the target is the expectation value you want.

# Requires: tensorflow_quantum
import numpy as np

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.2),
    loss=tf.keras.losses.MeanSquaredError(),
)

x_train = tfq.convert_to_tensor([circuit] * 50)
y_train = np.full((50, 1), -1.0, dtype=np.float32)  # target <Z> = -1.0

history = model.fit(x_train, y_train, epochs=40, verbose=0)
print(f"Final loss: {history.history['loss'][-1]:.4f}")
# Loss approaches 0 as theta trains to π

trained_theta = model.layers[0].get_weights()[0]
print(f"Learned theta: {trained_theta[0]:.4f}")  # near 3.14

Multiple Qubits and Observables

Real quantum ML models use more than one qubit. When you specify multiple observables, the PQC layer returns one expectation value per observable, giving you a multi-dimensional output from a single quantum circuit.

# Requires: tensorflow_quantum
import tensorflow as tf, tensorflow_quantum as tfq
import cirq, sympy, numpy as np

q0, q1, q2 = [cirq.GridQubit(0, i) for i in range(3)]
theta, phi = sympy.symbols('theta phi')

circuit = cirq.Circuit([
    cirq.ry(theta)(q0),
    cirq.CNOT(q0, q1),
    cirq.rz(phi)(q1),
    cirq.CNOT(q1, q2),
])

# Three observables measured on the same output state
observables = [cirq.Z(q0), cirq.Z(q1), cirq.Z(q2)]

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(), dtype=tf.string),
    tfq.layers.PQC(circuit, observables),
])

circuit_tensor = tfq.convert_to_tensor([circuit])
output = model(circuit_tensor)
print(output.shape)   # (1, 3) -- one expectation value per observable
print(output.numpy()) # e.g. [[0.98, 0.95, 0.98]] before training

Each observable is evaluated independently on the same final quantum state. The circuit runs once, and TFQ computes <psi|Z_0|psi>, <psi|Z_1|psi>, and <psi|Z_2|psi> from the result. This multi-observable pattern is essential for building classifiers with more than two classes.

Data Encoding Strategies

A core challenge in quantum ML is encoding classical data into quantum states. The encoding method determines how many qubits you need, how deep the circuit is, and what functions the model can learn. There are three main approaches.

Angle Encoding

Angle encoding maps each classical feature to a rotation angle on a dedicated qubit. One feature per qubit, applied as a single-qubit rotation gate.

# Requires: tensorflow_quantum
import cirq, numpy as np

qubits = [cirq.GridQubit(0, i) for i in range(3)]

def angle_encode(features):
    """Encode 3 features as Ry rotations on 3 qubits."""
    return cirq.Circuit([
        cirq.ry(float(features[i]))(qubits[i])
        for i in range(3)
    ])

# Example: encode a 3-feature sample
sample = np.array([0.5, 1.2, 2.8])
encoded = angle_encode(sample)
print(encoded)

Angle encoding is simple and shallow (depth 1 for the encoding portion), but it requires one qubit per feature. For a dataset with 100 features, you need 100 qubits. Use angle encoding for small feature spaces or when qubit count is not the bottleneck.

Important: features should be scaled to the range [0, pi] before encoding. Rotation gates are periodic with period 2*pi, so large feature values wrap around and destroy the encoding. A feature value of 7.0 produces the same rotation as 7.0 - 2*pi = 0.72, which maps two very different inputs to similar quantum states.

Amplitude Encoding

Amplitude encoding stores n features in the amplitudes of log2(n) qubits. A normalized feature vector [a_0, a_1, ..., a_{n-1}] becomes the quantum state a_0|0...0> + a_1|0...1> + ... + a_{n-1}|1...1>. This is exponentially compact: 256 features fit in 8 qubits.

The catch is that preparing an arbitrary amplitude state requires a circuit of depth O(2^n), which negates the qubit savings. Amplitude encoding is theoretically elegant but impractical for most near-term applications unless your data has special structure that allows efficient state preparation.

Basis Encoding

Basis encoding maps binary features directly to qubit states: feature 0 maps to |0>, feature 1 maps to |1>. Apply an X gate for each feature that equals 1.

# Requires: cirq
import cirq

qubits = [cirq.GridQubit(0, i) for i in range(4)]

def basis_encode(binary_features):
    """Encode binary features as computational basis states."""
    ops = []
    for i, bit in enumerate(binary_features):
        if bit == 1:
            ops.append(cirq.X(qubits[i]))
    return cirq.Circuit(ops)

encoded = basis_encode([1, 0, 1, 1])
print(encoded)
# Produces |1011>

Basis encoding is the simplest approach but only works for binary data. It requires one qubit per binary feature and produces no superposition on its own, so the trainable circuit layers must create all the quantum advantage.

How Gradients Flow Through a Quantum Circuit

TFQ uses the parameter-shift rule: evaluate the circuit at theta + pi/2 and theta - pi/2, take the difference divided by 2. This gives the exact gradient without numerical approximation. You do not need to implement it yourself; TFQ handles it during model.fit. To specify it explicitly:

# Requires: tensorflow_quantum
differentiator = tfq.differentiators.ParameterShift()
pqc_layer = tfq.layers.PQC(circuit, readout, differentiator=differentiator)

Differentiator Comparison

TFQ provides several gradient computation methods. Each trades off accuracy, speed, and hardware compatibility differently.

ParameterShift computes exact analytic gradients. For each parameter, it evaluates the circuit twice (once at theta + pi/2, once at theta - pi/2). This means a circuit with p parameters requires 2p circuit evaluations per gradient step. Use this for small circuits (under ~20 parameters) or whenever you need guaranteed gradient accuracy. It works on both simulators and real hardware.

Adjoint uses the adjoint (reverse-mode) differentiation method on the simulator. It is significantly faster than parameter-shift for simulation because it avoids the 2p overhead, instead computing all gradients in a single backward pass through the statevector. However, it only works on noiseless simulators, not on real quantum hardware or noisy simulators.

LinearCombination evaluates the circuit at parameter values perturbed around the forward-pass values and linearly combines the results. It is the general pattern that the parameter-shift and finite-difference methods are built on, and you can configure custom weights and perturbations. (Older TFQ releases also shipped a stochastic SGDifferentiator, but it was removed from the API.)

ForwardDifference computes numerical finite-difference gradients. It is the least accurate option and not recommended for production training. Use it only for debugging or sanity-checking other differentiator implementations.

# Requires: tensorflow_quantum
import tensorflow_quantum as tfq

# Using Adjoint for fast simulation
adjoint_diff = tfq.differentiators.Adjoint()
fast_pqc = tfq.layers.PQC(circuit, readout, differentiator=adjoint_diff)

# Using ParameterShift for hardware-compatible exact gradients
ps_diff = tfq.differentiators.ParameterShift()
hw_pqc = tfq.layers.PQC(circuit, readout, differentiator=ps_diff)

Shot budget consideration: on real hardware, each circuit evaluation costs a number of shots (repeated measurements). The parameter-shift rule doubles your shot budget relative to inference because each gradient requires two evaluations per parameter. For a 10-parameter circuit with 1000 shots per evaluation, one gradient step costs 20,000 shots.

Sampling vs Expectation Values

TFQ provides two fundamentally different output modes, and choosing the wrong one is a common source of confusion.

Expectation values (from PQC or Expectation layers) return the quantity <psi|O|psi> for each observable O. This is a continuous float between -1 and +1 for Pauli observables. Expectation values are differentiable, which means gradients flow through them and you can train the circuit with standard backpropagation.

Samples (from the Sample layer) return raw bitstrings from measuring the quantum state. Each circuit evaluation produces n_shots binary strings. Sampling is inherently stochastic and discrete.

# Requires: tensorflow_quantum
import tensorflow as tf, tensorflow_quantum as tfq
import cirq, sympy

qubit = cirq.GridQubit(0, 0)
theta = sympy.Symbol('theta')
circuit = cirq.Circuit(cirq.ry(theta)(qubit))

circuit_tensor = tfq.convert_to_tensor([circuit])

# Expectation: returns continuous float
exp_layer = tfq.layers.Expectation()
exp_output = exp_layer(
    circuit_tensor,
    symbol_names=[theta],
    symbol_values=[[1.57]],  # theta = pi/2
    operators=tfq.convert_to_tensor([[cirq.Z(qubit)]])
)
print(f"Expectation: {exp_output.numpy()}")  # near 0.0 for theta=pi/2

# Sample: returns bitstrings
sample_layer = tfq.layers.Sample(backend='noiseless')
sample_output = sample_layer(
    circuit_tensor,
    symbol_names=[theta],
    symbol_values=[[1.57]],
    repetitions=10
)
print(f"Samples shape: {sample_output.shape}")  # (1, 10, 1) - 10 shots, 1 qubit
print(f"Samples: {sample_output.numpy()}")      # mix of 0s and 1s

Key distinction: you cannot backpropagate through a Sample layer. Sampling produces discrete outputs, and discrete functions have zero gradient almost everywhere. If you need gradients for training, use expectation values. Use sampling only when you need the actual measurement outcomes, such as in QAOA where you want to read out candidate solutions.

A Minimal Quantum Classifier

Two-qubit angle encoding with a trainable PQC layer.

# Requires: tensorflow_quantum
import tensorflow as tf, tensorflow_quantum as tfq
import cirq, sympy, numpy as np

q0, q1 = cirq.GridQubit(0, 0), cirq.GridQubit(0, 1)
w0, w1 = sympy.Symbol('w0'), sympy.Symbol('w1')

def encode(x0, x1):
    return cirq.Circuit([
        cirq.rx(float(x0))(q0), cirq.rx(float(x1))(q1),  # angle encode features
        cirq.ry(w0)(q0), cirq.ry(w1)(q1),                  # trainable rotations
        cirq.CNOT(q0, q1),                                  # entangle
    ])

np.random.seed(42)
x_all = np.vstack([np.random.normal(0.5, 0.2, (20, 2)),
                   np.random.normal(2.0, 0.2, (20, 2))])
y_all = np.array([-1.0]*20 + [1.0]*20, dtype=np.float32).reshape(-1, 1)

x_tensor = tfq.convert_to_tensor([encode(x[0], x[1]) for x in x_all])

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(), dtype=tf.string),
    tfq.layers.PQC(encode(0, 0), cirq.Z(q1)),
])
model.compile(optimizer=tf.keras.optimizers.Adam(0.3),
              loss='mse', metrics=['mae'])
history = model.fit(x_tensor, y_all, epochs=30, verbose=0)
print(f"Final MAE: {history.history['mae'][-1]:.4f}")

Expanded Classifier: 4-Class Problem with Softmax

The minimal classifier above handles binary classification. For multi-class problems, you combine multiple quantum observables with a classical softmax layer. Here we classify 4 classes using 2 qubits and 2 observables, giving a 2D output that feeds into a 4-class softmax.

# Requires: tensorflow_quantum
import tensorflow as tf, tensorflow_quantum as tfq
import cirq, sympy, numpy as np

q0, q1 = cirq.GridQubit(0, 0), cirq.GridQubit(0, 1)

# Trainable parameters: 3 rotation angles per layer, 2 layers
params = sympy.symbols('p0:6')

def make_model_circuit():
    """Two-layer ansatz with entanglement."""
    return cirq.Circuit([
        # Layer 1
        cirq.ry(params[0])(q0), cirq.ry(params[1])(q1),
        cirq.CNOT(q0, q1),
        cirq.rz(params[2])(q1),
        # Layer 2
        cirq.ry(params[3])(q0), cirq.ry(params[4])(q1),
        cirq.CNOT(q1, q0),
        cirq.rz(params[5])(q0),
    ])

def encode_features(x0, x1):
    """Angle-encode two features."""
    return cirq.Circuit([
        cirq.rx(float(x0))(q0),
        cirq.rx(float(x1))(q1),
    ])

# Two observables: ZI and IZ give a 2D output
observables = [cirq.Z(q0), cirq.Z(q1)]

# Generate synthetic 4-class data
np.random.seed(42)
centers = [(0.5, 0.5), (0.5, 2.5), (2.5, 0.5), (2.5, 2.5)]
x_data, y_data = [], []
for label, (cx, cy) in enumerate(centers):
    points = np.random.normal(loc=[cx, cy], scale=0.3, size=(25, 2))
    x_data.append(points)
    y_data.extend([label] * 25)

x_data = np.vstack(x_data)
y_data = np.array(y_data, dtype=np.int32)

# Scale features to [0, pi] for angle encoding
x_min, x_max = x_data.min(), x_data.max()
x_scaled = (x_data - x_min) / (x_max - x_min) * np.pi

# Convert to circuit tensors
x_circuits = [encode_features(x[0], x[1]) for x in x_scaled]
x_tensor = tfq.convert_to_tensor(x_circuits)

# One-hot encode labels for softmax
y_onehot = tf.keras.utils.to_categorical(y_data, num_classes=4)

# Build model: quantum layer outputs 2 values, dense layer maps to 4 classes
model_circuit = make_model_circuit()
quantum_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(), dtype=tf.string),
    tfq.layers.PQC(model_circuit, observables),
    tf.keras.layers.Dense(4, activation='softmax'),
])

quantum_model.compile(
    optimizer=tf.keras.optimizers.Adam(0.05),
    loss='categorical_crossentropy',
    metrics=['accuracy'],
)

# Train with validation split
history = quantum_model.fit(
    x_tensor, y_onehot,
    epochs=60, verbose=0,
    validation_split=0.2,
)
print(f"Training accuracy: {history.history['accuracy'][-1]:.3f}")
print(f"Validation accuracy: {history.history['val_accuracy'][-1]:.3f}")

This pattern (quantum circuit producing multi-dimensional output, followed by a classical softmax layer) is the standard approach for multi-class quantum classification. The quantum circuit acts as a trainable feature extractor, and the classical layer handles the final classification.

The Barren Plateau Problem

The barren plateau problem is the most significant practical limitation of variational quantum ML. As circuit depth and qubit count grow, the gradients of the loss function vanish exponentially. The cost landscape becomes exponentially flat, and gradient-based optimizers cannot find the minimum.

The mechanism is straightforward: in a randomly initialized deep quantum circuit, the output state is close to a random state in the Hilbert space. The expectation value of any local observable (like Z on a single qubit) concentrates around zero with variance that shrinks exponentially as 2^(-n) where n is the number of qubits. This means gradients also shrink exponentially, making training impossible for large circuits.

You can observe this effect directly by measuring gradient magnitudes as you increase the qubit count:

# Requires: tensorflow_quantum
import tensorflow as tf, tensorflow_quantum as tfq
import cirq, sympy, numpy as np

def measure_gradient_magnitude(n_qubits, n_layers=3):
    """Compute gradient magnitude for a random circuit of given size."""
    qubits = [cirq.GridQubit(0, i) for i in range(n_qubits)]
    symbols = []
    ops = []

    for layer in range(n_layers):
        for i, q in enumerate(qubits):
            s = sympy.Symbol(f'p_{layer}_{i}')
            symbols.append(s)
            ops.append(cirq.ry(s)(q))
        for i in range(n_qubits - 1):
            ops.append(cirq.CNOT(qubits[i], qubits[i + 1]))

    circuit = cirq.Circuit(ops)
    observable = cirq.Z(qubits[0])

    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(), dtype=tf.string),
        tfq.layers.PQC(circuit, observable),
    ])

    x = tfq.convert_to_tensor([circuit])
    y = np.array([[-1.0]], dtype=np.float32)

    with tf.GradientTape() as tape:
        pred = model(x)
        loss = tf.reduce_mean((pred - y) ** 2)
    grads = tape.gradient(loss, model.trainable_variables)
    grad_magnitude = tf.reduce_mean(tf.abs(grads[0])).numpy()
    return grad_magnitude

# Compare gradient magnitudes across qubit counts
for n in [2, 4, 6, 8]:
    mag = measure_gradient_magnitude(n)
    print(f"{n} qubits: avg gradient magnitude = {mag:.6f}")
# Gradient magnitude decreases roughly exponentially with qubit count

Current mitigations:

Layerwise training: start with a shallow circuit (one layer) and train it to convergence. Then add another layer and train again. Each stage begins with meaningful gradients because only the new layer is randomly initialized.
Structured ansatze: instead of random circuit architectures, use circuits with known inductive biases that match your problem structure. Hardware-efficient ansatze with limited entanglement tend to have milder barren plateaus.
Local observables: choose observables that act on qubits “close” to the circuit’s output. Measuring a global observable (one that acts on all qubits) makes the barren plateau worse.
Parameter initialization: initialize parameters near zero rather than uniformly at random. Small initial rotations keep the state close to the computational basis, where gradients are larger.

TFQ vs PennyLane

Both TFQ and PennyLane are frameworks for quantum ML, but they serve different communities and use cases.

Feature	TFQ	PennyLane
Integration	TensorFlow/Keras native	PyTorch, TensorFlow, JAX, NumPy
Circuit library	Cirq only	Qiskit, Braket, Cirq, native PennyLane
Hardware access	Google processors	IBM, AWS Braket, IonQ, Rigetti
Gradient methods	ParameterShift, Adjoint, LinearCombination	ParameterShift, SPSA, adjoint, backprop
Barren plateau tools	Limited	Rotosolve, quantum natural gradient
Best for	TF-native teams, Google hardware	Multi-backend research, QML research

Choose TFQ if your team already works in the TensorFlow ecosystem and you want the tightest possible integration with Keras APIs. The Sequential model pattern feels natural if you are already building classical TF models.

Choose PennyLane if you need backend flexibility (running the same code on IBM, IonQ, or Rigetti hardware), want access to more advanced optimization techniques like quantum natural gradient, or need JAX/PyTorch integration.

Saving and Loading TFQ Models

TFQ models contain custom layers that require special handling during save and load. The standard Keras model.save() works, but you must pass the custom objects when loading.

# Requires: tensorflow_quantum
import tensorflow as tf, tensorflow_quantum as tfq

# After training, save the model
model.save("qml_model")

# Load it back with custom objects
loaded_model = tf.keras.models.load_model(
    "qml_model",
    custom_objects={"PQC": tfq.layers.PQC}
)

# Verify the loaded model produces the same output
original_output = model(circuit_tensor)
loaded_output = loaded_model(circuit_tensor)
print(f"Outputs match: {np.allclose(original_output, loaded_output)}")

The circuit tensor itself is serialized as part of the layer configuration, so you do not need to save the Cirq circuit separately. The trained parameter values are stored in the standard Keras weight format.

If your model uses Expectation or Sample layers, include those in the custom objects dictionary as well:

loaded_model = tf.keras.models.load_model(
    "qml_model",
    custom_objects={
        "PQC": tfq.layers.PQC,
        "Expectation": tfq.layers.Expectation,
        "Sample": tfq.layers.Sample,
    }
)

Common Mistakes

These are the errors that trip up most TFQ beginners. Each one is easy to fix once you know about it, but confusing to debug if you do not.

Version mismatch between TF and TFQ

Each TFQ release requires a specific TensorFlow version (the current release pins TensorFlow 2.19.1 with tf-keras). Installing TFQ against a different TensorFlow causes import errors or, worse, silent incorrect results where the model appears to train but produces garbage outputs. Always pin the version pair listed on the TFQ install page:

pip install tensorflow==2.19.1 tf-keras==2.19.0 tensorflow-quantum

Remember to set TF_USE_LEGACY_KERAS=1 in your environment before importing.

Using `cirq.measure()` in PQC circuits

PQC layers compute expectation values analytically. If you add a cirq.measure() gate to a circuit passed to PQC, TFQ raises an error because measurements collapse the state and are incompatible with expectation value computation. Only use cirq.measure() in circuits intended for the Sample layer.

# WRONG: this causes an error with PQC
bad_circuit = cirq.Circuit([
    cirq.ry(theta)(qubit),
    cirq.measure(qubit, key='result'),  # do not do this with PQC
])

# CORRECT: let PQC handle measurement via the observable
good_circuit = cirq.Circuit([cirq.ry(theta)(qubit)])
pqc = tfq.layers.PQC(good_circuit, cirq.Z(qubit))  # Z is the observable

Confusing circuit serialization with parameter resolution

tfq.convert_to_tensor serializes the circuit structure, not the parameter values. A common mistake is assuming that converting a circuit with specific parameter values “bakes in” those values. It does not. The PQC layer always injects its own trainable weights at runtime, ignoring any numeric values you may have resolved before serialization.

# These two produce IDENTICAL tensors:
tensor_a = tfq.convert_to_tensor([cirq.Circuit(cirq.ry(theta)(qubit))])
resolved = cirq.resolve_parameters(cirq.Circuit(cirq.ry(theta)(qubit)),
                                    cirq.ParamResolver({'theta': 1.5}))
tensor_b = tfq.convert_to_tensor([resolved])
# tensor_a and tensor_b are the same; the 1.5 is not stored

Not scaling features for angle encoding

Rotation gates have period 2*pi. If your input features have values like 50.0 or 1000.0, the rotation wraps around many times and the encoding becomes essentially random. Always scale your features to a meaningful range before encoding:

import numpy as np

# Scale features to [0, pi]
x_min, x_max = x_data.min(axis=0), x_data.max(axis=0)
x_scaled = (x_data - x_min) / (x_max - x_min) * np.pi

The range [0, pi] is standard because it maps to a half-rotation on the Bloch sphere, giving the maximum dynamic range for the encoding. Using [0, 2*pi] also works but is redundant since ry(0) and ry(2*pi) produce the same state.

Expecting quantum advantage on classical data

TFQ is a research tool. Quantum ML models with a few qubits have roughly the same expressive power as small classical neural networks with a comparable number of parameters. A 5-qubit, 3-layer quantum circuit has about 15 trainable parameters, and a 15-parameter classical network is not impressive. Do not expect a quantum model to outperform a classical one on tabular data, image classification, or standard ML benchmarks. The credible use cases for quantum advantage involve processing quantum data (states produced by quantum systems) or exploiting specific problem structure that maps naturally onto quantum operations.

When TFQ Beats Classical

TFQ is a research tool. Evidence for quantum ML advantage is limited. The credible cases involve quantum data originating on a quantum system, where a quantum circuit can process it without exponential classical simulation cost. For classical data, TFQ rarely outperforms classical networks of comparable parameter count.

Next Steps

Full API reference: TensorFlow Quantum Reference. For framework-agnostic QML, see PennyLane for Quantum ML. Official TFQ tutorials at tensorflow.org/quantum cover quantum convolutional networks and variational quantum eigensolvers.