- Finance
Mastercard: Quantum Kernel Methods for Fraud Detection
Mastercard explored quantum kernel methods and quantum support vector machines for credit card fraud detection, testing whether quantum feature maps could find structure in transaction data that classical kernels miss.
- Key Outcome
- Quantum kernel methods matched but did not beat classical SVM with RBF kernel on tested datasets. The research identified that quantum advantage in this domain requires genuinely high-dimensional quantum-structured data. Mastercard continues research into quantum feature map design.
The Problem
Credit card fraud costs the global payments industry tens of billions of dollars per year. Mastercard processes billions of transactions and flags suspicious ones in real time, typically within milliseconds.
Current fraud detection relies on classical machine learning: gradient-boosted trees, neural networks, and ensemble methods trained on historical, labeled transactions. These work well but have limitations. Retraining is expensive when fraud patterns shift. High-order feature interactions are difficult to capture efficiently. Novel fraud patterns that differ structurally from the training set can go undetected.
The quantum question: do quantum kernel methods, which compute similarity in an exponentially large Hilbert space, find structure in transaction data that classical kernels miss?
Quantum Kernel Methods
A kernel method computes a similarity function between data points and uses those similarities to train a classifier without explicitly working in the high-dimensional feature space. The most common example is the SVM with RBF kernel.
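For reference, the classical RBF kernel is just a few lines of NumPy; the gamma value below is an arbitrary illustration:

```python
import numpy as np

def rbf_kernel(x1, x2, gamma=0.5):
    """Classical RBF similarity: k(x1, x2) = exp(-gamma * ||x1 - x2||^2).

    This is implicitly an inner product in an infinite-dimensional
    feature space, but it is never computed there explicitly.
    """
    return np.exp(-gamma * np.sum((x1 - x2) ** 2))

a = np.array([1.0, 2.0])
b = np.array([1.0, 2.0])
c = np.array([4.0, 6.0])

print(rbf_kernel(a, b))  # identical points -> similarity 1.0
print(rbf_kernel(a, c))  # distant points -> similarity near 0
```

An SVM trained on these pairwise similarities never needs coordinates in the feature space, only the Gram matrix of kernel values.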
A quantum kernel replaces the classical similarity function with one computed by a quantum circuit. Transaction features are encoded into a quantum state. The inner product between two such states gives the kernel value. Classical SVM training then uses these quantum kernel values.
```python
import pennylane as qml
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler

# Simulate with 4 features (in practice: amount, time, merchant category, etc.)
n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

def feature_map(x):
    """Encode transaction features into a quantum state."""
    # Layer 1: Hadamard + RZ rotations
    for i in range(n_qubits):
        qml.Hadamard(wires=i)
        qml.RZ(x[i], wires=i)
    # Layer 2: Entangling layer with pairwise feature interactions
    for i in range(n_qubits - 1):
        qml.CNOT(wires=[i, i + 1])
        qml.RZ(x[i] * x[i + 1], wires=i + 1)
    # Layer 3: Second rotation layer
    for i in range(n_qubits):
        qml.RZ(x[i], wires=i)

@qml.qnode(dev)
def kernel_circuit(x1, x2):
    """Compute the quantum kernel value between two data points."""
    feature_map(x1)
    qml.adjoint(feature_map)(x2)
    return qml.probs(wires=range(n_qubits))

def quantum_kernel(x1, x2):
    """Kernel value is the probability of measuring all zeros."""
    probs = kernel_circuit(x1, x2)
    return float(probs[0])  # |000...0> state probability

def build_kernel_matrix(X1, X2):
    """Build the full Gram matrix for SVM training."""
    n1, n2 = len(X1), len(X2)
    K = np.zeros((n1, n2))
    for i in range(n1):
        for j in range(n2):
            K[i, j] = quantum_kernel(X1[i], X2[j])
    return K

# Generate synthetic fraud data (balanced sample for demo)
np.random.seed(42)
n_samples = 50
X_train = np.random.randn(n_samples, n_qubits)
y_train = np.array([1] * 25 + [-1] * 25)  # 1 = fraud, -1 = legitimate

# Normalize features to fit the quantum encoding range
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train) * np.pi / 2

# Build the kernel matrix and train the SVM
K_train = build_kernel_matrix(X_train_scaled, X_train_scaled)
svm = SVC(kernel="precomputed", C=1.0)
svm.fit(K_train, y_train)

# Predict on new transactions
X_test = np.random.randn(5, n_qubits)
X_test_scaled = scaler.transform(X_test) * np.pi / 2
K_test = build_kernel_matrix(X_test_scaled, X_train_scaled)
predictions = svm.predict(K_test)
print("Fraud predictions:", predictions)
```
The Kernel Advantage Hypothesis
The theoretical motivation for quantum kernels is the “quantum feature space” argument. A d-qubit system represents state vectors in a 2^d-dimensional Hilbert space, and a quantum kernel implicitly computes inner products in that space. For d=20 qubits the space has over a million dimensions, and the quantum states quickly become intractable to simulate classically as d grows. Dimensionality alone is not the point, since the classical RBF kernel already corresponds to an infinite-dimensional feature space; the hope is that quantum feature maps yield inner products that classical methods cannot efficiently estimate.
If fraud patterns have structure in this high-dimensional space that classical kernels cannot efficiently approximate, quantum kernels could give better classification. The challenge is that nobody knows in advance whether transaction data has this quantum structure.
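One diagnostic from the broader literature (not part of the study described here) is the geometric difference between quantum and classical kernel matrices proposed by Huang et al. (2021, “Power of data in quantum machine learning”): a value near 1 means the classical kernel can already reproduce the quantum one, while a large value is a necessary (though not sufficient) condition for advantage. A minimal NumPy sketch:

```python
import numpy as np

def psd_sqrt(K):
    """Matrix square root of a symmetric PSD kernel matrix."""
    vals, vecs = np.linalg.eigh(K)
    vals = np.clip(vals, 0.0, None)  # clip tiny negative eigenvalues
    return vecs @ np.diag(np.sqrt(vals)) @ vecs.T

def geometric_difference(K_classical, K_quantum, reg=1e-8):
    """g = sqrt(|| sqrt(Kq) Kc^{-1} sqrt(Kq) ||), spectral norm.

    The small ridge term `reg` keeps the inverse well conditioned.
    """
    n = K_classical.shape[0]
    sq = psd_sqrt(K_quantum)
    inv_c = np.linalg.inv(K_classical + reg * np.eye(n))
    M = sq @ inv_c @ sq
    return float(np.sqrt(np.linalg.eigvalsh(M).max()))

# Identical kernels leave no room for advantage, so g is ~1
K = np.array([[1.0, 0.5], [0.5, 1.0]])
print(geometric_difference(K, K))
```

In practice K_classical and K_quantum would be Gram matrices computed on the same training sample, e.g. an RBF matrix versus the quantum kernel matrix built above.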
Results
Mastercard tested quantum kernels on fraud datasets with several hundred features (reduced to the number of qubits via PCA). Findings:
- On datasets up to a few hundred training samples, quantum kernels matched RBF kernel SVM accuracy
- No dataset tested showed a clear quantum advantage in classification performance
- The quantum kernel computation was dramatically slower (seconds per kernel evaluation vs. microseconds for RBF)
- Feature map design strongly influenced results, but no principled way to choose the optimal map was found
This matches results from other quantum kernel experiments in the literature. The “quantum advantage” hypothesis for kernel methods requires either that the data has a specific quantum-favorable structure, or that the quantum feature map is deliberately engineered for the problem domain.
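The classical baseline in such comparisons is straightforward to reproduce. The snippet below trains an RBF-kernel SVM on synthetic, imbalanced data; the class ratio, feature counts, and separation are illustrative only and do not reflect Mastercard's datasets:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Synthetic imbalanced data: 5% "fraud" shifted away from the bulk
n_legit, n_fraud, n_features = 950, 50, 8
X = np.vstack([
    rng.normal(0.0, 1.0, size=(n_legit, n_features)),
    rng.normal(2.0, 1.0, size=(n_fraud, n_features)),
])
y = np.array([-1] * n_legit + [1] * n_fraud)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# class_weight="balanced" compensates for the 19:1 class ratio
clf = SVC(kernel="rbf", gamma="scale", class_weight="balanced")
clf.fit(X_tr, y_tr)
acc = accuracy_score(y_te, clf.predict(X_te))
print(f"RBF-SVM accuracy: {acc:.3f}")
```

Each RBF kernel evaluation here is a single vectorized exponential, which is where the microseconds-versus-seconds gap against simulated quantum kernels comes from.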
What Would Change the Picture
Mastercard’s researchers identified two conditions that could make quantum kernels practically useful:
- Quantum-native data: If transaction data were generated from a quantum process (or had a known quantum structure), the feature map could be designed to exploit it
- Inductive bias: Better theoretical tools for choosing feature maps that encode domain knowledge about fraud patterns
The work is not a failure. It is a necessary calibration of expectations. Understanding where quantum ML does not help is as valuable as finding where it does.
Framework
Mastercard’s research used PennyLane for quantum circuit construction and kernel computation, integrated with scikit-learn’s SVC for classical SVM training. All experiments ran on classical simulators.
Learn more: PennyLane Reference