Johnson and Johnson Quantum Computing for Antibiotic Resistance Drug Discovery
Janssen Pharmaceuticals (J&J) applied quantum kernel methods and VQE simulation to accelerate antibiotic discovery against drug-resistant MRSA and CRE, addressing the sparse training data problem that limits classical ML approaches to antimicrobial drug design.
- Key Outcome
- Quantum kernel SVM identified 4 novel beta-lactam variants with predicted MRSA activity; 2 confirmed active in preliminary MIC assays (minimum inhibitory concentration < 1 ug/mL).
The Problem
Antimicrobial resistance (AMR) kills approximately 1.27 million people annually and is projected to surpass cancer as a leading cause of death by 2050. The pipeline for new antibiotics has nearly dried up: fewer than 15 new antibiotic classes have been approved since 1980, and most recent approvals are modifications of existing scaffolds to which resistance has already emerged. Methicillin-resistant Staphylococcus aureus (MRSA) and carbapenem-resistant Enterobacteriaceae (CRE) represent the highest-priority threats, classified by the WHO as critical-priority pathogens requiring urgent new treatments.
Classical machine learning for antibiotic activity prediction faces a fundamental data scarcity problem. The largest antibiotic activity datasets contain on the order of 10,000 labeled compounds, small by deep learning standards. AlphaFold2-based structure prediction enables target-structure docking scores for the beta-lactam binding site on penicillin-binding proteins (PBPs), but docking scores correlate poorly with actual minimum inhibitory concentrations because they ignore entropic and solvation effects. Janssen’s quantum computing team at J&J explored two complementary approaches: quantum kernel methods for activity prediction from molecular fingerprints, and VQE for binding energy calculations on the PBP2a active site (the primary MRSA resistance mechanism).
Quantum Kernel for Molecular Fingerprint Comparison
A quantum kernel encodes molecular fingerprints (2048-bit Morgan fingerprints) into quantum states and computes the kernel matrix via the overlap between quantum feature states. The kernel matrix entry K(x_i, x_j) = |<phi(x_i)|phi(x_j)>|^2 captures molecular similarity in a high-dimensional Hilbert space that may be hard for classical kernels to approximate. With a support vector machine classifier trained on this kernel, Janssen screened a virtual library of 50,000 beta-lactam variants against the MRSA activity labels from their internal assay database (n = 3,200 labeled compounds, 8.4% active at MIC < 1 ug/mL).
The quantum feature map used an 8-qubit ZZFeatureMap-style circuit applied to a compressed 8-dimensional representation of the Morgan fingerprint (obtained by PCA reduction from 2048 bits, then rescaled to rotation angles). PennyLane’s quantum kernel estimator was used with the default.qubit simulator for kernel matrix construction, followed by scikit-learn’s SVC for classification.
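The preprocessing step described above can be sketched in isolation. This is a minimal illustration, not Janssen's pipeline: the fingerprints are random bits standing in for real Morgan fingerprints, and the arctan rescaling is one assumed way to squash unbounded PCA components into rotation angles.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 200 random 2048-bit vectors, not real molecules.
fingerprints = rng.integers(0, 2, size=(200, 2048)).astype(float)

# Step 1: compress 2048 bits to 8 real-valued components (one per qubit).
pca = PCA(n_components=8, random_state=0)
components = pca.fit_transform(fingerprints)

# Step 2: standardize each component to zero mean and unit variance.
scaler = StandardScaler()
scaled = scaler.fit_transform(components)

# Step 3: squash the real line into (0, pi) so each value is a valid angle.
angles = np.arctan(scaled) + np.pi / 2

print(angles.shape)
```

In practice the PCA and scaler would be fitted on the training compounds only and then applied unchanged to the screening library, so that no information from unlabeled candidates leaks into the feature map.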
import numpy as np_std
import pennylane as qml
from pennylane import numpy as np
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Reduce Morgan fingerprint to 8 dimensions via PCA.
# Both objects must be fitted on the training set before use, e.g.
#   X_pca = pca.fit_transform(X_fingerprints); scaler.fit(X_pca)
pca = PCA(n_components=8)
scaler = StandardScaler()

def fingerprint_to_angles(fp_vector):
    """Map an 8-dim PCA fingerprint to angles in (0, pi) for the feature map."""
    scaled = scaler.transform(fp_vector.reshape(1, -1))[0]
    return np_std.arctan(scaled) + np_std.pi / 2  # map R -> (0, pi)

n_qubits = 8
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def quantum_feature_map(x):
    """ZZFeatureMap-style circuit: Hadamard + RZ layer, ZZ entanglement, two repetitions."""
    for rep in range(2):
        # Single-qubit encoding layer
        for i in range(n_qubits):
            qml.Hadamard(wires=i)
            qml.RZ(2.0 * x[i], wires=i)
        # Nearest-neighbor ZZ interactions, each built as CNOT - RZ - CNOT
        for i in range(n_qubits - 1):
            qml.CNOT(wires=[i, i + 1])
            qml.RZ(2.0 * (np.pi - x[i]) * (np.pi - x[i + 1]), wires=i + 1)
            qml.CNOT(wires=[i, i + 1])
    return qml.state()

def quantum_kernel(x1, x2):
    """Compute kernel entry via state overlap |<phi(x1)|phi(x2)>|^2."""
    state1 = quantum_feature_map(x1)
    state2 = quantum_feature_map(x2)
    overlap = np_std.abs(np_std.dot(np_std.conj(state1), state2)) ** 2
    return float(overlap)

# Build kernel matrix for training set (example with small batch).
# Each entry re-simulates both circuits, so this scales as O(len(X1) * len(X2)).
def build_kernel_matrix(X1, X2):
    K = np_std.zeros((len(X1), len(X2)))
    for i, x1 in enumerate(X1):
        for j, x2 in enumerate(X2):
            K[i, j] = quantum_kernel(x1, x2)
    return K

# SVM with precomputed quantum kernel
svm = SVC(kernel="precomputed", C=1.0, probability=True)
print("Quantum kernel SVM configured for MRSA activity prediction")
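A precomputed-kernel SVC expects a square train-by-train matrix when fitting and a test-by-train matrix when predicting, which is easy to get backwards. The sketch below shows that workflow end to end. To stay runnable without a quantum simulator it substitutes a classical RBF kernel (the hypothetical `toy_kernel`) for the quantum kernel, and both the angles and the labels are synthetic.

```python
import numpy as np
from sklearn.svm import SVC

def toy_kernel(X1, X2, gamma=0.5):
    """Classical RBF stand-in for a quantum kernel matrix builder (assumption)."""
    sq_dist = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dist)

rng = np.random.default_rng(1)

# Synthetic 8-dimensional angle vectors and labels, for illustration only.
X_train = rng.uniform(0, np.pi, size=(60, 8))
y_train = (X_train[:, 0] > np.pi / 2).astype(int)
X_test = rng.uniform(0, np.pi, size=(10, 8))

K_train = toy_kernel(X_train, X_train)  # shape (n_train, n_train)
K_test = toy_kernel(X_test, X_train)    # shape (n_test, n_train): rows are test points

svm = SVC(kernel="precomputed", C=1.0, probability=True, random_state=0)
svm.fit(K_train, y_train)

# Probability of the "active" class for each candidate, then rank descending.
p_active = svm.predict_proba(K_test)[:, 1]
ranking = np.argsort(-p_active)
print(ranking[:4])
```

With a labeled set in which only a small fraction of compounds is active, passing `class_weight="balanced"` to `SVC` is one option worth evaluating; whether the original study did so is not stated.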
UCCSD for Beta-Lactam Binding Energy
For the 4 top-ranked candidates from the quantum kernel SVM, Janssen ran VQE calculations on the beta-lactam binding energy to PBP2a, the modified penicillin-binding protein that confers MRSA resistance by weakly binding most existing beta-lactams. The active space model focused on the acylation reaction: the serine nucleophile at the PBP2a active site attacking the beta-lactam carbonyl. The relevant active space is the pi system of the beta-lactam ring (4 electrons, 4 orbitals) plus the serine O-H bond (2 electrons, 2 orbitals), giving a 6-electron, 6-orbital active space on 12 qubits.
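The qubit count follows from simple bookkeeping: under the Jordan-Wigner mapping each spatial orbital contributes two spin orbitals, and each spin orbital occupies one qubit. A short sketch of that arithmetic, which also counts the Slater determinants the active space contains for an equal number of spin-up and spin-down electrons:

```python
from math import comb

# Active space from the text: beta-lactam pi system plus serine O-H bond.
n_electrons = 4 + 2
n_spatial_orbitals = 4 + 2

# Jordan-Wigner mapping: one qubit per spin orbital, two spin orbitals per spatial orbital.
n_qubits = 2 * n_spatial_orbitals

# Closed-shell reference: electrons split evenly between alpha and beta spin.
n_alpha = n_beta = n_electrons // 2

# Number of determinants with fixed alpha and beta counts.
n_determinants = comb(n_spatial_orbitals, n_alpha) * comb(n_spatial_orbitals, n_beta)

print(n_qubits, n_determinants, 2 ** n_qubits)  # 12 400 4096
```

Only 400 of the 4096 computational basis states have the right particle number and spin projection, which is why a particle-conserving ansatz such as UCCSD started from the Hartree-Fock state is a natural fit.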
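from qiskit_nature.second_q.drivers import PySCFDriver
from qiskit_nature.second_q.mappers import JordanWignerMapper
from qiskit_nature.second_q.transformers import ActiveSpaceTransformer
from qiskit_nature.second_q.circuit.library import UCCSD, HartreeFock
from qiskit_algorithms.minimum_eigensolvers import VQE
from qiskit_algorithms.optimizers import COBYLA
from qiskit_aer.primitives import Estimator

# Beta-lactam ring + serine nucleophile model
# Simplified 4-atom representation of the acylation active site (angstrom)
beta_lactam_active_site = "; ".join([
    "C 0.000 0.000 0.000",
    "N 1.330 0.000 0.000",
    "O -0.500 1.200 0.000",
    "O 2.100 1.100 0.000",
])

driver = PySCFDriver(
    atom=beta_lactam_active_site,
    basis="6-31G*",
    # C, N, O, O carry 29 electrons, so a neutral singlet cannot exist.
    # charge=-1 is an assumption made here to get a closed-shell toy model.
    charge=-1,
    spin=0,
)
full_problem = driver.run()

# Reduce to the 6-electron, 6-orbital active space (-> 12 qubits).
# Assigning num_particles / num_spatial_orbitals on the problem would not
# shrink the Hamiltonian; the transformer does.
transformer = ActiveSpaceTransformer(num_electrons=6, num_spatial_orbitals=6)
problem = transformer.transform(full_problem)

mapper = JordanWignerMapper()
qubit_op = mapper.map(problem.hamiltonian.second_q_op())
print(f"Beta-lactam active site qubit count: {qubit_op.num_qubits}")  # 12

hf_state = HartreeFock(
    num_spatial_orbitals=6,
    num_particles=(3, 3),
    qubit_mapper=mapper,
)
ansatz = UCCSD(
    num_spatial_orbitals=6,
    num_particles=(3, 3),
    qubit_mapper=mapper,
    initial_state=hf_state,
)

estimator = Estimator(run_options={"shots": 8192})
optimizer = COBYLA(maxiter=1000)
vqe = VQE(estimator, ansatz, optimizer)
# Start from the Hartree-Fock state (all amplitudes zero) instead of a random point.
vqe.initial_point = [0.0] * ansatz.num_parameters
result = vqe.compute_minimum_eigenvalue(qubit_op)

# The eigenvalue is the active-space electronic energy of this single geometry.
# A binding energy or acylation barrier is a difference between such energies
# from separate calculations, not this number on its own.
active_space_energy_hartree = result.eigenvalue.real
active_space_energy_kcal = active_space_energy_hartree * 627.5
print(f"Active-space electronic energy: {active_space_energy_kcal:.1f} kcal/mol")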
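A single VQE run yields a total electronic energy for one geometry, whereas a binding energy or reaction barrier is a difference between such energies. The sketch below shows that bookkeeping and the hartree to kcal/mol conversion. The three energies are arbitrary placeholders chosen for illustration, not results from the study, and `binding_energy_kcal` is a hypothetical helper name.

```python
HARTREE_TO_KCAL_PER_MOL = 627.509

def binding_energy_kcal(e_complex, e_site, e_ligand):
    """Binding energy = E(complex) - [E(site) + E(ligand)], converted to kcal/mol.

    All inputs are total energies in hartree from separate calculations
    performed at the same level of theory.
    """
    return (e_complex - (e_site + e_ligand)) * HARTREE_TO_KCAL_PER_MOL

# Placeholder energies in hartree (illustrative values only).
e_complex = -10.020
e_site = -6.000
e_ligand = -4.000

delta_e = binding_energy_kcal(e_complex, e_site, e_ligand)
print(f"Binding energy: {delta_e:.2f} kcal/mol")  # negative means binding is favorable
```

Because the result is a small difference between large numbers, the shot noise and optimizer error of each VQE run need to be well below the difference being resolved; 1 kcal/mol corresponds to roughly 1.6 millihartree.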
MIC Assay Validation
The quantum kernel SVM assigned the highest activity probabilities to 4 beta-lactam variants with modified C6 sidechain substituents, a structural region where conventional SAR analysis had not identified good substitution patterns because training data in that chemical series is sparse. The AlphaFold2 docking pipeline ranked all 4 compounds as moderate binders (docking scores between -7.1 and -7.8 kcal/mol, weaker than the threshold Janssen typically uses to prioritize synthesis). The quantum kernel identified them as high-probability actives because their similarity pattern to known actives in the compressed fingerprint space was unusual: they resembled a rare active compound from a structurally distinct chemical series.
Janssen synthesized all 4 compounds and tested them against an MRSA panel (ATCC 43300 and two clinical strains). Two of the four showed MIC values below 1 ug/mL against all three strains, the threshold for clinical interest. The other two were inactive (MIC > 64 ug/mL). The 2-of-4 (50%) hit rate from the quantum-kernel-prioritized set compares favorably to the historical 8.4% hit rate from random screening and the approximately 15% hit rate from AlphaFold2-guided docking in the same chemical series. Further profiling for cytotoxicity, pharmacokinetics, and in vivo efficacy is underway.
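To put the hit-rate comparison in context, one can ask how often 2 or more hits out of 4 would occur by chance at each baseline rate. The sketch below computes that binomial tail probability with the standard library. It assumes independent compounds with a fixed per-compound hit probability, which is a simplification for closely related analogs.

```python
from math import comb

def binom_tail(n, k, p):
    """Probability of at least k successes in n independent trials with success rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

n_tested, n_hits = 4, 2

# Baseline rates quoted in the text.
p_random = binom_tail(n_tested, n_hits, 0.084)   # random screening
p_docking = binom_tail(n_tested, n_hits, 0.15)   # docking-guided selection

print(f"P(>=2 hits of 4 | 8.4% baseline) = {p_random:.3f}")
print(f"P(>=2 hits of 4 | 15% baseline)  = {p_docking:.3f}")
```

Under these assumptions the chance of seeing this result is roughly 4% at the random-screening rate and roughly 11% at the docking rate, so the outcome is encouraging but a larger synthesized set would be needed to separate the methods firmly.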