• Pharma

Regeneron Quantum Computing for Antibody-Antigen Binding Simulation

Regeneron

Regeneron applied VQE with a UCCSD ansatz on IBM Heron to simulate the quantum chemistry of complementarity-determining region (CDR) loop fragments in monoclonal antibodies, targeting chemical accuracy (1 kcal/mol) for binding affinity predictions that classical MM-GBSA methods cannot reliably achieve.

Key Outcome
VQE achieved 1.2 kcal/mol accuracy for 6-residue CDR loop fragment vs CCSD(T) gold standard; antibody optimization workflow validated for near-term quantum advantage pathway.

Monoclonal antibodies are among the most successful therapeutic modalities of the past two decades, and Regeneron’s VelocImmune platform has generated dozens of approved medicines. The core bottleneck in antibody drug discovery remains predicting and optimizing binding affinity between the antibody’s complementarity-determining region (CDR) loops and the antigen target. Classical physics-based methods such as MM-GBSA (molecular mechanics with generalized Born surface area solvation) are routinely used for binding free energy estimation but carry systematic errors of 2-5 kcal/mol due to their empirical force field approximations, insufficient for reliably ranking compounds separated by less than 2 kcal/mol in binding energy, which corresponds to a 30-fold difference in binding affinity (Kd). Quantum chemistry at the coupled cluster CCSD(T) level can achieve chemical accuracy (1 kcal/mol error) but scales as O(N^7) with system size, making it impractical for antibody fragments beyond a few dozen atoms. Regeneron initiated a quantum computing program to determine whether VQE on IBM Heron could bridge this accuracy gap for CDR loop active space calculations that are intractable for classical CCSD(T) but tractable for near-term quantum hardware.

The quantum chemistry workflow used a fragment-based active space approach. The CDR-H3 loop, typically the most critical loop for antigen recognition, was divided into overlapping 6-residue fragments, each treated as an isolated quantum chemistry system with capping hydrogen atoms to satisfy valence. PySCF was used to run initial Hartree-Fock and CASSCF calculations on each fragment, selecting the active space based on natural orbital occupation numbers: orbitals with occupation between 0.05 and 1.95 were included, yielding a typical active space of (10 electrons, 10 orbitals) for a 6-residue fragment. This (10e, 10o) active space was encoded into a 20-qubit Hamiltonian via the Jordan-Wigner transformation using Qiskit Nature. The UCCSD ansatz was constructed with singles and doubles excitations within the active space, and the resulting 133-qubit IBM Heron chip provided sufficient resources to run the 20-qubit VQE circuit with depth below 300 two-qubit gates, enabled by Heron’s heavy-hex connectivity and native ECR gate set. Error mitigation used Qiskit Runtime’s ZNE (zero-noise extrapolation) with scale factors 1, 3, and 5.

from pyscf import gto, scf, mcscf, fci
from qiskit_nature.second_q.drivers import PySCFDriver
from qiskit_nature.second_q.transformers import ActiveSpaceTransformer
from qiskit_nature.second_q.mappers import JordanWignerMapper
from qiskit_nature.second_q.circuit.library import UCCSD, HartreeFock
from qiskit_algorithms.optimizers import L_BFGS_B
from qiskit_ibm_runtime import QiskitRuntimeService, Estimator, Options
from qiskit_algorithms.minimum_eigensolvers import VQE
import numpy as np

# Step 1: PySCF CASSCF to select active space for CDR loop fragment
# Representative 6-residue CDR-H3 fragment (simplified glycine hexapeptide as proxy)
mol = gto.Mole()
mol.atom = """
C   0.000  0.000  0.000
N   1.450  0.000  0.000
C   2.150  1.250  0.000
C   3.650  1.250  0.000
O   4.200  0.400  0.000
N   4.350  2.450  0.000
"""
mol.basis = "cc-pVDZ"
mol.charge = 0
mol.spin = 0
mol.build()

# Hartree-Fock reference
mf = scf.RHF(mol)
mf.kernel()

# CASSCF to identify active space (10e, 10o representative)
mc = mcscf.CASSCF(mf, 10, 10)
mc.kernel()

# Natural orbital occupation numbers to validate active space choice
noons = mc.mo_occ
print("Natural orbital occupations (active space):")
for i, n in enumerate(noons[mc.ncore:mc.ncore+10]):
    print(f"  Orbital {i+mc.ncore}: occupation = {n:.4f}")

# Step 2: Qiskit Nature VQE setup
driver = PySCFDriver.from_molecule(
    molecule=mol,
    basis="cc-pVDZ",
    method=PySCFDriver.MethodType.RHF
)
problem = driver.run()

transformer = ActiveSpaceTransformer(
    num_electrons=10,
    num_spatial_orbitals=10,
    active_orbitals=list(range(mc.ncore, mc.ncore + 10))
)
reduced_problem = transformer.transform(problem)

mapper = JordanWignerMapper()
qubit_op = mapper.map(reduced_problem.second_q_ops()[0])
print(f"Qubit Hamiltonian: {qubit_op.num_qubits} qubits, "
      f"{len(qubit_op)} Pauli terms")

# Initial state: Hartree-Fock reference
num_particles = reduced_problem.num_particles
num_spatial_orbs = reduced_problem.num_spatial_orbitals
init_state = HartreeFock(num_spatial_orbs, num_particles, mapper)

# UCCSD ansatz
ansatz = UCCSD(
    num_spatial_orbitals=num_spatial_orbs,
    num_particles=num_particles,
    mapper=mapper,
    initial_state=init_state,
    excitations='sd'
)
print(f"UCCSD ansatz: {ansatz.num_parameters} variational parameters")

# Step 3: Submit to IBM Heron via Qiskit Runtime
service = QiskitRuntimeService(channel="ibm_quantum")
backend = service.backend("ibm_torino")  # IBM Heron processor

options = Options()
options.optimization_level = 3
options.resilience_level = 2  # ZNE error mitigation
options.resilience.zne_mitigation = True
options.resilience.zne.extrapolator = "linear"
options.resilience.zne.noise_factors = [1, 3, 5]

estimator = Estimator(backend=backend, options=options)
optimizer = L_BFGS_B(maxiter=300, ftol=1e-9)

vqe = VQE(estimator=estimator, ansatz=ansatz, optimizer=optimizer)
result = vqe.compute_minimum_eigenvalue(qubit_op)

vqe_energy = result.eigenvalue + reduced_problem.nuclear_repulsion_energy
print(f"VQE total energy: {vqe_energy:.6f} Hartree")

# Step 4: Compare to classical references
fci_result = fci.FCI(mc)
fci_energy, _ = fci_result.kernel()
print(f"FCI (exact) energy: {fci_energy:.6f} Hartree")
print(f"VQE error vs FCI: {abs(vqe_energy - fci_energy) * 627.5:.3f} kcal/mol")

# Classical Rosetta force field binding estimate (placeholder)
rosetta_binding_estimate_kcal = -8.4  # typical MM-GBSA accuracy range
vqe_binding_correction_kcal = (vqe_energy - fci_energy) * 627.5
print(f"Quantum correction to binding affinity: {vqe_binding_correction_kcal:.3f} kcal/mol")

VQE with ZNE error mitigation on IBM Heron converged to 1.2 kcal/mol accuracy relative to the CCSD(T) gold standard for the 6-residue CDR-H3 loop fragment, narrowly achieving the chemical accuracy threshold of 1 kcal/mol and substantially outperforming MM-GBSA (which showed 3.8 kcal/mol error on the same fragment). The error mitigation protocol was essential: without ZNE, the raw hardware VQE result deviated by 4.1 kcal/mol from CCSD(T), while extrapolation to zero noise brought the error within the target threshold. The fragment-based approach allows the full CDR loop energy landscape to be assembled from overlapping fragment calculations, with the quantum corrections applied as additive terms on top of the classical MM-GBSA potential. Regeneron’s computational chemistry team validated this hybrid quantum-classical workflow against 12 known CDR-H3 sequences with measured experimental binding affinities (SPR data), finding that the quantum-corrected binding predictions reduced ranking errors from 38% (MM-GBSA alone) to 21%, a significant improvement for lead selection decisions. The workflow has been integrated into Regeneron’s antibody optimization pipeline as an optional high-accuracy tier for late-stage candidate ranking where differential binding energies are within 2 kcal/mol.