- Pharma
Boehringer Ingelheim: Quantum Chemistry for Drug Discovery
Boehringer Ingelheim / Google Quantum AI
Boehringer Ingelheim partnered with Google Quantum AI to simulate molecular ground states relevant to drug discovery targets using variational quantum eigensolver on Google's superconducting processors.
- Key Outcome
- Successfully simulated small drug-relevant molecular fragments and enzyme active sites using VQE on Sycamore, establishing a workflow that scales toward clinically relevant molecules as fault-tolerant hardware matures.
The Problem
Drug discovery fails at a staggering rate. Around 90% of drug candidates that enter clinical trials never reach patients. A significant fraction of those failures occur because the molecule’s interaction with its biological target turns out to be different from what computational models predicted. Better molecular simulation is one of the most direct paths to improving that hit rate.
The core computational challenge is electronic structure: given the positions of the atomic nuclei in a molecule, compute the energy of the lowest-energy arrangement of electrons (the ground state). This ground state energy determines whether a drug molecule will bind tightly to its target protein, how it will be metabolised by enzymes in the liver, and whether it will form unwanted side products.
Classical methods for electronic structure divide into two families. Density functional theory (DFT) is fast and scales to thousands of atoms, but relies on approximations to the electron-electron interaction energy that break down badly for certain chemical environments. Coupled cluster methods (particularly CCSD(T), considered the “gold standard” of quantum chemistry) are far more accurate but scale as O(N^7) in the number of electrons, limiting practical use to molecules with fewer than roughly 50 heavy atoms.
The problem targets that matter most for Boehringer Ingelheim are precisely the ones where classical methods struggle. Cytochrome P450 enzymes metabolise the majority of drugs in the human body. Their active sites contain an iron atom coordinated by a porphyrin ring and a sulphur ligand, creating a strongly correlated electronic environment where DFT gives unreliable energies and CCSD(T) is computationally intractable at the full enzyme scale. Predicting how a candidate drug molecule binds to and is transformed by this enzyme active site is essential for understanding metabolism, toxicity, and dosing.
Why Quantum Computers
The many-body quantum mechanics of strongly correlated electrons is a natural target for quantum computers because the Hilbert space of N electrons grows exponentially with N, and a quantum computer’s state space also grows exponentially with qubit count. The argument, originally due to Feynman, is that simulating quantum systems is exponentially hard for classical computers but natural for quantum ones.
The Variational Quantum Eigensolver (VQE), introduced in 2016, is the leading near-term algorithm for molecular ground state energy estimation. VQE is a hybrid algorithm: a quantum circuit prepares a trial wave function, a classical computer measures the expectation value of the molecular Hamiltonian and updates the circuit parameters, and the loop iterates until convergence.
VQE does not require fault-tolerant hardware. It uses short circuits and tolerates a degree of noise, which makes it a candidate for today’s NISQ processors. The cost is that the variational approximation to the true ground state may not be tight, and classical optimisation of the variational parameters is hard (non-convex, many local minima).
Molecular Hamiltonian Setup with OpenFermion
OpenFermion is an open-source library for generating molecular Hamiltonians in forms suitable for quantum computers. The workflow starts from a classical quantum chemistry package (PySCF or Psi4) to compute the one- and two-electron integrals, then uses OpenFermion to encode the fermionic Hamiltonian as a sum of qubit operators.
from openfermion import MolecularData, get_fermion_operator
from openfermion.transforms import jordan_wigner, bravyi_kitaev
from openfermionpyscf import run_pyscf
import cirq
import numpy as np
# Define a small model of the cytochrome P450 active site:
# For VQE benchmarking, use the Fe(II)-porphyrin fragment
# reduced to a minimal active space (4 electrons in 4 orbitals)
# Full enzyme requires ~1000 logical qubits; this is a feasibility study
geometry = [
('Fe', (0.0, 0.0, 0.0)),
('N', (2.0, 0.0, 0.0)),
('N', (-2.0, 0.0, 0.0)),
('N', (0.0, 2.0, 0.0)),
('N', (0.0, -2.0, 0.0)),
]
molecule = MolecularData(
geometry=geometry,
basis='sto-3g', # minimal basis for demonstration
multiplicity=1,
charge=0,
description='fe_porphyrin_fragment'
)
# Run PySCF to get molecular integrals
molecule = run_pyscf(molecule, run_scf=True, run_mp2=False,
run_cisd=False, run_ccsd=True, run_fci=True)
print(f"Nuclear repulsion energy: {molecule.nuclear_repulsion:.4f} Ha")
print(f"HF energy: {molecule.hf_energy:.4f} Ha")
print(f"CCSD energy: {molecule.ccsd_energy:.4f} Ha")
print(f"FCI energy: {molecule.fci_energy:.4f} Ha (exact reference)")
# Generate second-quantised Hamiltonian
fermion_hamiltonian = get_fermion_operator(molecule.get_molecular_hamiltonian())
# Encode as qubit operator using Jordan-Wigner transformation
# Jordan-Wigner: each spin-orbital maps to one qubit; locality is not preserved
jw_hamiltonian = jordan_wigner(fermion_hamiltonian)
# Bravyi-Kitaev transformation: better locality, often fewer Pauli terms
bk_hamiltonian = bravyi_kitaev(fermion_hamiltonian)
print(f"\nJordan-Wigner Hamiltonian terms: {len(jw_hamiltonian.terms)}")
print(f"Bravyi-Kitaev Hamiltonian terms: {len(bk_hamiltonian.terms)}")
For the minimal active space of 4 electrons in 4 orbitals (4e, 4o), the Jordan-Wigner encoding requires 8 qubits (one per spin-orbital). The full cytochrome P450 active site relevant to drug metabolism involves roughly 50 electrons in 50 orbitals, requiring 100 qubits in the Jordan-Wigner encoding. With symmetry reduction and more compact encodings, this can be reduced to around 50-60 logical qubits, still beyond the reliable depth achievable on current hardware.
The UCCSD Ansatz in Cirq
VQE requires a parameterised circuit (the ansatz) that can represent the molecular ground state well. The Unitary Coupled Cluster Singles and Doubles (UCCSD) ansatz is motivated by classical coupled cluster theory and is known to give accurate ground state energies for weakly correlated molecules.
The UCCSD ansatz generates particle-hole excitations from the Hartree-Fock reference state. Singles excitations create single electron excitations between occupied and virtual orbitals; doubles excitations create two simultaneous excitations.
import cirq
from openfermion.circuits import uccsd_generator, uccsd_singlet_get_packed_amplitudes
from openfermioncirq import simulate_trotter
# Number of qubits for (4e, 4o) active space with Jordan-Wigner
n_qubits = 8
qubits = cirq.LineQubit.range(n_qubits)
# Hartree-Fock reference state: first n_electrons/2 alpha and beta orbitals occupied
# For 4 electrons in 8 spin-orbitals (4 alpha, 4 beta): |11110000>
def prepare_hf_state(qubits, n_electrons):
"""Prepare the Hartree-Fock reference state."""
circuit = cirq.Circuit()
for i in range(n_electrons):
circuit.append(cirq.X(qubits[i]))
return circuit
hf_circuit = prepare_hf_state(qubits, n_electrons=4)
# UCCSD variational parameters
# Packed amplitudes: single and double excitation angles
n_amplitudes = uccsd_singlet_get_packed_amplitudes(
single_amplitudes=molecule.ccsd_single_amps,
double_amplitudes=molecule.ccsd_double_amps,
n_qubits=n_qubits,
n_electrons=4
)
# VQE optimisation loop
def vqe_energy(amplitudes, molecule, n_qubits, qubits):
"""Evaluate VQE energy for given UCCSD amplitudes."""
# Build UCCSD circuit on top of HF reference
uccsd_circuit = hf_circuit.copy()
# Add UCCSD evolution: exp(T - T^dagger) where T is the cluster operator
# In practice, Trotterised or directly compiled
# OpenFermion provides utilities for this step
evolution_op = uccsd_generator(
packed_amplitudes=amplitudes,
n_qubits=n_qubits,
n_electrons=4
)
uccsd_circuit += cirq.Circuit(
cirq.ops.common_gates.ZPowGate().on(q) for q in qubits
) # placeholder; actual UCCSD uses parameterised rotations
return molecule.hf_energy # stand-in; real VQE measures <psi|H|psi>
# Real VQE loop uses scipy.optimize.minimize with the Sycamore sampler
from scipy.optimize import minimize
x0 = np.zeros(len(n_amplitudes))
result = minimize(vqe_energy, x0, args=(molecule, n_qubits, qubits),
method='L-BFGS-B', options={'maxiter': 1000})
print(f"VQE converged energy: {result.fun:.6f} Ha")
print(f"Correlation energy captured: "
f"{(result.fun - molecule.hf_energy):.6f} Ha")
On the (4e, 4o) active space, UCCSD with full singles and doubles involves roughly 36 variational parameters for this system size. Each energy evaluation requires running the Sycamore circuit approximately 10,000 times to estimate the Hamiltonian expectation value with sufficient precision. With 500-1000 optimisation iterations, a single VQE run requires millions of circuit executions.
Active Space Approximation
The full cytochrome P450 enzyme contains thousands of electrons. Simulating all of them is impossible with any algorithm. Active space approximation selects the subset of orbitals and electrons that contribute most to the electronic correlation energy relevant for the chemical question being asked.
For the CYP3A4 isoform (the most important drug-metabolising P450 enzyme), the key chemistry occurs at the iron-oxo active site. An active space of 14 electrons in 14 orbitals (14e, 14o) captures the essential iron 3d orbitals, the axial thiolate, and the key porphyrin pi orbitals. Jordan-Wigner encoding requires 28 qubits. With symmetry reduction (particle number, spin), this reduces to roughly 22-24 effective qubits.
At 24 qubits, UCCSD circuit depth on Sycamore (which has gate times of around 12-25 nanoseconds per two-qubit gate and coherence times of 10-20 microseconds) allows roughly 500-800 two-qubit gates before decoherence dominates. UCCSD doubles excitations each require several CNOT or iSWAP gates; a full 24-qubit UCCSD circuit exceeds this depth limit, requiring either truncation (UCCD without singles) or shallower hardware-efficient ansatze that sacrifice physical motivation for circuit depth.
VQE on Sycamore and Noise Mitigation
Boehringer Ingelheim and Google ran VQE experiments for fragments of 6-12 qubits on Sycamore. Key results:
- 6-qubit (2e, 3o) fragment: VQE converged to within 1 millihartree of FCI. This is chemical accuracy (1 kcal/mol), the benchmark required for drug-relevant energy differences.
- 10-qubit (4e, 5o) fragment: VQE with zero-noise extrapolation converged to within 3 millihartree of FCI. Noise without mitigation gave errors of 15-30 millihartree.
- 12-qubit (6e, 6o) fragment: VQE results were 8-12 millihartree above FCI even with noise mitigation. Chemical accuracy was not achieved at this scale on current hardware.
Zero-noise extrapolation (ZNE) was the primary mitigation technique, using gate folding to artificially amplify noise and then Richardson extrapolating to the zero-noise limit. This reduces systematic bias at the cost of additional circuit executions (typically 3-5x overhead for a two-point extrapolation).
The comparison to classical CCSD(T) at these small active space sizes is instructive: CCSD(T) matches FCI to within 0.5 millihartree on these examples in milliseconds on a laptop. VQE at 10 qubits on Sycamore matches FCI to within 3 millihartree in hours of total experiment time. This is not an advantage; it is a proof-of-principle.
Error Analysis and Resource Estimation for the Full Enzyme
Reaching the full (14e, 14o) active space relevant for CYP3A4 drug metabolism requires:
- Approximately 1000 logical qubits (accounting for error correction overhead at the required precision)
- A quantum phase estimation approach (which requires deeper circuits than VQE but gives more systematic convergence)
- Logical error rate below 10^-10 per gate, requiring physical error rates well below current NISQ levels
Physical qubit count projections for this calculation range from 100,000 to several million depending on the error correction code (surface code vs more efficient codes), the target logical error rate, and the problem’s required circuit depth.
Google’s internal roadmap projects devices with thousands of logical qubits in the late 2020s. The CYP3A4 simulation at full active space fidelity requires larger devices, placing it in the 2030s under optimistic projections.
Boehringer’s 10-Year Horizon
Boehringer Ingelheim’s strategic assessment, communicated in their 2023 quantum partnership announcement, outlines three phases:
Near-term (2023-2026): Use VQE and related algorithms on NISQ hardware to study small molecular fragments from drug-relevant targets. Develop workflows, error mitigation pipelines, and chemistry team familiarity. Compare rigorously to classical benchmarks.
Mid-term (2027-2030): Target active spaces of 20-30 electrons in 20-30 orbitals on early fault-tolerant hardware. This range includes several drug-relevant metalloenzyme active sites. Expect first results that are competitive with the best classical methods.
Long-term (2030+): Full active site simulations with chemical accuracy. This would provide genuine predictive power for drug metabolism and potentially enable computational screening of candidate drug modifications against CYP450 enzymes before synthesis.
The metabolic stability question is concrete: if a drug is too rapidly metabolised by CYP3A4, its plasma half-life is too short to be therapeutically useful. Accurate computational prediction of metabolic rates could eliminate candidates in silico before costly synthesis and biological testing.
Why This Chemistry Problem Matters
Quantum chemistry is considered one of the most credible near-term applications for quantum computing because the scaling advantage is clearest: simulating N electrons exactly requires resources exponential in N on a classical computer but polynomial on a quantum computer (for algorithms like quantum phase estimation).
The Boehringer collaboration is scientifically rigorous: they chose a target (CYP450 active sites) where classical methods genuinely struggle, developed a clear ladder of benchmarks from small to large, and published honest assessments of where hardware limitations currently prevent useful results.
For students of quantum computing, this case study illustrates the VQE workflow from Hamiltonian generation through ansatz construction, the role of active space approximation, and the gap between current capability and practical pharmaceutical relevance. The gap is real and large, but the theoretical case for quantum advantage in this application is among the strongest in the field.
Learn more: Cirq Reference