Active Space Methods for Quantum Chemistry in PennyLane

Overview

Solving the electronic structure problem exactly requires Full Configuration Interaction (FCI), which considers every possible way to distribute electrons among available orbitals. For water with 10 electrons in 7 spatial orbitals (14 spin-orbitals in STO-3G), FCI constructs the full Hilbert space of dimension C(14, 10), roughly 1,001 determinants. That is tractable for water, but the scaling is combinatorial: for benzene with 42 electrons and 30+ spatial orbitals, the FCI space dimension exceeds 10^18. No classical computer can diagonalize a matrix that large.

Active space methods solve this problem by partitioning the molecular orbitals into three groups:

Core (inactive) orbitals: Always doubly occupied. These orbitals sit far below the Fermi level and contribute negligibly to electron correlation. They remain frozen at the Hartree-Fock level.
Active orbitals: The chemically important subset where electron correlation matters most. Electrons in these orbitals are treated with an accurate many-body method (FCI within the active space, or in our case, VQE on a quantum computer).
Virtual (external) orbitals: Always empty. These sit far above the Fermi level and are excluded from the correlated calculation.

This partitioning is the foundation of Complete Active Space Self-Consistent Field (CASSCF) and Complete Active Space Configuration Interaction (CASCI) methods. The key advantage for quantum computing is clear: instead of mapping all orbitals to qubits, we only need qubits for the active space. A (4-electron, 4-orbital) active space requires 8 qubits instead of 14, and the savings grow dramatically for larger molecules.

PennyLane’s quantum chemistry module (qchem) provides tools to automate active space selection, construct the reduced Hamiltonian, and run VQE entirely within the PennyLane framework.

Installation

pip install pennylane pennylane-qchem pyscf

import pennylane as qml
from pennylane import qchem
import numpy as np

PySCF handles the classical Hartree-Fock calculation and integral computation behind the scenes. PennyLane wraps these into a clean interface for building the qubit Hamiltonian.

Defining a Molecule

Start with a simple molecule. Water (H2O) is a good test case because a full orbital simulation requires many qubits, but a modest active space captures the important correlation.

# Water molecule geometry (angstroms)
symbols = ["O", "H", "H"]
coordinates = np.array([
    [0.0000,  0.0000,  0.1173],
    [0.0000,  0.7572, -0.4692],
    [0.0000, -0.7572, -0.4692],
])

# Run Hartree-Fock with a minimal basis set
mol = qchem.Molecule(symbols, coordinates, basis_name="sto-3g")

This geometry places the oxygen at the origin with the two hydrogens at the experimental bond angle of approximately 104.5 degrees. The STO-3G basis set assigns orbitals as follows: oxygen contributes 1s, 2s, 2px, 2py, 2pz (5 spatial orbitals), and each hydrogen contributes 1s (1 spatial orbital each), for a total of 7 spatial orbitals and 14 spin-orbitals.

HOMO-LUMO Gap and Active Space Selection

The chemically important orbitals cluster near the Fermi level, the boundary between occupied and unoccupied orbitals in the Hartree-Fock ground state. Two orbitals define this boundary:

HOMO (Highest Occupied Molecular Orbital): The frontier occupied orbital. Electrons here are the least tightly bound and the most important for chemical reactivity, bond breaking, and electron donation.
LUMO (Lowest Unoccupied Molecular Orbital): The frontier virtual orbital. This is where electrons go during excitations, and it governs electron acceptance and low-energy excited states.

The HOMO-LUMO gap (their energy difference) determines many molecular properties: optical absorption, chemical hardness, and reactivity. A small gap means the molecule is easily excited or polarized; a large gap means it is chemically stable.

For constructing a good active space, select k orbitals centered on the HOMO-LUMO boundary. For H2O with 10 electrons in 7 spatial orbitals (STO-3G), the Hartree-Fock orbital ordering is:

Orbital index	Character	Classification
1	O 1s (core)	Core, frozen
2	O 2s (inner valence)	Core, frozen
3	O-H bonding	Core, frozen
4	O-H bonding (HOMO-1)	Active
5	O lone pair (HOMO)	Active
6	O-H antibonding (LUMO)	Active
7	O-H antibonding (LUMO+1)	Active

Orbitals 1 through 3 are frozen as core. Their electrons do not participate meaningfully in bond-breaking or correlation. Orbitals 4 through 7 form the (4-electron, 4-orbital) active space: 4 electrons from the two highest occupied orbitals are distributed among 4 spatial orbitals (the two HOMOs and two LUMOs).

PennyLane’s qchem.active_space function automates this selection, choosing orbitals symmetrically around the Fermi level.

Selecting the Active Space

PennyLane’s qchem.active_space function takes the full molecular orbital information and returns the indices of active electrons and orbitals.

# H2O with STO-3G has 10 electrons and 7 spatial orbitals
# Select an active space of 4 electrons in 4 orbitals
electrons = 4
orbitals = 4

active_electrons, active_orbitals = qchem.active_space(
    electrons=10,
    orbitals=7,
    mult=1,          # singlet spin multiplicity
    active_electrons=electrons,
    active_orbitals=orbitals,
)

print(f"Active electrons: {active_electrons}")
print(f"Active orbitals:  {active_orbitals}")

Here is what each parameter controls:

electrons=10: The total number of electrons in the molecule (8 from oxygen, 1 from each hydrogen).
orbitals=7: The total number of spatial orbitals in the basis set (5 from oxygen, 1 from each hydrogen in STO-3G).
mult=1: The spin multiplicity, 2S+1. For a singlet state (all electrons paired), mult=1. This tells the function how to distribute alpha and beta electrons.
active_electrons=4: How many electrons to place in the active space. These are the 4 electrons occupying the two highest occupied orbitals.
active_orbitals=4: How many spatial orbitals to include in the active space.

The function returns lists of orbital indices, confirming which orbitals are active and which are frozen.

Computing the qubit count. Each spatial orbital corresponds to two spin-orbitals (alpha and beta spin). Under the Jordan-Wigner mapping, each spin-orbital maps to one qubit. Therefore:

qubits = 2 * active_orbitals

For our (4e, 4o) active space: 2 * 4 = 8 qubits. Compare this to the full space: 2 * 7 = 14 qubits. The active space saves 6 qubits, which corresponds to a factor-of-64 reduction in Hilbert space dimension.

Building the Hamiltonian

Use the active space selection to construct a reduced Hamiltonian. This is the operator VQE will minimize.

H, qubits = qchem.molecular_hamiltonian(
    symbols,
    coordinates,
    basis="sto-3g",
    active_electrons=electrons,    # integer count of active electrons
    active_orbitals=orbitals,      # integer count of active orbitals
)

print(f"Qubits required: {qubits}")
print(f"Hamiltonian terms: {len(H.operands)}")

Reducing from the full orbital space to the active space cuts the qubit count significantly, in this case from 14 qubits to 8.

Why Jordan-Wigner Encoding?

The molecular Hamiltonian is expressed in second quantization using fermionic creation and annihilation operators. These operators obey anti-commutation relations (the Pauli exclusion principle). Qubits, however, are inherently bosonic: qubit operators commute across different sites. To simulate fermions on a qubit register, we need a mapping that encodes the fermionic anti-symmetry.

The Jordan-Wigner (JW) transformation accomplishes this by mapping each spin-orbital to one qubit:

Occupied spin-orbital maps to |1>
Empty spin-orbital maps to |0>
Creation/annihilation operators become products of Pauli X, Y, and Z operators, where Z strings enforce the anti-commutation relations

The resulting qubit Hamiltonian is a weighted sum of Pauli strings (tensor products of I, X, Y, Z operators). For a (4e, 4o) active space with 8 qubits, you typically see dozens to hundreds of Pauli terms. The number of terms grows polynomially with the number of orbitals (roughly as O(N^4) for a two-body Hamiltonian), which is why VQE remains feasible even as the active space grows.

Other fermion-to-qubit mappings exist (Bravyi-Kitaev, parity mapping), each with different tradeoffs in Pauli weight and circuit depth. PennyLane uses Jordan-Wigner by default, which is the most straightforward to interpret: qubit i directly represents spin-orbital i.

Building the Ansatz

Use the AllSinglesDoubles ansatz, which generates single and double excitations from the Hartree-Fock reference state.

What “Singles and Doubles” Means Physically

The Hartree-Fock (HF) state is the simplest approximation to the electronic ground state. It describes all electrons as independent particles occupying molecular orbitals, represented as a single Slater determinant. HF captures roughly 99% of the total electronic energy, but the remaining 1% (the correlation energy) determines chemical accuracy for bond energies, reaction barriers, and molecular properties.

To recover correlation energy, we include excited determinants:

Single excitations promote one electron from an occupied orbital i to a virtual orbital a. These correct the orbital shapes and are important for properties like dipole moments, but they contribute relatively little to the correlation energy (by Brillouin’s theorem, singles do not mix directly with the HF determinant in the energy expression).
Double excitations promote two electrons simultaneously from occupied orbitals (i, j) to virtual orbitals (a, b). Doubles capture the dominant pairwise electron correlation and are responsible for most of the correlation energy recovery.

CISD (Configuration Interaction Singles and Doubles) includes all such excitations as fixed coefficients. In VQE, these excitations become parameterized quantum gates: SingleExcitation(theta, wires=[i, a]) and DoubleExcitation(theta, wires=[i, j, a, b]). The classical optimizer tunes the rotation angles to minimize the energy expectation value.

Counting Excitations

For a (4e, 8-qubit) active space with 4 electrons (2 alpha, 2 beta) in 8 spin-orbitals:

Occupied spin-orbitals: 4 (indices 0, 1, 2, 3)
Virtual spin-orbitals: 4 (indices 4, 5, 6, 7)
Singles: Each occupied-to-virtual pair gives one excitation. With spin conservation, you get single excitations within alpha and within beta spin blocks.
Doubles: Each pair of occupied to each pair of virtual, again respecting spin symmetry.

The excitations function computes the valid excitations automatically:

from pennylane.qchem import hf_state, excitations

hf = hf_state(electrons, qubits)   # use integer count, not list of indices
singles, doubles = excitations(electrons, qubits)

print(f"HF state: {hf}")
print(f"Singles: {len(singles)}, Doubles: {len(doubles)}")

For this active space, expect roughly 8 singles and 12-18 doubles (the exact count depends on spin-symmetry filtering). Each excitation adds one variational parameter to the circuit.

dev = qml.device("default.qubit", wires=qubits)

@qml.qnode(dev)
def circuit(params, excitations):
    qml.BasisState(hf, wires=range(qubits))
    for i, excitation in enumerate(doubles):
        qml.DoubleExcitation(params[i], wires=excitation)
    for i, excitation in enumerate(singles):
        qml.SingleExcitation(params[len(doubles) + i], wires=excitation)
    return qml.expval(H)

Note the ordering: doubles are applied before singles. This is a common convention because doubles contribute more to correlation energy, and applying them first allows the singles to further relax the orbitals on top of the correlated state.

Running VQE

Optimize the circuit parameters to find the ground state energy within the active space.

n_params = len(singles) + len(doubles)
params = qml.numpy.zeros(n_params, requires_grad=True)

opt = qml.GradientDescentOptimizer(stepsize=0.4)
all_excitations = doubles + singles

for step in range(20):
    params, energy = opt.step_and_cost(lambda p: circuit(p, all_excitations), params)
    if step % 10 == 0:
        print(f"Step {step:3d} | Energy = {energy:.6f} Ha")

Energy Units and Chemical Accuracy

VQE returns the energy in Hartree (Ha), the atomic unit of energy. Useful conversions:

Unit	Value
1 Hartree	27.211 eV
1 Hartree	627.5 kcal/mol
1 milliHartree (mHa)	0.627 kcal/mol
1 milliHartree (mHa)	27.211 meV

Chemical accuracy is the standard threshold for computational chemistry: 1 kcal/mol, approximately 1.6 mHa. If your VQE energy agrees with the FCI or CASSCF reference to within 1.6 mHa, the result is considered chemically accurate, meaning it can reliably predict thermochemical properties such as bond dissociation energies, reaction enthalpies, and activation barriers.

Convergence Checking

Monitor convergence by tracking the energy change between iterations. VQE has converged when the energy change drops below your target threshold:

energy_prev = 0.0
threshold = 1e-6  # Ha

for step in range(100):
    params, energy = opt.step_and_cost(lambda p: circuit(p, all_excitations), params)
    delta = abs(energy - energy_prev)
    if step % 10 == 0:
        print(f"Step {step:3d} | Energy = {energy:.8f} Ha | Delta = {delta:.2e}")
    if delta < threshold and step > 0:
        print(f"Converged at step {step} with energy {energy:.8f} Ha")
        break
    energy_prev = energy

If VQE does not converge within a reasonable number of steps, consider reducing the optimizer step size, switching to a more robust optimizer (such as qml.AdamOptimizer), or checking that the initial parameters are reasonable.

Comparing to FCI Reference

The active space energy is an approximation. Compare against FCI or CCSD to assess accuracy.

# CASSCF energy (classical, from PySCF)
# For H2O STO-3G with (4e, 4o) active space
casscf_energy = -75.0193   # hartree (example value)
vqe_energy = circuit(params, all_excitations)

print(f"VQE (active space):  {vqe_energy:.6f} Ha")
print(f"CASSCF reference:    {casscf_energy:.6f} Ha")
print(f"Difference:          {abs(vqe_energy - casscf_energy) * 1000:.3f} mHa")

A difference below 1.6 mHa (chemical accuracy) indicates the active space and ansatz are well chosen. If the VQE energy is higher than the CASSCF reference, the ansatz may lack expressibility (missing higher excitations) or the optimizer may be stuck in a local minimum. If the VQE energy is lower than the CASSCF reference, check for bugs; the variational principle guarantees the true ground state energy is a lower bound.

Choosing a Larger Active Space

If the result is not chemically accurate, expand the active space. Each additional orbital adds two qubits.

# Larger active space: 6 electrons in 5 orbitals (H2O STO-3G has 7 orbitals total,
# 2 core orbitals are frozen, leaving 5 available for the active space)
H_large, qubits_large = qchem.molecular_hamiltonian(
    symbols,
    coordinates,
    basis="sto-3g",
    active_electrons=6,
    active_orbitals=5,
)

print(f"Larger active space qubits: {qubits_large}")
print(f"Larger Hamiltonian terms:   {len(H_large.operands)}")

Going from (4e, 4o) to (6e, 5o) increases the qubit count from 8 to 10 and roughly doubles the number of Hamiltonian terms and variational parameters. The improvement in accuracy depends on whether the newly included orbitals contribute meaningfully to correlation. For water, the 2s orbital of oxygen (orbital 2) has moderate correlation with the valence orbitals, so including it can improve the energy by a few milliHartree.

Basis Set Considerations

The basis set determines how many spatial orbitals are available and how accurately they represent the true electronic wavefunction. STO-3G is a minimal basis set: it provides exactly the number of orbitals needed to hold all electrons, but its flexibility is very limited.

Larger basis sets include additional orbitals (polarization functions, diffuse functions) that improve the description of electron correlation and molecular properties. The tradeoff is more qubits:

Basis set	Spatial orbitals (H2O)	Full qubit count	(4e, 4o) qubits
STO-3G	7	14	8
6-31G	13	26	8
6-31G*	19	38	8
cc-pVDZ	24	48	8
cc-pVTZ	58	116	8

Notice that the active space qubit count stays fixed at 8 for a (4e, 4o) calculation regardless of basis set. The basis set affects the quality of the underlying orbitals (through the Hartree-Fock calculation) and the number of virtual orbitals available if you expand the active space. With cc-pVDZ, you could choose a (4e, 10o) active space with 20 qubits, gaining access to higher-quality virtual orbitals that STO-3G simply does not have.

For production quantum chemistry, STO-3G is rarely sufficient. The 6-31G* basis (also called 6-31G(d)) is a common starting point for quantitative work. However, larger basis sets require larger active spaces to exploit their additional orbitals, which increases the qubit and gate requirements.

Common Mistakes

Selecting too small an active space

Freezing orbitals that participate in important correlation leads to poor accuracy. For example, if you study the O-H bond dissociation of water but freeze the bonding orbital, VQE cannot describe the bond breaking. Always include the orbitals directly involved in the chemistry you want to capture.

Selecting too large an active space

Each additional active orbital adds 2 qubits, increases the Hamiltonian term count, and adds more variational parameters. On NISQ hardware with limited qubit counts and high gate error rates, an unnecessarily large active space produces deeper circuits with more noise. Start small and expand incrementally.

Not verifying the Hartree-Fock reference energy

VQE initializes from the Hartree-Fock state. If the HF calculation fails to converge, uses the wrong geometry, or has symmetry-breaking issues, the VQE starting point is wrong. Always verify the HF energy against a known reference before running VQE:

# Compute HF energy with PySCF directly for verification
from pyscf import gto, scf

pyscf_mol = gto.M(atom="O 0 0 0.1173; H 0 0.7572 -0.4692; H 0 -0.7572 -0.4692",
                   basis="sto-3g", unit="Angstrom")
mf = scf.RHF(pyscf_mol).run()
print(f"HF energy: {mf.e_tot:.6f} Ha")

Forgetting to freeze core orbitals

The oxygen 1s orbital (orbital 1) sits at roughly -20 Ha, far below the valence orbitals at -1 to +1 Ha. Including it in the active space adds 2 qubits but provides negligible correlation energy (typically less than 0.01 mHa). Worse, the large energy separation between core and valence orbitals can cause numerical issues in the optimizer. Always freeze deep core orbitals unless you have a specific reason to correlate them (such as studying core-level spectroscopy).

Ignoring spin symmetry

For closed-shell singlet molecules like water, the number of active electrons should be even, and alpha and beta electron counts should match. Requesting an odd number of active electrons for a singlet state produces an unphysical wavefunction. The mult parameter in qchem.active_space guards against this, but it is worth understanding why.

Summary

Active space methods let you concentrate quantum resources on the orbitals that matter most for molecular correlation. The workflow is:

Run Hartree-Fock to obtain the molecular orbital basis and verify the reference energy.
Identify the HOMO-LUMO region and select an active space centered on the Fermi level using qchem.active_space.
Build the reduced Hamiltonian with qchem.molecular_hamiltonian, specifying the active electron and orbital counts.
Construct a singles-and-doubles ansatz from the Hartree-Fock reference state within the active space.
Run VQE with convergence monitoring, targeting chemical accuracy (1.6 mHa) compared to CASSCF or FCI benchmarks.
If accuracy is insufficient, expand the active space incrementally or upgrade the basis set.

The active space approach is the primary strategy for making molecular simulations feasible on near-term quantum hardware. By freezing chemically inert orbitals and focusing qubits on the correlated electrons, you can study meaningful chemistry within the constraints of current devices.