Graph Coloring with D-Wave Ocean

Introduction

Graph coloring is one of the most widely studied problems in combinatorial optimization. The problem statement is simple: given a graph G = (V, E) and k colors, assign exactly one color to each vertex so that no two adjacent vertices share the same color. A coloring that satisfies this constraint is called a proper k-coloring. Finding the minimum k for which a proper coloring exists (the chromatic number of the graph) is NP-hard for k >= 3.

Despite the simplicity of the statement, graph coloring appears throughout computer science and engineering under different names:

Register allocation in compilers. When a compiler assigns CPU registers to program variables, it builds an interference graph where variables that are “live” at the same time become adjacent vertices. Colors correspond to registers. A valid coloring assigns registers so that no two simultaneously-live variables share a register.
Exam scheduling. Students who share a course cannot have their exams in the same time slot. Each course is a vertex, edges connect courses with overlapping enrollment, and colors represent time slots. A proper coloring produces a conflict-free schedule.
Wireless frequency assignment. Radio towers that are geographically close enough to interfere with each other form adjacent vertices. Available frequencies are the colors. A proper coloring ensures that no two nearby towers broadcast on the same frequency.
Map coloring. Adjacent countries on a map need different colors for visual clarity. The four-color theorem guarantees that k = 4 always suffices for any planar map, but finding an optimal coloring for a specific map is still computationally hard.

In this tutorial, you formulate graph coloring as a QUBO (Quadratic Unconstrained Binary Optimization), solve it with D-Wave’s simulated annealer and real quantum hardware, and learn how to tune, decode, and scale the approach.

The Petersen Graph

Throughout this tutorial, we use the Petersen graph as the test case. The Petersen graph has 10 vertices and 15 edges. It is 3-regular, meaning every vertex has exactly degree 3. It is also vertex-transitive: every vertex “looks the same” from a symmetry perspective. Its chromatic number is exactly 3, so it requires at least 3 colors for a proper coloring.

Why choose it? The Petersen graph is small enough to solve quickly on a classical computer (so you can verify results), but non-trivial enough to exercise the QUBO formulation properly. It is one of the standard benchmark graphs in combinatorial algorithm research. Many graph theory counterexamples involve the Petersen graph, so it is a natural first test for any graph algorithm implementation.

QUBO Formulation

D-Wave’s quantum annealer solves QUBO problems. To use it for graph coloring, we encode the problem using binary decision variables and penalty terms.

Binary Variables

Define a binary variable x_{i,c} for each vertex i and each color c:

x_{i,c} = 1 means vertex i is assigned color c
x_{i,c} = 0 means vertex i is not assigned color c

Why binary variables? Each qubit on the D-Wave hardware is naturally a binary variable (0 or 1). To represent a discrete choice (which color does vertex i get?), we use a one-hot encoding: for each vertex, exactly one of the k color variables is 1 and the rest are 0.

The total number of binary variables is N * k, where N is the number of vertices and k is the number of colors. For the Petersen graph with 3 colors, that is 10 * 3 = 30 binary variables.

Constraint 1: One Color Per Node

For each vertex i, exactly one x_{i,c} must equal 1. We enforce this with a penalty term that equals zero when the constraint is satisfied and is positive otherwise:

P1 = A * sum_i (1 - sum_c x_{i,c})^2

Here A > 0 is a penalty weight. When exactly one color variable is 1, the inner expression is (1 - 1)^2 = 0. When zero or two or more are 1, the penalty is positive.

Let us expand this explicitly for k = 3 colors. For a single vertex i:

(1 - x_{i,0} - x_{i,1} - x_{i,2})^2
  = 1
    - 2*x_{i,0} - 2*x_{i,1} - 2*x_{i,2}
    + x_{i,0}^2 + x_{i,1}^2 + x_{i,2}^2
    + 2*x_{i,0}*x_{i,1} + 2*x_{i,0}*x_{i,2} + 2*x_{i,1}*x_{i,2}

Since x is binary, x^2 = x. Substituting:

  = 1
    - 2*x_{i,0} - 2*x_{i,1} - 2*x_{i,2}
    + x_{i,0} + x_{i,1} + x_{i,2}
    + 2*x_{i,0}*x_{i,1} + 2*x_{i,0}*x_{i,2} + 2*x_{i,1}*x_{i,2}

Combining the linear terms:

  = 1
    - x_{i,0} - x_{i,1} - x_{i,2}
    + 2*x_{i,0}*x_{i,1} + 2*x_{i,0}*x_{i,2} + 2*x_{i,1}*x_{i,2}

The constant term (1) shifts the overall energy but does not affect which solution the annealer selects, so we can drop it from the QUBO. What remains are:

Linear (diagonal) terms: coefficient -A for each x_{i,c}
Quadratic (off-diagonal) terms: coefficient +2A for each pair (x_{i,c1}, x_{i,c2}) where c1 < c2

These are the terms that appear in the code below.

Constraint 2: Adjacent Nodes Differ

For each edge (i, j) in the graph and each color c, we penalize the case where both x_{i,c} = 1 and x_{j,c} = 1:

P2 = B * sum_{(i,j) in E} sum_c x_{i,c} * x_{j,c}

This term is already quadratic. If vertex i and vertex j share color c, the product x_{i,c} * x_{j,c} = 1, incurring a penalty of B. If they differ, the product is 0.

The Full Objective and Zero Energy

The full QUBO objective is:

Q = P1 + P2

A valid coloring satisfies all constraints, so every penalty term is 0. This means valid colorings have energy exactly 0 (ignoring the dropped constant). Invalid colorings have positive energy. The annealer minimizes energy, so it naturally searches for valid colorings.

This is a pure constraint satisfaction problem: there is no additional objective to optimize beyond satisfying the constraints. If you wanted to also minimize the number of colors used, you would add a third term to the objective, but that is beyond the scope of this tutorial.

Setup

pip install dwave-ocean-sdk networkx

Configure your D-Wave API token:

dwave config create

Or set the environment variable DWAVE_API_TOKEN.

Formulating the QUBO with dimod

import networkx as nx
import dimod
from itertools import product

# Small test graph: Petersen graph (chromatic number 3)
G = nx.petersen_graph()
n_nodes = G.number_of_nodes()
k = 3  # number of colors
colors = list(range(k))

A = 6.0  # Penalty weight for one-color-per-node constraint
B = 6.0  # Penalty weight for no-adjacent-same-color constraint

# Variable labeling: (node, color) -> binary variable index
def var(node, color):
    return f"x_{node}_{color}"

Q = {}

def add_qubo(i, j, val):
    key = (i, j) if i <= j else (j, i)
    Q[key] = Q.get(key, 0.0) + val

# Constraint 1: each node gets exactly one color
for node in G.nodes():
    # Expand (1 - sum_c x_{node,c})^2
    # Linear terms: coefficient -2A for each x_{node,c}; constant A ignored (shifts energy)
    for c in colors:
        add_qubo(var(node, c), var(node, c), -A)
    # Cross terms: 2A for each pair (c1, c2), c1 < c2
    for c1 in colors:
        for c2 in colors:
            if c1 < c2:
                add_qubo(var(node, c1), var(node, c2), 2 * A)

# Constraint 2: adjacent nodes must differ in color
for (u, v) in G.edges():
    for c in colors:
        add_qubo(var(u, c), var(v, c), B)

bqm = dimod.BinaryQuadraticModel.from_qubo(Q)
print(f"Variables: {bqm.num_variables}, Interactions: {bqm.num_interactions}")

Solving on a Simulated Annealer (Local Test)

Before using real D-Wave hardware, validate the formulation with the classical simulated annealer included in Ocean. Note that older docs import this sampler from dimod; the maintained implementation now ships in the dwave.samplers package:

from dwave.samplers import SimulatedAnnealingSampler

sampler = SimulatedAnnealingSampler()
response = sampler.sample(bqm, num_reads=1000, num_sweeps=1000)

best = response.first
print(f"Best energy: {best.energy:.2f}")

# Decode the solution
coloring = {}
for node in G.nodes():
    assigned = [c for c in colors if best.sample.get(var(node, c), 0) == 1]
    coloring[node] = assigned[0] if assigned else None

print("Node coloring:", coloring)

# Verify: check no adjacent nodes share a color
violations = [(u, v) for (u, v) in G.edges() if coloring[u] == coloring[v]]
print(f"Constraint violations: {len(violations)}")

An energy of 0 means both constraint families are satisfied and a valid coloring was found.

Understanding the Solution Decoding

The raw output from any D-Wave sampler is a dictionary mapping variable names to binary values. For graph coloring, it looks like this:

{
    'x_0_0': 1, 'x_0_1': 0, 'x_0_2': 0,
    'x_1_0': 0, 'x_1_1': 1, 'x_1_2': 0,
    'x_2_0': 0, 'x_2_1': 0, 'x_2_2': 1,
    ...
}

Each group of k variables corresponds to one vertex. The variable that equals 1 tells you the assigned color. In the example above:

Vertex 0 is assigned color 0 (because x_0_0 = 1)
Vertex 1 is assigned color 1 (because x_1_1 = 1)
Vertex 2 is assigned color 2 (because x_2_2 = 1)

The decoding loop in the code collects, for each vertex, the list of colors where the variable is 1 and picks the first one. If the list is empty (all zeros for that vertex), the vertex gets None, indicating the annealer failed to assign it a color.

After decoding, you should always verify the coloring. Here is a more thorough verification using NetworkX:

def verify_coloring(G, coloring, k):
    """Check that a coloring is valid and report any issues."""
    issues = []

    # Check that every node has a color assigned
    for node in G.nodes():
        if coloring[node] is None:
            issues.append(f"Node {node}: no color assigned")
        elif coloring[node] not in range(k):
            issues.append(f"Node {node}: invalid color {coloring[node]}")

    # Check that no adjacent nodes share a color
    for (u, v) in G.edges():
        if coloring[u] is not None and coloring[v] is not None:
            if coloring[u] == coloring[v]:
                issues.append(f"Edge ({u}, {v}): both have color {coloring[u]}")

    if not issues:
        print("Valid coloring!")
    else:
        print(f"Found {len(issues)} issue(s):")
        for issue in issues:
            print(f"  - {issue}")

    return len(issues) == 0

verify_coloring(G, coloring, k)

You can also check the one-hot constraint directly on the raw sample:

def check_one_hot(sample, n_nodes, k):
    """Verify that each node has exactly one color set to 1."""
    for node in range(n_nodes):
        active = sum(sample.get(var(node, c), 0) for c in range(k))
        if active != 1:
            print(f"Node {node}: {active} colors active (expected 1)")
    print("One-hot check complete.")

check_one_hot(best.sample, n_nodes, k)

Solving on D-Wave Hardware

from dwave.system import DWaveSampler, EmbeddingComposite

sampler_hw = EmbeddingComposite(DWaveSampler())
response_hw = sampler_hw.sample(bqm, num_reads=200, annealing_time=20)

best_hw = response_hw.first
print(f"Hardware best energy: {best_hw.energy:.2f}")

coloring_hw = {}
for node in G.nodes():
    assigned = [c for c in colors if best_hw.sample.get(var(node, c), 0) == 1]
    coloring_hw[node] = assigned[0] if assigned else None

violations_hw = [(u, v) for (u, v) in G.edges() if coloring_hw[u] == coloring_hw[v]]
print(f"Hardware violations: {len(violations_hw)}")

EmbeddingComposite handles the minor embedding step automatically, mapping logical variables onto physical qubits. For larger graphs, use LazyFixedEmbeddingComposite to reuse an embedding across multiple runs.

Tuning Penalty Weights

If you see many violations, increase A and B. If solutions cluster at the minimum but are slow to find, reduce them slightly. Getting the penalty weights right is one of the most important practical skills in QUBO formulation.

Why the Weights Matter

Graph coloring is a pure constraint satisfaction problem. There is no separate objective function to minimize; the entire QUBO consists of penalty terms. A valid coloring has energy 0, and any violation pushes the energy above 0. The penalty weights A and B control how strongly the annealer is discouraged from violating each constraint family.

The two constraints can compete with each other. If B is much larger than A, the annealer might satisfy the adjacency constraint (Constraint 2) by leaving some vertices with no color at all, violating Constraint 1 instead. If A is much larger than B, the annealer assigns exactly one color per vertex but might allow adjacent vertices to share a color.

The Rule of Thumb

A practical starting point is A = B = max_degree(G) + 1. The reasoning: consider a single vertex i with degree d. Violating Constraint 1 at vertex i (by leaving it uncolored) saves at most B * d in Constraint 2 penalties, because the vertex touches d edges. For Constraint 1 to dominate this incentive, A must exceed B * d / (penalty for a one-hot violation). Since a one-hot violation costs A, we need A > B * d, which for A = B gives A > d, so A = d + 1 suffices.

For the Petersen graph with max degree 3, this gives A = B = 4. We use A = B = 6 in the code above for extra margin.

Penalty Sweep

In practice, it helps to sweep over penalty values and compare results:

from dwave.samplers import SimulatedAnnealingSampler

sampler = SimulatedAnnealingSampler()

for penalty in [2.0, 4.0, 6.0, 8.0]:
    # Rebuild the QUBO with the new penalty weights
    Q_sweep = {}

    def add_sweep(i, j, val):
        key = (i, j) if i <= j else (j, i)
        Q_sweep[key] = Q_sweep.get(key, 0.0) + val

    for node in G.nodes():
        for c in colors:
            add_sweep(var(node, c), var(node, c), -penalty)
        for c1 in colors:
            for c2 in colors:
                if c1 < c2:
                    add_sweep(var(node, c1), var(node, c2), 2 * penalty)

    for (u, v) in G.edges():
        for c in colors:
            add_sweep(var(u, c), var(v, c), penalty)

    bqm_sweep = dimod.BinaryQuadraticModel.from_qubo(Q_sweep)
    response_sweep = sampler.sample(bqm_sweep, num_reads=500, num_sweeps=1000)

    # Count how many of the 500 reads found a valid coloring (energy = 0)
    valid_count = sum(1 for datum in response_sweep.data()
                      if abs(datum.energy) < 1e-6)

    print(f"A = B = {penalty:.1f}: valid colorings = {valid_count}/500 "
          f"(best energy = {response_sweep.first.energy:.2f})")

You should see that A = B = 2 produces many violations (the penalty is too weak to enforce constraints), while A = B = 4 and above produce mostly valid colorings. Values that are too large (say, A = B = 100) still work but can make the energy landscape steeper, which sometimes slows convergence on hardware.

Scaling to Larger Graphs

The QUBO for graph coloring has N * k variables and roughly N * k * (k - 1) / 2 + |E| * k interactions. Here is how this scales:

Graph size	Colors	Variables	Approx. interactions	Approach
10 nodes (Petersen)	3	30	~75	Direct QPU or SimulatedAnnealing
100 nodes	4	400	~1,800	Direct QPU (Advantage has 5,000+ qubits)
1,000 nodes	4	4,000	~18,000	Too large for direct embedding; use LeapHybridSampler
10,000 nodes	4	40,000	~180,000	Definitely hybrid

For problems that exceed the QPU’s capacity, D-Wave provides LeapHybridSampler, which partitions the problem and uses a combination of classical and quantum resources. Switching to it requires minimal code changes:

from dwave.system import LeapHybridSampler

# Build the BQM the same way as before (using the code above)
# ...

# For large problems, substitute LeapHybridSampler
sampler_hybrid = LeapHybridSampler()
response_hybrid = sampler_hybrid.sample(bqm)

best_hybrid = response_hybrid.first
print(f"Hybrid best energy: {best_hybrid.energy:.2f}")

Note that LeapHybridSampler does not accept num_reads the same way the QPU sampler does. It returns a single best solution per call. It also requires a Leap account with hybrid solver access.

For very large graphs, you can also consider decomposition strategies: partition the graph into subgraphs, color each subgraph independently, and then reconcile the boundaries. The dwave-hybrid package provides tools for this workflow, but the details are beyond the scope of this tutorial.

Common Mistakes

Here are pitfalls that frequently trip up people new to QUBO-based graph coloring:

Setting A = B = 1

When penalty weights are too small, the annealer finds it “cheap” to violate constraints. For the Petersen graph, A = B = 1 almost always produces invalid colorings because the penalty for violating one constraint is comparable to the penalty for violating another. The constraints do not dominate the landscape. Always start with A = B = max_degree + 1 or higher.

Expecting Negative Energy for Valid Solutions

A valid coloring has energy exactly 0 (after dropping the constant). This catches many people off guard: the “best possible” energy is not some large negative number, it is zero. If you see a negative energy, you have a bug in the QUBO construction (likely double-counted a term or used the wrong sign). If you see a small positive energy close to zero, the solution is almost valid but has a minor violation.

Not Checking Embedding Feasibility

EmbeddingComposite attempts to find a minor embedding of your logical graph into the QPU’s physical topology. For large or dense graphs, this can fail with a timeout error or produce an embedding that uses so many physical qubits per logical variable that the results are noisy. Before committing to a long QPU run, check the embedding:

import dimod
from minorminer import find_embedding
from dwave.system import DWaveSampler

qpu = DWaveSampler()
target = qpu.to_networkx_graph()
source = dimod.to_networkx_graph(bqm)

try:
    embedding = find_embedding(source, target, timeout=60)
    max_chain = max(len(chain) for chain in embedding.values())
    print(f"Embedding found. Max chain length: {max_chain}")
    if max_chain > 7:
        print("Warning: long chains degrade solution quality. "
              "Consider LeapHybridSampler instead.")
except Exception as e:
    print(f"Embedding failed: {e}")
    print("Problem is too large for direct QPU. Use LeapHybridSampler.")

Chain lengths above 7 or so significantly increase the chance of chain breaks (where physical qubits in the same chain disagree), which corrupts the logical variable’s value.

Handling All-Zero Vertex Assignments

When the annealer fails to find a valid coloring, some vertices may have all their color variables set to 0. The decoding code handles this by assigning None, but if you feed the result into downstream code that expects integer colors, you get crashes or silent bugs. Always check for None values before using the coloring:

uncolored = [node for node in G.nodes() if coloring[node] is None]
if uncolored:
    print(f"Warning: {len(uncolored)} nodes have no color assigned: {uncolored}")
    print("Try increasing penalty weights or num_reads.")

Similarly, some vertices might have two or more color variables set to 1 (a one-hot violation). The simple decoding code picks the first match, which silently hides the violation. The check_one_hot function above catches this.

What You Can Reuse From This Tutorial

The QUBO pattern in this tutorial follows a general recipe:

One-hot encoding for discrete choices (assign exactly one option from a set)
Pairwise penalty terms for interaction constraints (adjacent elements must differ)
Penalty weight tuning to balance competing constraint families

This same recipe applies directly to many other combinatorial optimization problems:

Maximum independent set. Find the largest set of vertices with no two adjacent. Use one binary variable per vertex (x_i = 1 if vertex i is in the set). The penalty term for adjacency is the same as Constraint 2 here, with k = 1. The objective term is -sum(x_i), which the annealer minimizes to maximize the set size.
Minimum vertex cover. Find the smallest set of vertices that touches every edge. This is the complement of independent set: use x_i = 1 if vertex i is in the cover, penalize edges where neither endpoint is covered (x_i = 0 and x_j = 0), and minimize sum(x_i).
Set cover / set packing. Given a collection of sets, choose subsets that cover (or pack into) a universe. The one-hot constraint generalizes to “at least one set covers each element” or “at most one set contains each element.” The pairwise penalty generalizes to penalizing overlapping sets.
Clique finding. Find the largest complete subgraph. Use one binary variable per vertex. Penalize pairs of selected vertices that are not connected by an edge (the opposite of graph coloring, which penalizes pairs that are connected and share a color). Maximize sum(x_i).
Exam scheduling / job shop scheduling. These are graph coloring in disguise. Build the conflict graph and apply the same QUBO. If you also want to minimize the number of time slots (colors), add a term that penalizes using more colors.

The key insight is that one-hot encoding plus pairwise penalties is a universal building block for QUBO formulations of constraint satisfaction problems. Once you are comfortable with graph coloring, adapting the pattern to new problems is mostly a matter of identifying which constraints map to one-hot terms and which map to pairwise penalties.

Key Points

Graph coloring is formulated as a QUBO using N * k binary variables and two families of penalty terms.
Valid colorings have energy 0. The annealer minimizes energy, naturally searching for valid solutions.
Penalty weights should be set to at least max_degree + 1 as a starting point.
Always verify solutions by checking both the one-hot constraint and the adjacency constraint.
For problems with more than roughly 1,000 variables, use LeapHybridSampler instead of direct QPU access.
The one-hot plus pairwise penalty pattern generalizes to many combinatorial optimization problems.