We implement a parallel graph coloring solver using tf::TaskGroup for recursive task parallelism. This example demonstrates branch-and-bound search with symmetry breaking on a dynamically growing task tree, and shows how tf::TaskGroup enables efficient parallel exploration of combinatorial search spaces.
Graph coloring assigns a color to each vertex of a graph such that no two adjacent vertices (vertices connected by an edge) share the same color. The goal is to find an assignment that uses as few colors as possible. The minimum number of colors needed is called the chromatic number of the graph.
Graph coloring appears throughout computer science and engineering. In compiler register allocation, each variable is a vertex and two variables share an edge if they are live at the same time; coloring the graph assigns registers to variables without conflicts. In exam scheduling, each course is a vertex and two courses share an edge if any student is enrolled in both; coloring assigns time slots with no student taking two exams simultaneously. In wireless network frequency assignment, transmitters that could interfere with each other share an edge; coloring assigns frequencies to avoid interference.
Finding the chromatic number is NP-hard in general. The only known exact algorithms explore an exponential search space, which makes parallelism essential for large graphs. Each branch of the search can be explored independently on a separate CPU core, and branches that cannot improve on the best coloring found so far can be pruned immediately.
Consider the following 5-vertex graph, which is a 5-cycle (0-1-2-3-4-0) with one additional edge between vertices 0 and 2:
The edges are: 0-1, 0-2, 0-4, 1-2, 2-3, 3-4. The chromatic number of this graph is 3: vertices 0, 1, and 2 form a triangle (all mutually adjacent), so they each need a distinct color, and the remaining vertices can reuse those three colors. The branch-and-bound algorithm assigns colors to vertices one by one in order (0, 1, 2, 3, 4). At each vertex it tries all valid colors and prunes whenever the partial coloring already uses as many colors as the best complete coloring found so far.
Since colors are interchangeable labels, we can apply symmetry breaking to avoid redundancy by swapping all occurrences of color 0 and color 1 throughout a valid coloring gives an equally valid coloring. Without any constraint, the search tree explores all these equivalent relabelings and wastes time on duplicate work. The fix is to require colors to be introduced in order: the first vertex always gets color 0, and a new color k+1 may only be assigned to a vertex if color k has already been used somewhere. This ensures each structurally distinct coloring is explored exactly once.
In our example, vertex 0 must be color 0 (only choice by symmetry breaking). Vertex 1 is adjacent to vertex 0, so it cannot be color 0; by symmetry breaking it can only be color 1 (introducing color 1 for the first time). Vertex 2 is adjacent to both 0 and 1, so it cannot be 0 or 1; symmetry breaking allows color 2 (introducing it for the first time). From here the branching opens up for vertex 3.
The figure below shows the complete search tree. Peach nodes are explored, red nodes are pruned, yellow nodes are complete colorings that are not improvements, and the green node is the first optimal 3-coloring found:
After the first leaf [0,1,2,0,1] establishes best=3, the branches v3=1 and v3=3 are immediately pruned: assigning color 1 to vertex 3 keeps max_color=2 (3 colors) which equals best and cannot improve it, and assigning color 3 to vertex 3 introduces a 4th color which is worse. Only 8 nodes are evaluated out of a much larger unconstrained tree.
The expected output is:
There are a few important design points worth noting for this example, which also apply generally to parallel search algorithms:
k, this causes up to k! times more work than necessary. The constraint that colors must be introduced in order 0, 1, 2, ... ensures each distinct coloring structure is visited exactly once. For our 3-colorable example this is a 3! = 6x reduction in the number of candidate colorings explored.std::atomic<int> best_colors is checked at the entry of every recursive call. When any worker finds a k-coloring, all other workers immediately see the tighter bound and prune any branch that would need k or more colors. This is the same mechanism as in the TSP example: shared atomic state turns the parallel search into a cooperative effort rather than independent races.color_graph creates its own tg via executor.task_group(), valid because every call runs inside a worker thread. The first valid color is explored on the current worker to avoid task overhead, and remaining valid colors are spawned as tg.silent_async tasks. tg.corun() then cooperatively waits for all spawned branches without blocking the worker thread.colors and forbidden vectors are copied on each recursive call, which is simple and correct but not memory-efficient for large graphs. For graphs with hundreds of vertices, a bitmask representation of the color assignment and a single stack-allocated forbidden set would significantly reduce allocation overhead.