Cookbook » Subflow Tasking

It is very common for a parallel program to spawn task dependency graphs at runtime. In Taskflow, we call this subflow tasking.

Create a Subflow

Subflow tasks are those created during the execution of a graph. These tasks are spawned from a parent task and are grouped together to a subflow dependency graph. To create a subflow, emplace a callable that takes an argument of type tf::Subflow. A tf::Subflow object will be created and forwarded to the execution context of the task. All methods you find in tf::Taskflow are applicable for tf::Subflow.

 1: tf::Taskflow taskflow;
 2: tf::Executor executor;
 3:
 4: tf::Task A = taskflow.emplace([] () {}).name("A");  // static task A
 5: tf::Task C = taskflow.emplace([] () {}).name("C");  // static task C
 6: tf::Task D = taskflow.emplace([] () {}).name("D");  // static task D
 7:
 8: tf::Task B = taskflow.emplace([] (tf::Subflow& subflow) { 
 9:   tf::Task B1 = subflow.emplace([] () {}).name("B1");  // subflow task B1
10:   tf::Task B2 = subflow.emplace([] () {}).name("B2");  // subflow task B2
11:   tf::Task B3 = subflow.emplace([] () {}).name("B3");  // subflow task B3
12:   B1.precede(B3);  // B1 runs before B3
13:   B2.precede(B3);  // B2 runs before B3
14: }).name("B");
15:
16: A.precede(B);  // B runs after A
17: A.precede(C);  // C runs after A
18: B.precede(D);  // D runs after B
19: C.precede(D);  // D runs after C
20:
21: executor.run(taskflow).get();  // execute the graph to spawn the subflow
22: taskflow.dump(std::cout);      // dump the taskflow to a DOT format
Taskflow cluster_p0x7ffee9781810 Taskflow cluster_p0x7f9866c01b70 Subflow: B p0x7f9866c01820 A p0x7f9866c01b70 B p0x7f9866c01820->p0x7f9866c01b70 p0x7f9866c01930 C p0x7f9866c01820->p0x7f9866c01930 p0x7f9866c01a40 D p0x7f9866c01b70->p0x7f9866c01a40 p0x7f9866c01930->p0x7f9866c01a40 p0x7f9866d01880 B1 p0x7f9866d01ac0 B3 p0x7f9866d01880->p0x7f9866d01ac0 p0x7f9866d01ac0->p0x7f9866c01b70 p0x7f9866d019a0 B2 p0x7f9866d019a0->p0x7f9866d01ac0

Debrief:

  • Lines 1-2 create a taskflow and an executor
  • Lines 4-6 create three tasks, A, C, and D
  • Lines 8-14 create a task B that spawns a task dependency graph of three tasks B1, B2, and B3
  • Lines 16-19 add dependencies among A, B, C, and D
  • Line 21 submits the graph to an executor and waits until it finishes
  • Line 22 dumps the entire task dependency graph

Lines 8-14 are the main block to enable subflow tasking at task B. The runtime will create a tf::Subflow passing it to task B, and spawn a dependency graph as described by the associated callable. This new subflow graph will be added to the topology of its parent task B. Due to the property of subflow tasking, we cannot dump its structure before execution. We will need to run the graph first to spawn the graph and then call tf::Taskflow::dump.

Join a Subflow

By default, a subflow joins its parent task when the program leaves its execution context. All nodes of zero outgoing edges in the subflow precede its parent task. You can explicitly join a subflow within its execution context to carry out recursive patterns. A famous implementation is fibonacci recursion.

int spawn(int n, tf::Subflow& sbf) {
  if (n < 2) return n;
  int res1, res2;
  sbf.emplace([&res1, n] (tf::Subflow& sbf) { res1 = spawn(n - 1, sbf); } );
  sbf.emplace([&res2, n] (tf::Subflow& sbf) { res2 = spawn(n - 2, sbf); } );
  sbf.join();    // join to materialize the subflow immediately
  return res1 + res2;
}
  
taskflow.emplace([&res] (tf::Subflow& sbf) { 
  res = spawn(5, sbf);  
});

executor.run(taskflow).wait();

The code above computes the fifth fibonacci number using recursive subflow. Calling tf::Subflow::join immediately materializes the subflow by executing all associated tasks to recursively compute fibonacci numbers. The taskflow graph is shown below:

Taskflow cluster_p0x7ffd972c0cd0 Taskflow: fibonacci cluster_p0xa445c0 Subflow: 5 cluster_p0x7fe918000b90 Subflow: 4 cluster_p0x7fe910000b90 Subflow: 3 cluster_p0x7fe918000fe0 Subflow: 2 cluster_p0x7fe910000c48 Subflow: 2 cluster_p0x7fe918000c48 Subflow: 3 cluster_p0x7fe918000d00 Subflow: 2 p0xa445c0 5 p0x7fe918000b90 4 p0x7fe918000b90->p0xa445c0 p0x7fe910000b90 3 p0x7fe910000b90->p0x7fe918000b90 p0x7fe918000fe0 2 p0x7fe918000fe0->p0x7fe910000b90 p0x7fe918001150 1 p0x7fe918001150->p0x7fe918000fe0 p0x7fe918001208 0 p0x7fe918001208->p0x7fe918000fe0 p0x7fe918001098 1 p0x7fe918001098->p0x7fe910000b90 p0x7fe910000c48 2 p0x7fe910000c48->p0x7fe918000b90 p0x7fe910000d00 1 p0x7fe910000d00->p0x7fe910000c48 p0x7fe910000db8 0 p0x7fe910000db8->p0x7fe910000c48 p0x7fe918000c48 3 p0x7fe918000c48->p0xa445c0 p0x7fe918000d00 2 p0x7fe918000d00->p0x7fe918000c48 p0x7fe918000e70 1 p0x7fe918000e70->p0x7fe918000d00 p0x7fe918000f28 0 p0x7fe918000f28->p0x7fe918000d00 p0x7fe918000db8 1 p0x7fe918000db8->p0x7fe918000c48

Our implementation to join subflows is recursive in order to preserve the thread context in each subflow task. Having a deep recursion of subflows may cause stack overflow.

Detach a Subflow

In contract to joined subflow, you can detach a subflow from its parent task, allowing its execution to flow independently.

 1: tf::Taskflow taskflow;
 2:
 3: tf::Task A = taskflow.emplace([] () {}).name("A");  // static task A
 4: tf::Task C = taskflow.emplace([] () {}).name("C");  // static task C
 5: tf::Task D = taskflow.emplace([] () {}).name("D");  // static task D
 6:
 7: tf::Task B = taskflow.emplace([] (tf::Subflow& subflow) { 
 8:   tf::Task B1 = subflow.emplace([] () {}).name("B1");  // static task B1
 9:   tf::Task B2 = subflow.emplace([] () {}).name("B2");  // static task B2
10:   tf::Task B3 = subflow.emplace([] () {}).name("B3");  // static task B3
11:   B1.precede(B3);    // B1 runs before B3
12:   B2.precede(B3);    // B2 runs before B3
13:   subflow.detach();  // detach this subflow
14: }).name("B");
15:
16: A.precede(B);  // B runs after A
17: A.precede(C);  // C runs after A
18: B.precede(D);  // D runs after B
19: C.precede(D);  // D runs after C
20:
21: tf::Executor executor;
22: executor.run(taskflow).wait();       // execute the graph to spawn the subflow
22: taskflow.dump(std::cout);            // dump the taskflow to DOT format

The figure below demonstrates a detached subflow based on the previous example. A detached subflow will eventually join the topology of its parent task.

Taskflow cluster_p0x7ffeecc59810 Taskflow cluster_p0x7fdc0dc02b60 Subflow: B p0x7fdc0dc02830 A p0x7fdc0dc02b60 B p0x7fdc0dc02830->p0x7fdc0dc02b60 p0x7fdc0dc02940 C p0x7fdc0dc02830->p0x7fdc0dc02940 p0x7fdc0dc02a50 D p0x7fdc0dc02b60->p0x7fdc0dc02a50 p0x7fdc0dc02940->p0x7fdc0dc02a50 p0x7fdc0de00120 B1 p0x7fdc0de00360 B3 p0x7fdc0de00120->p0x7fdc0de00360 p0x7fdc0de00240 B2 p0x7fdc0de00240->p0x7fdc0de00360

Detached subflow becomes an independent graph attached to the top-most taskflow. Running a taskflow multiple times will accumulate all detached tasks in the graph. For example, running the above taskflow 5 times results in a total of 19 tasks.

executor.run_n(taskflow, 5).wait();
assert(taskflow.num_tasks() == 19);
taskflow.dump(std::cout);

The dumped graph is shown as follows:

Taskflow p0x934ff0 A p0x935218 B p0x934ff0->p0x935218 p0x9350a8 C p0x934ff0->p0x9350a8 p0x935160 D p0x935218->p0x935160 p0x9350a8->p0x935160 p0x7fd564000b90 B1 p0x7fd564000d00 B3 p0x7fd564000b90->p0x7fd564000d00 p0x7fd564000c48 B2 p0x7fd564000c48->p0x7fd564000d00 p0x7fd55c000b90 B1 p0x7fd55c000d00 B3 p0x7fd55c000b90->p0x7fd55c000d00 p0x7fd55c000c48 B2 p0x7fd55c000c48->p0x7fd55c000d00 p0x7fd55c000db8 B1 p0x7fd55c000f28 B3 p0x7fd55c000db8->p0x7fd55c000f28 p0x7fd55c000e70 B2 p0x7fd55c000e70->p0x7fd55c000f28 p0x7fd55c000fe0 B1 p0x7fd55c001150 B3 p0x7fd55c000fe0->p0x7fd55c001150 p0x7fd55c001098 B2 p0x7fd55c001098->p0x7fd55c001150 p0x7fd55c001208 B1 p0x7fd55c001378 B3 p0x7fd55c001208->p0x7fd55c001378 p0x7fd55c0012c0 B2 p0x7fd55c0012c0->p0x7fd55c001378

Create a Nested Subflow

A subflow can be nested or recursive. You can create another subflow from the execution of a subflow and so on.

 1: tf::Taskflow taskflow;
 2:
 3: tf::Task A = taskflow.emplace([] (tf::Subflow& sbf){
 4:   std::cout << "A spawns A1 & subflow A2\n";
 5:   tf::Task A1 = sbf.emplace([] () {
 6:     std::cout << "subtask A1\n";
 7:   }).name("A1");
 8:
 9:   tf::Task A2 = sbf.emplace([] (tf::Subflow& sbf2){
10:     std::cout << "A2 spawns A2_1 & A2_2\n";
11:     tf::Task A2_1 = sbf2.emplace([] () {
12:       std::cout << "subtask A2_1\n";
13:     }).name("A2_1");
14:     tf::Task A2_2 = sbf2.emplace([] () {
15:       std::cout << "subtask A2_2\n";
16:     }).name("A2_2");
17:     A2_1.precede(A2_2);
18:   }).name("A2");
19:   A1.precede(A2);
20: }).name("A");
21:
22: // execute the graph to spawn the subflow
23: tf::Executor().run(taskflow).get();
24: taskflow.dump(std::cout);
Taskflow cluster_p0x7ffeeca03810 Taskflow cluster_p0x7fbc40c02830 Subflow: A cluster_p0x7fbc40d00240 Subflow: A2 p0x7fbc40c02830 A p0x7fbc40d00120 A1 p0x7fbc40d00240 A2 p0x7fbc40d00120->p0x7fbc40d00240 p0x7fbc40d00240->p0x7fbc40c02830 p0x7fbc40d00360 A2_1 p0x7fbc40d00470 A2_2 p0x7fbc40d00360->p0x7fbc40d00470 p0x7fbc40d00470->p0x7fbc40d00240

Debrief:

  • Line 1 creates a taskflow object
  • Lines 3-20 create a task to spawn a subflow of two tasks A1 and A2
  • Lines 9-18 spawn another subflow of two tasks A2_1 and A2_2 out of its parent task A2
  • Lines 23-24 runs the graph asynchronously and dump its structure when it finishes

Similarly, you can detach a nested subflow from its parent subflow. A detached subflow will run independently and eventually join the topology of its parent subflow.