Runtime Tasking
Taskflow allows tasks to interact with the scheduling runtime by taking a runtime object as an argument. This feature is particularly useful for implementing recursive parallel algorithms that create tasks dynamically.
Create a Runtime Task
A runtime task is a special type of task that interacts directly with the Taskflow scheduling runtime through a tf::
tf::Executor executor; tf::Taskflow taskflow; taskflow.emplace([&](tf::Runtime& rt){ // a runtime task assert(&(rt.executor()) == &executor); }); executor.run(taskflow).wait();
Create Asynchronous Tasks from a Runtime Task
Similar to tf::A that spawns 1,000 asynchronous tasks during its execution. These 1,000 asynchronous tasks will be implicitly joined before task A completes, ensuring that all of them finish before the runtime proceeds to task B.
tf::Executor executor(num_threads); tf::Taskflow taskflow; std::atomic<size_t> counter(0); tf::Task A = taskflow.emplace([&](tf::Runtime& rt){ // spawn 1000 asynchronous tasks from this runtime task for(size_t i=0; i<1000; i++) { rt.silent_async([&](){ counter.fetch_add(1, std::memory_order_relaxed); }); } // implicit synchronization at the end of the runtime scope }); tf::Task B = taskflow.emplace([&](){ assert(counter == 1000); }); A.precede(B); executor.run(taskflow).wait();
In addition to tf::
tf::Executor executor; tf::Taskflow taskflow; taskflow.emplace([&, counter=int{0}](tf::Runtime& rt) mutable { tf::AsyncTask prev = rt.silent_dependent_async([&](){ assert(counter == 0); ++counter; }); for(size_t i=1; i<=1000; ++i) { tf::AsyncTask curr = rt.silent_dependent_async([&, i](tf::Runtime&){ assert(counter == i); ++counter; }, prev); prev = curr; } }); // all dependent-async tasks will implicitly join at the runtime's end scope executor.run(taskflow).wait();
Synchronize with Cooperative Execution
The tf::
tf::Executor executor(num_threads); tf::Taskflow taskflow; std::atomic<size_t> counter(0); tf::Task A = taskflow.emplace([&](tf::Runtime& rt){ // spawn 1000 asynchronous tasks from this runtime task for(size_t i=0; i<1000; i++) { rt.silent_async([&](){ counter.fetch_add(1, std::memory_order_relaxed); }); } // explicitly synchronize all asynchronous tasks without blocking the calling worker rt.corun(); assert(counter == 1000); // spawn another 1000 asynchronous tasks from this runtime task for(size_t i=0; i<1000; i++) { rt.silent_async([&](){ counter.fetch_add(1, std::memory_order_relaxed); }); } // implicit synchronization at the end of the runtime scope }); tf::Task B = taskflow.emplace([&](){ assert(counter == 2000); }); A.precede(B); executor.run(taskflow).wait();
In this example, all 100 asynchronous tasks spawned by the runtime are guaranteed to complete before the call to corun returns. This cooperative execution model is especially useful for expressing multi-phase or hierarchical parallel algorithms that require intermediate synchronization within a single runtime task.
Create Recursive Asynchronous Tasks from a Runtime Task
Each asynchronous task can itself be a runtime task, enabling recursive task creation and dynamic parallelism. This model is particularly powerful for implementing divide-and-conquer algorithms, such as parallel sort, graph traversal, and recursion. For instance, the example below demonstrates a parallel recursive implementation of Fibonacci numbers using recursive asynchronous tasking with tf::
#include <taskflow/taskflow.hpp> size_t fibonacci(size_t N, tf::Runtime& rt) { if(N < 2) return N; size_t res1, res2; rt.silent_async([N, &res1](tf::Runtime& rt1){ res1 = fibonacci(N-1, rt1); }); // tail optimization for the right child res2 = fibonacci(N-2, rt); // use corun to avoid blocking the worker from waiting children tasks to finish rt.corun(); return res1 + res2; } int main() { tf::Executor executor; size_t N = 5, res; executor.silent_async([N, &res](tf::Runtime& rt){ res = fibonacci(N, rt); }); executor.wait_for_all(); std::cout << N << "-th Fibonacci number is " << res << '\n'; return 0; }
The figure below shows the execution diagram, where the task with suffix *_1 represents the left child spawned by its parent runtime.
For more details, please refer to Asynchronous Tasking and Fibonacci Number.