Release Notes » Release 3.6.0 (2023/05/07)

Taskflow 3.6.0 is the 7th release in the 3.x line! This release introduces dynamic task graph parallelism, improves the standard parallel algorithms, modifies the GPU tasking interface, and enhances the documentation, examples, and unit tests.


Taskflow 3.6.0 can be downloaded from the project's GitHub releases page.

System Requirements

To use Taskflow v3.6.0, you need a compiler that supports C++17:

  • GNU C++ Compiler at least v8.4 with -std=c++17
  • Clang C++ Compiler at least v6.0 with -std=c++17
  • Microsoft Visual Studio at least v19.27 with /std:c++17
  • AppleClang Xcode Version at least v12.0 with -std=c++17
  • Nvidia CUDA Toolkit and Compiler (nvcc) at least v11.1 with -std=c++17
  • Intel C++ Compiler at least v19.0.1 with -std=c++17
  • Intel DPC++ Clang Compiler at least v13.0.0 with -std=c++17 and SYCL20

Taskflow works on Linux, Windows, and Mac OS X.

Release Summary

This release contains several changes to largely enhance the programmability of GPU tasking and standard parallel algorithms. More importantly, we have introduced a new dependent asynchronous tasking model that offers great flexibility for expressing dynamic task graph parallelism.

New Features

Taskflow Core



  • Added all_same templates to check if a parameter pack has the same type

Taskflow Profiler (TFProf)

  • Removed cudaFlow and syclFlow tasks

Bug Fixes

If you encounter any potential bugs, please submit an issue at the issue tracker.

Breaking Changes

  • Dropped support for cancelling asynchronous tasks
// previous - no longer supported
tf::Future<int> fu = executor.async([](){
  return 1;
});
std::optional<int> res = fu.get();  // res may be std::nullopt or 1

// now - use std::future instead
std::future<int> fu = executor.async([](){
  return 1;
});
int res = fu.get();
  • Dropped in-place support for running tf::cudaFlow from a dedicated task
// previous - no longer supported
taskflow.emplace([](tf::cudaFlow& cf){
  // ... build up the cudaflow; it is offloaded implicitly
});

// now - user to fully control tf::cudaFlow for maximum flexibility
taskflow.emplace([](){
  tf::cudaFlow cf;
  // ... build up the cudaflow

  // offload the cudaflow asynchronously through a stream
  tf::cudaStream stream;
  cf.run(stream);

  // wait for the cudaflow to complete
  stream.synchronize();
});
  • Dropped in-place support for running tf::cudaFlowCapturer from a dedicated task
// previous - no longer supported
taskflow.emplace([](tf::cudaFlowCapturer& cf){
  // ... capture GPU operations; the capturer is offloaded implicitly
});

// now - user to fully control tf::cudaFlowCapturer for maximum flexibility
taskflow.emplace([](){
  tf::cudaFlowCapturer cf;
  // ... capture GPU operations

  // offload the capturer asynchronously through a stream
  tf::cudaStream stream;
  cf.run(stream);

  // wait for the capturer to complete
  stream.synchronize();
});
  • Moved the buffer-size queries of CUDA standard algorithms into the execution policy
// previous - no longer supported
tf::cuda_reduce_buffer_size<tf::cudaDefaultExecutionPolicy, int>(N);

// now (and similarly for other parallel algorithms)
tf::cudaDefaultExecutionPolicy policy(stream);
policy.reduce_bufsz<int>(N);
  • Renamed tf::Executor::run_and_wait to tf::Executor::corun for expressiveness
  • Renamed tf::Executor::loop_until to tf::Executor::corun_until for expressiveness
  • Renamed tf::Runtime::run_and_wait to tf::Runtime::corun for expressiveness
  • Disabled argument support for all asynchronous tasking features
    • users are responsible for creating their own wrapper to make the callable
// previous - async allows passing arguments to the callable
executor.async([](int i){ std::cout << i << std::endl; }, 4);

// now - users are responsible for wrapping the arguments into a callable
executor.async([i=4](){ std::cout << i << std::endl; });
  • Replaced named_async with an overload that takes the name string as the first argument
// previous - explicitly calling named_async to assign a name to an async task
executor.named_async("name", [](){});

// now - overload
executor.async("name", [](){});


Miscellaneous Items

We have published Taskflow in the following venues:

Please do not hesitate to contact Dr. Tsung-Wei Huang if you intend to collaborate with us on using Taskflow in your scientific computing projects.