Release 4.0.0 (2026/01/01)

Release Summary

Happy new year! Starting in v4, Taskflow migrates its codebase to the C++20 standard for improved runtime performance and engineering efficiency. This release improves scheduling efficiency through bulk scheduling, reduced atomic overhead, and a streamlined executor design. It also introduces new features such as tf::TaskGroup to better support recursive task parallelism, along with Taskflow Academy, a new video suite for learning Taskflow.

Download

To download the newest version of Taskflow, please clone the master branch from Taskflow's GitHub.

System Requirements

To use Taskflow v4.0.0, you need a compiler that supports C++20:

  • GNU C++ Compiler at least v11.0 with -std=c++20
  • Clang C++ Compiler at least v12.0 with -std=c++20
  • Microsoft Visual Studio at least v19.29 (VS 2019) with /std:c++20
  • Apple Clang (Xcode) at least v13.0 with -std=c++20
  • NVIDIA CUDA Toolkit and Compiler (nvcc) at least v12.0 with host compiler supporting C++20
  • Intel oneAPI DPC++/C++ Compiler at least v2022.0 with -std=c++20

New Features

Taskflow Core

  • replaced std::function in topology with virtual dispatch for better performance
  • replaced TF_LIKELY/_UNLIKELY with C++20 attributes
  • extended the work-stealing queue to support the following new features:
    • non-pointer type with std::optional
    • bulk insertion to reduce the overhead of atomic operations
  • enhanced the executor performance with the following changes:
    • bulk scheduling to reduce the overhead of atomic operations
    • removed executor pointer from worker
    • removed redundant executor pointer validation from the hot scheduling path
  • adopted tf::NonblockingNotifier as the default notifier due to its robustness
    • split notify_one and notify_all into two separate, optimized implementations
    • added notify_n to improve the performance
  • split essential members of tf::Node into tf::NodeBase to improve modularity
  • split anchor nodes into implicit and explicit types to improve exception handling
  • added unit test for tf::NonblockingNotifier
  • added embarrassing parallelism benchmark
  • added merge sort benchmark
  • added tf::TaskGroup to enable efficient implementation of recursive parallelism
  • revised the following benchmarks with tf::TaskGroup for recursive parallelism
    • integrate
    • skynet
    • fibonacci
    • nqueens
  • enhanced the scheduling performance of dependent-async tasks with state merging

Utilities

  • removed ctz because it duplicates functionality provided by the C++20 bit manipulation library (<bit>)

Bug Fixes

  • fixed compiler warnings on Windows MSVC at warning level W4 (#737)
  • fixed the NVCC compiler warning caused by auto function arguments (#746)
  • fixed the regression caused by the notifier (#746)
  • fixed an uncaught exception from silent_async within corun (#748)

Breaking Changes

Starting in v4, Taskflow adopts the C++20 standard for better runtime performance and engineering efficiency. If your project cannot use C++20, please use Taskflow v3.11.0, which can be downloaded here.

Documentation

Miscellaneous Items

If you are interested in collaborating with us on applying Taskflow to your projects, please feel free to reach out to Dr. Tsung-Wei Huang!