Release 3.10.0 (Master)

Taskflow 3.10.0 is the newest development line, where new features and improvements are added and supported. It is also the version from which this documentation is generated. Many things are considered experimental and may change or break from time to time. While it may be difficult to keep everything consistent when introducing new features, we do our best to ensure backward compatibility.

Download

To download the newest version of Taskflow, please clone the master branch from Taskflow's GitHub.

System Requirements

To use Taskflow v3.10.0, you need a compiler that supports C++17:

  • GNU C++ Compiler at least v8.4 with -std=c++17
  • Clang C++ Compiler at least v6.0 with -std=c++17
  • Microsoft Visual Studio at least v19.27 with /std:c++17
  • AppleClang Xcode Version at least v12.0 with -std=c++17
  • Nvidia CUDA Toolkit and Compiler (nvcc) at least v11.1 with -std=c++17
  • Intel C++ Compiler at least v19.0.1 with -std=c++17
  • Intel DPC++ Clang Compiler at least v13.0.0 with -std=c++17

Taskflow works on Linux, Windows, and Mac OS X.
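
As a quick way to confirm that a toolchain meets the C++17 requirement listed above, a small probe such as the following can be compiled with the same flags you plan to use for Taskflow. This is only a sketch: the macro PROBE_CPP_VERSION and the function probe_cpp17 are hypothetical names introduced here, and the probe checks the language standard only, not the specific compiler versions in the list.

```cpp
// Sketch: compile-time probe for C++17 support (not part of Taskflow).
#include <cassert>
#include <optional>

// MSVC keeps __cplusplus at 199711L unless /Zc:__cplusplus is passed,
// so prefer _MSVC_LANG when it is available.
#if defined(_MSVC_LANG)
  #define PROBE_CPP_VERSION _MSVC_LANG
#else
  #define PROBE_CPP_VERSION __cplusplus
#endif

// Fails the build with a readable message on pre-C++17 settings.
static_assert(PROBE_CPP_VERSION >= 201703L, "C++17 or newer is required");

// Hypothetical helper that exercises two C++17 features:
// std::optional and an if statement with an initializer.
inline int probe_cpp17() {
  std::optional<int> value = 17;
  if (auto v = value; v.has_value()) {
    return *v;
  }
  return -1;
}
```

If the translation unit compiles, the language mode is at least C++17; a failure of the static_assert usually means the -std=c++17 (or /std:c++17) flag is missing.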

Release Summary

This release improves scheduling performance through better tuning of the work-stealing threshold and a constrained decentralized buffer.

New Features

Taskflow Core

  • improved work-stealing loop with an adaptive breaking strategy
  • improved shut-down signal detection using decentralized variables
  • added debug mode for the Windows CI in GitHub Actions
  • added index range-based parallel-for algorithm (#551)
// initialize data1 and data2 to 10 using two different approaches
std::vector<int> data1(100), data2(100);

// Approach 1: initialize data1 using explicit index range
taskflow.for_each_index(0, 100, 1, [&](int i){ data1[i] = 10; });

// Approach 2: initialize data2 using tf::IndexRange
tf::IndexRange<int> range(0, 100, 1);
taskflow.for_each_by_index(range, [&](tf::IndexRange<int>& subrange){
  for(int i=subrange.begin(); i<subrange.end(); i+=subrange.step_size()) {
    data2[i] = 10;
  }
});
  • added index range-based parallel-reduction algorithm (#654)
std::vector<double> data(100000);
double res = 1.0;
taskflow.reduce_by_index(
  // index range
  tf::IndexRange<size_t>(0, data.size(), 1),
  // final result
  res,
  // local reducer
  [&](tf::IndexRange<size_t> subrange, std::optional<double> running_total) { 
    double residual = running_total ? *running_total : 0.0;
    for(size_t i=subrange.begin(); i<subrange.end(); i+=subrange.step_size()) {
      data[i] = 1.0;
      residual += data[i];
    }
    printf("partial sum = %lf\n", residual);
    return residual;
  },
  // global reducer
  std::plus<double>()
);
  • added static keyword to the executor creation in taskflow benchmarks
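
To make the local/global reducer contract of the index range-based reduction above more concrete, the following standard-library-only sketch emulates it sequentially. It is not Taskflow's implementation: SubRange, emulate_reduce_by_index, sum_ones, and the chunk and workers parameters are hypothetical names introduced for illustration. It assumes that a worker's first call to the local reducer receives an empty optional and that later calls receive that worker's running total, as the optional argument in the example above suggests.

```cpp
// Sketch: sequential emulation of a chunked index-range reduction.
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <vector>

// Hypothetical stand-in for an index subrange [begin, end) with a step.
struct SubRange {
  std::size_t begin, end, step;
};

// Splits [first, last) into chunks of `chunk` iterations, hands the chunks
// round-robin to `workers` emulated workers, and merges the per-worker
// totals into `init` with the global reducer.
template <typename T, typename Local, typename Global>
T emulate_reduce_by_index(std::size_t first, std::size_t last,
                          std::size_t step, std::size_t chunk,
                          std::size_t workers, T init,
                          Local local, Global global) {
  assert(step > 0 && chunk > 0 && workers > 0);
  std::vector<std::optional<T>> totals(workers);  // empty until first use
  const std::size_t span = chunk * step;
  std::size_t turn = 0;
  for (std::size_t b = first; b < last; b += span, ++turn) {
    const std::size_t e = (last - b > span) ? b + span : last;
    std::optional<T>& mine = totals[turn % workers];
    // The local reducer folds one subrange into this worker's running total.
    mine = local(SubRange{b, e, step}, mine);
  }
  T result = init;
  for (const auto& t : totals) {
    if (t) result = global(result, *t);  // skip workers that got no chunk
  }
  return result;
}

// Mirrors the example above: set every element to 1.0 and sum them,
// starting from an initial value of 1.0.
inline double sum_ones(std::vector<double>& data, std::size_t chunk,
                       std::size_t workers) {
  return emulate_reduce_by_index<double>(
    0, data.size(), 1, chunk, workers, 1.0,
    [&](SubRange r, std::optional<double> running_total) {
      double residual = running_total ? *running_total : 0.0;
      for (std::size_t i = r.begin; i < r.end; i += r.step) {
        data[i] = 1.0;
        residual += data[i];
      }
      return residual;
    },
    std::plus<double>());
}
```

Because the initial value takes part in the global reduction, calling sum_ones on a vector of 1000 elements yields 1001.0 regardless of the chunk size or the number of emulated workers.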

Utilities

Bug Fixes

  • fixed the compilation error of CLI11 due to version incompatibility (#672)
  • fixed the compilation error of template deduction on packaged_task (#657)
  • fixed the MSVC compilation error due to macro clash with std::min and std::max (#670)
  • fixed the runtime error due to the use of latch in tf::Executor::Executor (#667)

Breaking Changes

Currently, there are no breaking changes.

Documentation

Miscellaneous Items

Please do not hesitate to contact Dr. Tsung-Wei Huang if you intend to collaborate with us on using Taskflow in your scientific computing projects.